I discussed this question on the Cassandra IRC channel. I learned that the
timestamp in the Memtable may be OLDER than the timestamp in some SSTable
(e.g., due to hints or retries). So there’s no guarantee that the Memtable has
the most recent version.
But there may be cases, they say, in which the timestamp in the SSTable can be
used to skip over SSTables that have older data (via metadata on SSTables, I
presume).
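A rough sketch of how such skipping could work, assuming each SSTable records the newest write timestamp it contains. The names `SSTable`, `max_timestamp`, and `read_cell` are made up for illustration and are not Cassandra's API, and the model ignores tombstones and multi-column rows:

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    name: str
    max_timestamp: int  # hypothetical metadata: newest write timestamp in this file
    cells: dict = field(default_factory=dict)  # key -> (value, timestamp)


def read_cell(key, memtable, sstables):
    """Return the newest (value, timestamp) for key and the SSTables consulted."""
    best = memtable.get(key)  # the Memtable's version, which may not be the newest
    consulted = []
    # Visit files from newest to oldest max_timestamp.
    for table in sorted(sstables, key=lambda t: t.max_timestamp, reverse=True):
        if best is not None and table.max_timestamp < best[1]:
            break  # this file and all remaining ones can only hold older data
        consulted.append(table.name)
        candidate = table.cells.get(key)
        if candidate is not None and (best is None or candidate[1] > best[1]):
            best = candidate
    return best, consulted


memtable = {"id1": ("from memtable", 200)}
sstables = [
    SSTable("old", 100, {"id1": ("stale", 90)}),
    SSTable("future", 300, {"id1": ("from disk", 250)}),
]
result, consulted = read_cell("id1", memtable, sstables)
print(result, consulted)
```

In this toy run the "future" file must be read because its metadata says it may hold newer data than the Memtable, while the "old" file is skipped.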
Memtables are like write-through caches and do NOT correspond to SSTables loaded
from disk.
From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf Of
Jonathan Haddad
Sent: Wednesday, October 22, 2014 9:24 AM
To: user@cassandra.apache.org
Subject: Re: Is cassandra smart enough to serve Read requests entirely from
Memtables in some cases?
No. Consider a scenario where you write with a timestamp a week in the future,
flush that write to an SSTable, and then do another write with the current
timestamp. The record on disk will have a timestamp greater than the one in the
memtable.
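To make the scenario concrete, here is a minimal sketch assuming last-write-wins reconciliation by write timestamp. `Cell` and `reconcile` are hypothetical names for illustration, not Cassandra classes, and tie-breaking between equal timestamps is not modeled:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # write timestamp in microseconds


def reconcile(*cells):
    """Pick the cell with the highest write timestamp, wherever it is stored."""
    present = [c for c in cells if c is not None]
    return max(present, key=lambda c: c.timestamp) if present else None


WEEK_US = 7 * 24 * 3600 * 1_000_000
now = 1_413_994_000_000_000  # arbitrary "current" time in microseconds

# First write: timestamp a week ahead, later flushed to disk.
sstable_cell = Cell("first write, future timestamp", now + WEEK_US)
# Second write: current timestamp, still sitting in the memtable.
memtable_cell = Cell("second write, current timestamp", now)

winner = reconcile(memtable_cell, sstable_cell)
print(winner.value)
```

Even though the memtable holds the write that arrived last, the flushed cell carries the larger timestamp, so a read that looked only at the memtable would return the wrong value.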
On Wed, Oct 22, 2014 at 9:18 AM, Donald Smith
<donald.sm...@audiencescience.com> wrote:
Question about the read path in Cassandra. If a partition/row is in the
Memtable and is being actively written to by other clients, will a READ of
that partition also have to hit SSTables on disk (or in the page cache)? Or
can it be serviced entirely from the Memtable?
If you select all columns (e.g., “select * from ….”) then I can imagine that
Cassandra would need to merge whatever columns are in the Memtable with what’s
in SSTables on disk.
But if you select a single column (e.g., “select Name from …. where id= ….”)
and if that column is in the Memtable, I’d hope Cassandra could skip checking
the disk. Can it do this optimization?
Thanks, Don
Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com
--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade