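The partition index is never updated, as sstables are immutable. Each
sstable has its own partition index, written once when that sstable is
created, so repeated updates to the same key end up as entries in the
indexes of different sstables (until compaction merges them).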
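
To make that concrete, here is a rough toy model in Python. It is only a
sketch, not Cassandra's actual on-disk format: the SSTable class, flush(),
read(), and the key_cache argument are all hypothetical names, and it
assumes a bloom filter lets a read skip sstables that do not contain the
key at no seek cost.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class SSTable:
    """Toy immutable sstable: index and data are fixed at flush time."""
    index: MappingProxyType  # partition key -> offset into `data` (hypothetical layout)
    data: tuple              # (write_timestamp, value) rows


def flush(memtable):
    """Write a memtable out as a brand-new sstable with its own index."""
    keys = sorted(memtable)
    index = MappingProxyType({key: offset for offset, key in enumerate(keys)})
    data = tuple(memtable[key] for key in keys)
    return SSTable(index=index, data=data)


def read(sstables, key, key_cache=frozenset()):
    """Return (value, seeks): the newest value for key and the modeled seeks."""
    newest = None
    seeks = 0
    for table_id, table in enumerate(sstables):
        if key not in table.index:
            continue  # assumption: a bloom filter skips this sstable for free
        # One seek for the partition index unless the key cache has the
        # offset, plus one seek for the data itself.
        seeks += 1 if (table_id, key) in key_cache else 2
        row = table.data[table.index[key]]
        if newest is None or row[0] > newest[0]:
            newest = row
    return (newest[1] if newest else None), seeks


# Three updates to the same key, each flushed to its own sstable.
sstables = [flush({"k1": (ts, f"v{ts}")}) for ts in (1, 2, 3)]
value, seeks = read(sstables, "k1")
```

No existing index is ever rewritten here: each update produces a new
sstable with a new index entry, and the read picks the row with the
newest timestamp across all of them.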

On Tue, Mar 21, 2017 at 9:40 AM preetika tyagi <preetikaty...@gmail.com>
wrote:

> Thank you Jan & Jeff for the responses. That was really useful.
>
> Jan - I have one follow-up question. When the data is spread over more
> than one SSTable because of updates, as you mentioned, we will need two
> seeks per SSTable (one for the partition index and another for the SSTable
> itself). I'm curious to know how the partition index is structured
> internally. I was assuming it to be a table of <key, disk offset> pairs.
> If the same key is updated several times, how is that recorded in the
> partition index?
>
> Thanks,
> Preetika
>
> On Mon, Mar 20, 2017 at 10:37 PM, <j.kes...@enercast.de> wrote:
>
> Hi,
>
>
>
> You're right: one seek with a hit in the partition key cache, and two if not.
>
>
>
> That's the theory, but there are two things to mention:
>
>
>
> First, you need two seeks per sstable, not per entire read. So if your data
> is spread over multiple sstables on disk, you obviously need more than two
> reads. Think of frequently updated partition keys: in combination with
> memory pressure you can easily end up with many sstables (ok, they will be
> compacted some time in the future).
>
>
>
> Second, there could be fragmentation on disk which leads to seeks during
> sequential reads.
>
>
>
> Jan
>
>
>
> Sent from my Windows 10 Phone
>
>
>
> *From: *preetika tyagi <preetikaty...@gmail.com>
> *Sent: *Monday, 20 March 2017 21:18
> *To: *user@cassandra.apache.org
> *Subject: *question on maximum disk seeks
>
>
>
> I'm trying to understand the maximum number of disk seeks required in a
> read operation in Cassandra. I looked at several online articles including
> this one:
> https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutReads.html
>
> As per my understanding, two disk seeks are required in the worst case.
> One is for reading the partition index and another is to read the actual
> data from the compressed partition. The index of the data in compressed
> partitions is obtained from the compression offset tables (which is stored
> in memory). Am I on the right track here? Will there ever be a case when
> more than 1 disk seek is required to read the data?
>
> Thanks,
>
> Preetika
>
>
>
>
>
>
