AFAIK this is still roughly correct:
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/

It includes information on the page size read from disk. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/02/2013, at 5:45 AM, Jouni Hartikainen <jouni.hartikai...@reaktor.fi> 
wrote:

> 
> Hi,
> 
> On Feb 21, 2013, at 7:52 , Kanwar Sangha <kan...@mavenir.com> wrote:
>> Hi – Can someone explain the worst case IOPS for a read ? No key cache, No 
>> row cache, sampling rate say 512.
>> 
>> 1)      Bloom filter will be checked to see existence of key (In RAM)
>> 2)      Index file sample (In RAM) will be checked to find approx. location 
>> in index file on disk
>> 3)      1 IOPS to read the actual index file on disk (DISK)
>> 4)      1 IOPS to get the data from the location in the sstable (DISK)
>> 
>> Is this correct ?
> 
> As you were asking for the worst case, I would still add one step that would 
> be a seek inside an SSTable from the row start to the queried columns using 
> column index.
> 
> However, this applies only if you are querying a subset of columns in the row 
> (not all) and the total row size exceeds column_index_size_in_kb (defaults to 
> 64kB).
> 
> So, as far as I have understood, the worst case steps (without any caches) 
> are:
> 
> 1. Check the SSTable bloom filters (in memory)
> 2. Use index samples to find approx. correct place in the key index file (in 
> memory)
> 3. Read the key index file until correct key is found (1st disk seek & read)
> 5. Seek to the start of the row in SSTable file and read row headers 
> (possibly including column index) (2nd seek & read)
> 6. Using column index seek to the correct place inside the SSTable file to 
> actually read the columns (3rd seek & read)
> 
> If the row is very wide and you are asking for a random bunch of columns from 
> here and there, step 6 might even be needed multiple times. Also, if your 
> row is spread over many SSTables, each of them needs to be accessed (at 
> least steps 1. - 5.) to get the complete results for the query.
> 
> With all this in mind, if your node has any reasonable amount of reads, I'd say 
> that in practice key index files will be page cached by the OS very quickly 
> and thus a normal read would end up being either one seek (for small rows 
> without the column index) or two (for wider rows). Of course, as Peter 
> already pointed out, the more columns you ask for, the more the disk needs to 
> read. For a contiguous set of columns the read should be linear, however.
> 
> -Jouni
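
For what it's worth, the seek counting in the quoted steps can be sketched
roughly as below. This is only a back-of-the-envelope model, not Cassandra
code: the function name worst_case_seeks and all of its parameters are made up
for illustration. It assumes every SSTable counted really holds part of the row
(no bloom filter false positives) and charges one seek per non-contiguous
column range.

```python
def worst_case_seeks(sstables_with_row, row_size_kb, reads_all_columns,
                     column_index_size_kb=64, column_ranges=1,
                     key_index_page_cached=False):
    """Estimate disk seeks for one read with no key cache and no row cache.

    Hypothetical helper that mirrors the steps in the quoted mail.
    """
    # The column index only matters when a subset of columns is requested
    # and the row is larger than column_index_size_in_kb (64 by default).
    uses_column_index = (not reads_all_columns
                         and row_size_kb > column_index_size_kb)

    per_sstable = 0
    if not key_index_page_cached:
        per_sstable += 1  # read the key index file until the key is found
    per_sstable += 1      # seek to the row start and read the row headers
    if uses_column_index:
        # one extra seek per non-contiguous column range requested
        per_sstable += column_ranges

    # Bloom filter and index sample checks are in memory, so they cost nothing.
    return sstables_with_row * per_sstable


# Small row, everything read from one SSTable, cold key index: 2 seeks.
small = worst_case_seeks(1, row_size_kb=10, reads_all_columns=True)
# Wide row, subset of columns, cold key index: 3 seeks.
wide = worst_case_seeks(1, row_size_kb=1000, reads_all_columns=False)
```

With key_index_page_cached=True the same two calls give 1 and 2, which matches
the "one seek or two" estimate for a node that is serving reads regularly.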
