Re: Range Slices timing question

Jonathan Ellis Thu, 10 Jun 2010 16:21:05 -0700

get_range_slices is faster in 0.7 but there's not much you can do in 0.6.

On Wed, Jun 9, 2010 at 11:04 AM, Carlos Sanchez
<[email protected]> wrote:
> I have about a million rows (each row with 100 cols) with keys of the form
> domain/!date/!id (e.g. gwm.com/!20100430/!CFRA4500). I am interested in
> getting all the ids (all cols) for a particular domain/date (e.g.
> "gwm.ml.com/!20100430/!A" "gwm.ml.com/!20100430/!D"). I am looping in chunks
> of 6000 rows / 500 cols at a time. However, it is taking my 5-node cluster
> (each machine has 32 GB of RAM; RF=3, OPP, v0.6.1) 36 secs to get all the
> required rows (stats below), which I think is a bit high. I am wondering
> whether the way my string keys are constructed (suggestions are welcome)
> makes Cassandra work 'harder' when doing a 'range slices'.
> Does Cassandra examine all row keys to search for matches? Are there any
> settings I can tweak to make the retrieval faster?
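
A minimal sketch of the chunked range scan described above. It is not Cassandra client code: a sorted in-memory list stands in for row keys kept in order by OPP, and `range_page` and `scan_range` are hypothetical helper names. It assumes the start key of each request is inclusive, so every page after the first repeats the last key already seen and that duplicate is dropped.

```python
import bisect


def range_page(sorted_keys, start, end, count):
    """Return up to `count` keys k with start <= k <= end (hypothetical helper)."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_right(sorted_keys, end)
    return sorted_keys[lo:min(hi, lo + count)]


def scan_range(sorted_keys, start, end, page_size):
    """Yield every key in [start, end], fetching `page_size` keys per request."""
    if page_size < 2:
        # With an inclusive start key, a page of one row could never advance.
        raise ValueError("page_size must be at least 2")
    first_page = True
    while True:
        page = range_page(sorted_keys, start, end, page_size)
        # Assumption: the start key is inclusive, so later pages begin with
        # the last key of the previous page; skip that duplicate.
        rows = page if first_page else page[1:]
        for key in rows:
            yield key
        if len(page) < page_size:
            return  # a short page means the range is exhausted
        start = page[-1]
        first_page = False


# Made-up keys in the domain/!date/!id layout from the message.
prefix = "gwm.ml.com/!20100430/!"
keys = sorted(
    [prefix + "%s%04d" % (c, n) for c in "ABCDE" for n in range(3)]
    + ["gwm.ml.com/!20100429/!A0000", "gwm.ml.com/!20100501/!A0000"]
)
found = list(scan_range(keys, prefix + "A", prefix + "D", page_size=4))
```

In plain string ordering, an end key of `.../!D` sorts before `.../!D0000`, so in this sketch the scan returns only the ids beginning with A, B or C. Covering the ids that begin with D would need an end key that sorts after them.
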
>
> Thanks
>
> Carlos
>
> row(s) found 6000 in 35086ms
> total cols(s) found 593502
> row bytes 228000
> col bytes 38422670
> total bytes 38650670  (36.86015 MB)
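
As a rough worked calculation from the stats above, the figures come to about 171 rows per second and about 1 MB per second. The variable names are illustrative, and the numbers are copied from the message.

```python
# Figures reported in the message.
rows = 6000
elapsed_ms = 35086
cols = 593502
row_bytes = 228000
col_bytes = 38422670

elapsed_s = elapsed_ms / 1000.0
total_bytes = row_bytes + col_bytes          # matches the reported total
total_mb = total_bytes / (1024.0 * 1024.0)   # MB as 2**20 bytes

rows_per_s = rows / elapsed_s
cols_per_s = cols / elapsed_s
mb_per_s = total_mb / elapsed_s
cols_per_row = cols / float(rows)
bytes_per_col = col_bytes / float(cols)

print("rows/s: %.1f, cols/s: %.1f, MB/s: %.2f" % (rows_per_s, cols_per_s, mb_per_s))
print("cols/row: %.1f, bytes/col: %.1f" % (cols_per_row, bytes_per_col))
```
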




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
