get_range_slices is faster in 0.7, but there's not much you can do in 0.6.

On Wed, Jun 9, 2010 at 11:04 AM, Carlos Sanchez <[email protected]> wrote:
> I have about a million rows (each row with 100 cols) with keys of the form
> domain/!date/!id (e.g. gwm.com/!20100430/!CFRA4500). I am interested in
> getting all the ids (all cols) for a particular domain/date (e.g.
> "gwm.ml.com/!20100430/!A" "gwm.ml.com/!20100430/!D"). I am looping in chunks
> of 6000 rows / 500 cols at a time. However, it is taking my 5-node cluster
> (each machine has 32 GB of RAM; RF=3, OPP, v0.6.1) 36 secs to get all the
> required rows (stats below), which I think is a bit high. I am wondering
> whether a possible cause is the way my string keys are constructed
> (suggestions are welcome), which might make Cassandra work 'harder' when
> doing range slices. Does Cassandra examine all row keys to search for
> matches? Are there any settings I can tweak to make the retrieval faster?
>
> Thanks
>
> Carlos
>
> row(s) found 6000 in 35086ms
> total col(s) found 593502
> row bytes 228000
> col bytes 38422670
> total bytes 38650670 (36.86015 MB)
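
For reference, here is a rough sketch of how a chunked range scan over ordered keys behaves. It is a pure-Python model and not Cassandra client code: range_slice and page_range are hypothetical helpers, the two quoted example keys are assumed to be the start and end of the range, and keys are assumed to compare as plain strings, as they would under an order-preserving partitioner. Because the keys are kept sorted, the scan seeks to the start key and reads forward instead of examining every row key.

```python
import bisect


def range_slice(sorted_keys, start_key, end_key, count):
    """Hypothetical stand-in for one range query: return up to `count`
    keys in [start_key, end_key], both ends inclusive, in key order."""
    lo = bisect.bisect_left(sorted_keys, start_key)   # seek to the start key
    hi = bisect.bisect_right(sorted_keys, end_key)    # first index past end key
    return sorted_keys[lo:min(hi, lo + count)]


def page_range(sorted_keys, start_key, end_key, page_size):
    """Yield pages of at most `page_size` keys covering the whole range.

    Each later page restarts at the last key already seen. The start
    bound is inclusive, so one extra key is requested and the repeated
    first key is dropped.
    """
    start = start_key
    first = True
    while True:
        want = page_size if first else page_size + 1
        page = range_slice(sorted_keys, start, end_key, want)
        if not first:
            page = page[1:]  # drop the key repeated from the previous page
        if not page:
            return
        yield page
        if len(page) < page_size:
            return  # short page: the range is exhausted
        start = page[-1]
        first = False


# Made-up keys in the domain/!date/!id layout from the question.
keys = sorted(
    f"gwm.ml.com/!{date}/!{ident}"
    for date in ("20100429", "20100430", "20100501")
    for ident in ("A001", "B002", "C003", "D004", "E005")
)

pages = list(page_range(keys,
                        "gwm.ml.com/!20100430/!A",
                        "gwm.ml.com/!20100430/!D",
                        page_size=2))
for page in pages:
    print(page)
```

Note that in this model the end key "gwm.ml.com/!20100430/!D" sorts before "gwm.ml.com/!20100430/!D004", so ids starting with D are not returned; covering them would need a later end key.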
-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
