C* users,
We have a process that loads a large batch of rows from Cassandra into
many separate compute workers. The rows are one column wide and range in
size from a couple of KB to ~100 MB. After manipulating the data for a
while, each compute worker writes the data back under *new* row keys
(UUIDs) computed by the workers.
After the full batch is written back to new rows, a cleanup worker deletes
the old rows.
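In case it helps to see the shape of it, each cycle looks roughly like the
following. This is a heavily simplified, single-worker, synchronous sketch;
the keyspace, column family, and 'data' column name are stand-ins for our
actual schema:

    import uuid
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('OurKeyspace', server_list=['localhost:9160'])
    cf = ColumnFamily(pool, 'BlobStore')

    def run_cycle(old_keys, process):
        # compute worker: read the single-column row, manipulate it,
        # and write the result back under a brand-new UUID row key
        new_keys = []
        for old_key in old_keys:
            blob = cf.get(old_key)['data']
            result = process(blob)
            new_key = str(uuid.uuid4())
            cf.insert(new_key, {'data': result})
            new_keys.append(new_key)
        # cleanup worker: once the whole batch is rewritten,
        # delete the old rows
        for old_key in old_keys:
            cf.remove(old_key)
        return new_keys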
After several cycles, pycassa starts getting connection failures.
Should we use a pycassa listener to catch these failures and just recreate
the ConnectionPool and keep going as if the connection had not dropped?
Or is there a better approach?
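Concretely, what we had in mind is something along these lines. It is only a
minimal sketch, assuming pycassa's PoolListener class, its connection_failed()
hook, and the listeners= argument to ConnectionPool behave the way we read the
docs; keyspace and server list are again stand-ins:

    import logging
    from pycassa.pool import ConnectionPool, PoolListener

    class FailureListener(PoolListener):
        def __init__(self):
            self.failed = False

        def connection_failed(self, dic):
            # dic is whatever event info pycassa passes to the hook;
            # log it and flag the pool for recreation rather than
            # rebuilding the pool from inside the hook itself
            logging.warning("pycassa connection failure: %r", dic)
            self.failed = True

    def make_pool():
        listener = FailureListener()
        pool = ConnectionPool('OurKeyspace',
                              server_list=['localhost:9160'],
                              listeners=[listener])
        return pool, listener

    pool, listener = make_pool()

    def get_pool():
        # workers call this before each batch and transparently pick
        # up a fresh pool if a connection failure was seen
        global pool, listener
        if listener.failed:
            pool.dispose()
            pool, listener = make_pool()
        return pool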
These failures happen on a simple single-node setup with a total data set
less than half the size of the Java heap, e.g. ~2 GB of data (times two for
the two copies that exist during cycling) versus an 8 GB heap. We tried
reducing memtable_flush_queue_size to 2 so that the deletes would be
flushed sooner, and also tried multithreaded_compaction=true, but pycassa
still gets connection failures.
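For reference, the cassandra.yaml settings we changed are just these two
lines:

    memtable_flush_queue_size: 2
    multithreaded_compaction: true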
Is this expected behavior when the node is shedding load, or is it
unexpected?
Would things be any different if we used multiple nodes and scaled the
data and worker count to match? That is, is there something inherent to
Cassandra's operating model that makes it want to always have multiple
nodes?
Thanks for any pointers,
John