Thanks for the reply. My app uses 7-bit ASCII strings as row keys, so I assume they can be used directly.
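A minimal sketch of the assumption above, not anything taken from Cassandra itself: for a key made only of 7-bit ASCII characters, the US-ASCII and UTF-8 encodings are byte-for-byte identical, so utf8('my-key') in the cli should yield the same key bytes as an app that writes plain ASCII strings. The class and method names (KeyBytes, sameBytesInAsciiAndUtf8, toHex) are hypothetical helpers made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Cassandra.
final class KeyBytes {

    // True when the key encodes to the same bytes in US-ASCII and UTF-8,
    // which holds exactly when every character is in the 7-bit range.
    static boolean sameBytesInAsciiAndUtf8(String key) {
        byte[] ascii = key.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    // Lower-case hex form of the raw key bytes, handy for comparing
    // against a key that some tool prints in hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String key = "my-key";
        System.out.println(sameBytesInAsciiAndUtf8(key));
        System.out.println(toHex(key.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the check returns false for a real key, the key contains a character outside 7-bit ASCII, and the app and the cli may be encoding it differently.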
I'd like to fetch the whole row. I was able to dump the big row with sstable2json, but both my app and the cli are unable to read the row from Cassandra. I see in the JSON dump that all columns are marked as "deletedAt": -9223372036854775808 (that is, Long.MIN_VALUE), so SuperColumn::isMarkedForDelete() should return false.

My cluster is running Cassandra 0.7.4, and its upgrade path was 0.7.0 -> 0.7.2 -> 0.7.3 -> 0.7.4.

What's wrong? The Bloom filters seem to be OK: I couldn't find a tool for reading them, but the attached program does the job.

I'm sure that both my app and the cli refer to the proper keys. This big row keeps getting bigger as my app appends new super- and sub-columns to it, but it can't be read:

    get mycf[utf8('my-key')];
    Returned 0 results.

I'm really confused. I tried turning debug on, but I can't see anything interesting in it. Any ideas what to check next?

Regards,
Wojtek

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Wednesday, May 11, 2011 12:29 AM
To: user@cassandra.apache.org
Subject: Re: Finding big rows

I'm not aware of anything to find the row sizes, and your code looks like a good approach. Converting the key bytes to a string only makes sense if your app is doing the same thing. In the cli, try using one of the data type functions to format the key the same way your app does, e.g.

    get FooCF[utf8('my-key')]

The main limitation on Super Columns is that sub columns are not indexed: http://wiki.apache.org/cassandra/CassandraLimitations. If you have a huge row, use the get_slice() API call to get back slices of columns. The cli does not support slicing columns.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 May 2011, at 20:41, Meler Wojciech wrote:

Hello,

I've noticed very nice stats exposed with JMX. I was quite shocked when I saw that MaxRowSize was about 400MB (it was expected to be several MB). What is the best way to find keys of such big rows?
I couldn't find anything, so I've written a simple program to dump sizes from the Index files (see attachment) and got the keys, but when I used cassandra-cli to get such columns it said "Returned 0 results.".

I've realised that my app creates such big rows because it can't read them from Cassandra and recreates them every time. Are there any tunable limits for getting a whole row? Any limits on supercolumns?

Regards,
Wojtek

"WIRTUALNA POLSKA" Spolka Akcyjna (joint-stock company) with its registered office in Gdansk at ul. Traugutta 115 C, entered in the National Court Register - Register of Entrepreneurs kept by the District Court Gdansk - Polnoc in Gdansk under KRS number 0000068548, with share capital of 67,980,024.00 zlotys paid in full and Tax Identification Number 957-07-51-216.

<IdxDump.java>
BFCheck.java
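Aaron's suggestion to read a huge row in slices rather than all at once can be sketched roughly as follows. This is only an illustration of the paging loop, under stated assumptions: SliceSource is a hypothetical stand-in for a get_slice() call whose slice range has an inclusive start and a column count, column names are plain strings for simplicity, and the in-memory TreeSet replaces a real cluster. None of these names come from the Cassandra client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical paging sketch; not a real Cassandra client.
final class RowPager {

    // Stand-in for one get_slice() call: returns up to `count` column
    // names in sorted order, starting at `start` (inclusive).
    // An empty start means "from the beginning of the row".
    interface SliceSource {
        List<String> slice(String start, int count);
    }

    // Reads every column name of one row, one page at a time.
    static List<String> readAll(SliceSource source, int pageSize) {
        if (pageSize < 2) {
            // With an inclusive start, a page of 1 would never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";
        boolean firstPage = true;
        while (true) {
            List<String> page = source.slice(start, pageSize);
            // Every page after the first begins with the last column of
            // the previous page, so skip that repeated entry.
            int from = firstPage ? 0 : 1;
            for (int i = from; i < page.size(); i++) {
                all.add(page.get(i));
            }
            if (page.size() < pageSize) {
                break; // short page: the row is exhausted
            }
            start = page.get(page.size() - 1);
            firstPage = false;
        }
        return all;
    }

    // In-memory source used only to exercise the loop.
    static SliceSource inMemory(TreeSet<String> columns) {
        return (start, count) -> {
            List<String> page = new ArrayList<>();
            for (String name : columns.tailSet(start, true)) {
                if (page.size() == count) {
                    break;
                }
                page.add(name);
            }
            return page;
        };
    }

    public static void main(String[] args) {
        TreeSet<String> columns = new TreeSet<>();
        for (int i = 0; i < 25; i++) {
            columns.add(String.format("c%02d", i));
        }
        System.out.println(readAll(inMemory(columns), 10).size());
    }
}
```

Against a real cluster the page size would be chosen so that one page fits comfortably in memory; a 400MB row read in one call is exactly the case this loop is meant to avoid.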