RE: Finding big rows

Meler Wojciech Wed, 11 May 2011 01:19:14 -0700

Thanks for the reply. My app uses 7-bit ASCII strings as row keys, so I assume 
they can be used directly.


I'd like to fetch the whole row. I was able to dump the big row with sstable2json, 
but both my app and the cli are unable to read the row from Cassandra.
I see in the JSON dump that all columns are marked as "deletedAt": 
-9223372036854775808, so SuperColumn::isMarkedForDelete() should return false. 
My cluster is running Cassandra 0.7.4, and its upgrade path was 
0.7.0->0.7.2->0.7.3->0.7.4.
What's wrong? The Bloom filters seem to be OK - I couldn't find a tool for reading 
them, but the attached program does the job.
I'm sure that both my app and the cli refer to the proper keys. This big row keeps 
getting bigger as my app appends new super- and sub-columns to it, but it can't 
be read back:
get mycf[utf8('my-key')];
Returned 0 results.
I'm really confused. I tried turning debug logging on, but I can't see anything 
interesting in it. Any ideas what to check next?
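
The deletedAt value in the dump equals Long.MIN_VALUE, which is why the columns 
look live. A minimal sketch of that reasoning follows. The class DeletedAtCheck 
and its isMarkedForDelete helper are hypothetical stand-ins that assume a 
"greater than Long.MIN_VALUE means deleted" rule; they are not copied from the 
Cassandra source.

```java
// Hypothetical illustration only; not Cassandra code.
final class DeletedAtCheck {
    // Assumption: Long.MIN_VALUE is the "never deleted" sentinel, so any
    // larger timestamp would mean the column was marked for deletion.
    static boolean isMarkedForDelete(long markedForDeleteAt) {
        return markedForDeleteAt > Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long fromDump = -9223372036854775808L; // value seen in the JSON dump
        System.out.println(fromDump == Long.MIN_VALUE);  // prints "true"
        System.out.println(isMarkedForDelete(fromDump)); // prints "false"
    }
}
```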


Regards,
Wojtek

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Wednesday, May 11, 2011 12:29 AM
To: user@cassandra.apache.org
Subject: Re: Finding big rows

I'm not aware of anything to find the row sizes, and your code looks like a 
good approach. Converting the key bytes to a string only makes sense if your 
app is doing the same thing.

In the cli, try using one of the data type functions to format the key the same 
way your app does, e.g. get FooCF[utf8('my-key')]
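
As a rough illustration of why a 7-bit ASCII key should format the same way 
under utf8(): for such strings the US-ASCII and UTF-8 encodings produce 
identical bytes. The class KeyBytes and its helpers are hypothetical names used 
only for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Cassandra or the cli.
final class KeyBytes {
    // True when every char of the key is in the 7-bit ASCII range (0..127).
    static boolean isSevenBitAscii(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // True when the key encodes to the same bytes in US-ASCII and UTF-8.
    static boolean sameBytesInAsciiAndUtf8(String key) {
        return Arrays.equals(key.getBytes(StandardCharsets.US_ASCII),
                             key.getBytes(StandardCharsets.UTF_8));
    }

    // Lowercase hex rendering of the key's UTF-8 bytes.
    static String toHex(String key) {
        StringBuilder hex = new StringBuilder();
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String key = "my-key";
        System.out.println(isSevenBitAscii(key));          // prints "true"
        System.out.println(sameBytesInAsciiAndUtf8(key));  // prints "true"
        System.out.println(toHex(key));                    // prints "6d792d6b6579"
    }
}
```

If the app stored keys with some other encoding or as raw binary, the two byte 
sequences could differ and the cli lookup would miss the row.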

The main limitation of Super Columns is that sub columns are not indexed (see 
http://wiki.apache.org/cassandra/CassandraLimitations). If you have a huge row, 
use the get_slice() API call to get back slices of columns. The cli does not 
support slicing columns.
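
The slicing idea can be sketched without a cluster. The code below is not the 
Thrift API: SlicePager and its slice() method are hypothetical stand-ins that 
model a row as a sorted in-memory map, with slice() playing the role of a 
get_slice()-style call that returns up to `count` columns starting at `start`. 
The paging loop restarts each page at the last column already seen and skips 
that duplicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of paging through a wide row.
final class SlicePager {
    // Stand-in for a slice call: up to `count` column names >= start, in
    // sorted order. An empty start means "from the beginning of the row".
    static List<String> slice(NavigableMap<String, String> row, String start, int count) {
        List<String> out = new ArrayList<>();
        for (String name : row.tailMap(start, true).keySet()) {
            if (out.size() == count) {
                break;
            }
            out.add(name);
        }
        return out;
    }

    // Reads every column name page by page. pageSize must be at least 2,
    // because each later page repeats one column from the previous page.
    static List<String> readAll(NavigableMap<String, String> row, int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";
        boolean firstPage = true;
        while (true) {
            List<String> page = slice(row, start, pageSize);
            // Later pages begin with the column that ended the previous page.
            int from = firstPage ? 0 : 1;
            all.addAll(page.subList(Math.min(from, page.size()), page.size()));
            if (page.size() < pageSize) {
                break; // a short page means the row is exhausted
            }
            start = page.get(page.size() - 1);
            firstPage = false;
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            row.put(String.format("col%02d", i), "value" + i);
        }
        System.out.println(readAll(row, 3).size()); // prints "10"
    }
}
```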

Hope that helps.
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 May 2011, at 20:41, Meler Wojciech wrote:


Hello,

I've noticed some very nice stats exposed via JMX. I was quite shocked when I saw 
that MaxRowSize was about 400MB (I expected it to be several MB).
What is the best way to find the keys of such big rows?

I couldn't find anything, so I've written a simple program to dump sizes from the 
Index files (see attachment),
and got the keys, but when I used cassandra-cli to get their columns it said 
"Returned 0 results.".
I've realised that my app creates such big rows because it can't read them from 
Cassandra and recreates them every time.
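
For readers without the attachment, here is a rough sketch of the general idea 
of deriving row sizes from an index. It is not the attached IdxDump.java. It 
assumes, purely for illustration, that each index entry is a 2-byte key length, 
the key bytes, and an 8-byte offset into the data file, so that a row's size is 
the gap to the next offset. The real on-disk layout for a given Cassandra 
version should be checked against its source. IndexSizes and rowSizes are 
hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the entry layout is an assumption, see the note above.
final class IndexSizes {
    // Returns key -> approximate row size, in index order.
    // Assumed entry layout: unsigned short key length, key bytes, long offset.
    static Map<String, Long> rowSizes(byte[] index, long dataFileLength) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        ByteBuffer buf = ByteBuffer.wrap(index);
        String prevKey = null;
        long prevOffset = 0;
        while (buf.hasRemaining()) {
            int keyLength = buf.getShort() & 0xFFFF;
            byte[] key = new byte[keyLength];
            buf.get(key);
            long offset = buf.getLong();
            if (prevKey != null) {
                // The previous row ends where this one starts.
                sizes.put(prevKey, offset - prevOffset);
            }
            prevKey = new String(key, StandardCharsets.US_ASCII);
            prevOffset = offset;
        }
        if (prevKey != null) {
            // The last row runs to the end of the data file.
            sizes.put(prevKey, dataFileLength - prevOffset);
        }
        return sizes;
    }
}
```

Sorting the resulting map by value would surface the keys of the largest rows.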

Are there any tunable limits on getting a whole row? Any limits on 
supercolumns?

Regards,
Wojtek


"WIRTUALNA POLSKA" Spolka Akcyjna z siedziba w Gdansku przy ul. Traugutta 115 
C, wpisana do Krajowego Rejestru Sadowego - Rejestru Przedsiebiorcow 
prowadzonego przez Sad Rejonowy Gdansk - Polnoc w Gdansku pod numerem KRS 
0000068548, o kapitale zakladowym 67.980.024,00 zlotych oplaconym w calosci 
oraz Numerze Identyfikacji Podatkowej 957-07-51-216.
<IdxDump.java>

BFCheck.java