[ https://issues.apache.org/jira/browse/CASSANDRA-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268471#comment-15268471 ]

Benjamin Lerer commented on CASSANDRA-11528:
--------------------------------------------

[~matthiasw] the amount of heap used by the select statement depends on the 
amount of data that you store in your rows.
Count queries paginate your rows internally based on the page size that you 
use (the LIMIT has no effect on aggregation queries).
For the Java driver the default page size is 5000. Consequently, if each of 
your rows contains 4 MB, you will end up with around 19 GB in memory, unless 
your JVM exits before that.
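
For example, lowering the fetch size makes that internal paging use much smaller pages. This is only a minimal sketch with the DataStax Java driver; the contact point, keyspace ({{myks}}) and table ({{mytable}}) names are placeholders, not taken from the report:
{noformat}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedCount {
    public static void main(String[] args) {
        // Contact point, keyspace and table names are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("myks")) {
            Statement stmt = new SimpleStatement("SELECT count(*) FROM mytable");
            stmt.setFetchSize(100); // internal page size, instead of the default 5000
            ResultSet rs = session.execute(stmt);
            System.out.println("count: " + rs.one().getLong(0));
        }
    }
}
{noformat}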

{{COUNT(*)}} is an expensive operation as it has to bring all the rows back to 
the coordinator in order to count them properly (some replicas might not have 
the latest data).

I will try to see if it is possible to avoid that type of problem. 
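
In the meantime, a client-side count along the lines of the reporter's small-selects workaround is also possible: select only the key column with a small fetch size and let the driver page through the result transparently. Again just a sketch; {{id}}, the keyspace and the table names are assumed, not from the report:
{noformat}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class ClientSideCount {
    public static void main(String[] args) {
        // Contact point, keyspace, table and column names are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("myks")) {
            // Select only the partition key so the 100 kB - 4 MB blobs are not returned.
            Statement stmt = new SimpleStatement("SELECT id FROM mytable");
            stmt.setFetchSize(100); // small pages; the driver fetches them as the loop advances
            long count = 0;
            for (Row ignored : session.execute(stmt)) {
                count++;
            }
            System.out.println("rows: " + count);
        }
    }
}
{noformat}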

> Server Crash when select returns more than a few hundred rows
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-11528
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11528
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: windows 7, 8 GB machine
>            Reporter: Mattias W
>             Fix For: 3.x
>
>         Attachments: datastax_ddc_server-stdout.2016-04-07.log
>
>
> While implementing a dump procedure, which did "select * from" on one table 
> at a time, I instantly killed the server. A simple 
> {noformat}select count(*) from {noformat} 
> also kills it. For a while, I thought the size of the blobs was the cause.
> I also tried using only a unique id as the partition key, because I was afraid a 
> single partition had become too big, but that didn't change anything.
> It happens every time, both from Java/Clojure and from DevCenter.
> I looked at the logs at C:\Program Files\DataStax-DDC\logs, but the crash is 
> so quick that nothing is recorded there.
> There is a Java out-of-memory error in the logs, but it is not from the time of 
> the crash.
> It only happens for one table. It only has 15000 entries, but blobs and 
> byte[] are stored there, with sizes between 100 kB and 4 MB. The total size of 
> that table is about 6.5 GB on disk.
> I made a workaround by doing many small selects instead, each only fetching 
> 100 rows.
> Is there a setting I can set to make the system log more eagerly, in order to 
> at least get a stacktrace or something similar that might help you?
> It is the prun_srv process that dies. Restarting the NT service makes Cassandra 
> run again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
