[
https://issues.apache.org/jira/browse/CASSANDRA-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203431#comment-13203431
]
Sylvain Lebresne commented on CASSANDRA-3861:
---------------------------------------------
bq. In your example above, the "right" thing to do from a client's perspective
is to use a limit of 10000.
Agreed, but my argument is that if 99% of queries return fewer than 10 rows,
our code is uselessly inefficient for 99% of the queries. I'm really only
talking about a performance issue.
bq. I guess I'd be okay with dropping that if we add a special check to return
IRE for the MAX_VALUE antipattern.
I think that forbidding the MAX_VALUE anti-pattern is a different debate, but
throwing an IRE on MAX_VALUE would be very Java-specific. For users of other
languages, the same anti-pattern would likely be to pass some huge number, but
probably not MAX_VALUE exactly. The right solution moving forward will be to do
automatic paging with CQL, but in the meantime I don't see a good way to
protect people against their own mistakes that does not incur inefficiency or
limitations.
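
To make the trade-off concrete, here is a minimal sketch of one possible middle ground: pre-size the result list from the requested count, but only up to a cap, and let it grow if more rows actually match. This is not the attached patch and not Cassandra code; the class name, the method names, and the cap of 1024 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedPrealloc {
    // Hypothetical cap on the initial allocation; not a value taken from Cassandra.
    static final int MAX_INITIAL_CAPACITY = 1024;

    // Pre-size from the client-supplied count, but never beyond the cap.
    static int initialCapacity(int requestedCount) {
        if (requestedCount < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        return Math.min(requestedCount, MAX_INITIAL_CAPACITY);
    }

    // Collect at most requestedCount matches; the list grows on demand past the cap.
    static <T> List<T> collect(Iterable<T> matches, int requestedCount) {
        List<T> rows = new ArrayList<T>(initialCapacity(requestedCount));
        for (T row : matches) {
            if (rows.size() >= requestedCount) {
                break;
            }
            rows.add(row);
        }
        return rows;
    }
}
```

With a count of Integer.MAX_VALUE and three matching keys, this allocates room for 1024 references rather than roughly two billion. A query that really does return more rows than the cap pays for the list resizing as it grows.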
> get_indexed_slices throws OOM Error when is called with too big
> indexClause.count
> ---------------------------------------------------------------------------------
>
> Key: CASSANDRA-3861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3861
> Project: Cassandra
> Issue Type: Bug
> Components: API, Core
> Affects Versions: 1.0.7
> Reporter: Vladimir Tsanev
> Assignee: Sylvain Lebresne
> Fix For: 1.0.8
>
> Attachments: 3861.patch
>
>
> I tried to call get_indexed_slices with Integer.MAX_VALUE as IndexClause.count.
> Unfortunately the node died with OOM. In the log there is the following error:
> ERROR [Thrift:4] 2012-02-06 17:43:39,224 Cassandra.java (line 3252) Internal
> error processing get_indexed_slices
> java.lang.OutOfMemoryError: Java heap space
> at java.util.ArrayList.<init>(ArrayList.java:112)
> at org.apache.cassandra.service.StorageProxy.scan(StorageProxy.java:1067)
> at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:746)
> at org.apache.cassandra.thrift.Cassandra$Processor$get_indexed_slices.process(Cassandra.java:3244)
> at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Is it necessary to allocate all the memory in advance? I only have 3 KEYS
> that match my clause. I do not know the exact number, but in general I am
> sure that they will fit in memory.
> I can/will implement some calls with paging, but wanted to test and I am not
> happy with the fact the node disconnected.
> I wonder why ArrayList is used here?
> I think the result is never accessed by index (but only iterated) and the
> subList for non RandomAccess Lists (for example LinkedList) will do the same
> job if you are not using other operations than iteration.
> Is this related to the problem described in CASSANDRA-691?