[ https://issues.apache.org/jira/browse/CASSANDRA-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199134#comment-14199134 ]
Michael Shuler commented on CASSANDRA-8259:
-------------------------------------------
{quote}
And honestly, the "building Thrift response" code should be smart enough to
figure out it is about to bring the whole server down.
A simple check on the number of rows and columns returned and their size and
maybe throwing a regular exception would have been a much better option than
crashing the entire server.
{quote}
Would you have a patch to attach that implements this?
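For discussion, the check described in the quote might look roughly like the sketch below. This is not a patch against the Cassandra code base: the class `ResponseSizeGuard`, the exception `ResponseTooLargeException`, and the byte limit are all hypothetical names invented for illustration, and the result is modelled as a plain map of row key to column values rather than the real Thrift types.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;

/** Hypothetical guard that rejects a result before it is serialized. */
public class ResponseSizeGuard {

    /** Hypothetical "regular" exception, as opposed to an OutOfMemoryError. */
    public static class ResponseTooLargeException extends RuntimeException {
        public ResponseTooLargeException(String message) {
            super(message);
        }
    }

    private final long maxBytes; // assumed limit, not an existing Cassandra setting

    public ResponseSizeGuard(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /**
     * Sums the sizes of all row keys and column values and fails fast once the
     * limit is exceeded. Returns the total payload size in bytes otherwise.
     */
    public long check(String columnFamily, Map<ByteBuffer, List<ByteBuffer>> rows) {
        long bytes = 0;
        long columns = 0;
        for (Map.Entry<ByteBuffer, List<ByteBuffer>> row : rows.entrySet()) {
            bytes += row.getKey().remaining();
            for (ByteBuffer value : row.getValue()) {
                bytes += value.remaining();
                columns++;
                if (bytes > maxBytes) {
                    throw new ResponseTooLargeException(
                        "Result for column family " + columnFamily
                        + " exceeds " + maxBytes + " bytes after "
                        + columns + " columns");
                }
            }
        }
        return bytes;
    }
}
```

A check like this only counts payload bytes; the serialized Thrift frame would be somewhat larger, so any real limit would need headroom.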
> Add column family name when reporting OutOfMemory errors
> --------------------------------------------------------
>
> Key: CASSANDRA-8259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8259
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jacek Furmankiewicz
>
> When we get a Thrift error like this which causes a server crash:
> {noformat}
> ERROR [Thrift:33] 2014-11-05 17:36:07,486 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:33,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2271)
> at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
> at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
> at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
> at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
> at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
> at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
> at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
> at org.apache.cassandra.thrift.Column.write(Column.java:538)
> at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
> at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
> at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
> at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14559)
> at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14463)
> at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.write(Cassandra.java:14393)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> INFO [StorageServiceShutdownHook] 2014-11-05 17:36:07,488 ThriftServer.java (line 141) Stop listening to thrift clients
> {noformat}
> we have no clue as to which column family was being queried. That makes it
> extremely difficult to troubleshoot which query in a complex code base caused
> this error.
> We have multiple servers and they all throw a NoAvailableHostException in
> Astyanax at the same time, all in different parts of the code...so figuring
> out the root cause is an exercise in frustration that takes many hours.
> At least listing the column family in this message would save us COUNTLESS
> hours of troubleshooting.
> We're on 2.0.8, JDK 1.7, RHEL 6
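One possible shape for the requested improvement, sketched under assumptions: since the trace shows the error surfacing in the Thrift serialization path rather than in the request handler, the column family would have to be remembered per thread and attached when the error is logged. The class `RequestContext` and its methods below are hypothetical and are not part of Cassandra or Thrift.

```java
/** Hypothetical per-thread record of the column family being served. */
public final class RequestContext {

    // Each Thrift worker thread would see only its own value.
    private static final ThreadLocal<String> COLUMN_FAMILY = new ThreadLocal<>();

    private RequestContext() {
    }

    /** Would be called by the request handler before running the query. */
    public static void set(String columnFamily) {
        COLUMN_FAMILY.set(columnFamily);
    }

    /** Would be called once the response has been fully written. */
    public static void clear() {
        COLUMN_FAMILY.remove();
    }

    /** Builds the log line for an error, naming the column family if known. */
    public static String annotate(Throwable error) {
        String columnFamily = COLUMN_FAMILY.get();
        return error + " (column family: "
            + (columnFamily == null ? "unknown" : columnFamily) + ")";
    }
}
```

The worker loop would then log `RequestContext.annotate(e)` when it catches an `OutOfMemoryError` and rethrow it. Building the message allocates a small string, which may itself fail when the heap is exhausted, so this is best-effort only.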