[ 
https://issues.apache.org/jira/browse/CASSANDRA-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-8259:
--------------------------------------
    Description: 
When we get a Thrift error like this which causes a server crash:
{noformat}
ERROR [Thrift:33] 2014-11-05 17:36:07,486 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:33,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
        at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
        at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
        at org.apache.cassandra.thrift.Column.write(Column.java:538)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14559)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14463)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.write(Cassandra.java:14393)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
 INFO [StorageServiceShutdownHook] 2014-11-05 17:36:07,488 ThriftServer.java (line 141) Stop listening to thrift clients
{noformat}

we have no clue as to which column family was being queried. That makes it 
extremely difficult to troubleshoot which query in a complex code base caused 
this error.

We have multiple servers and they all throw a NoAvailableHostException in
Astyanax at the same time, all in different parts of the code, so figuring out
the root cause is an exercise in frustration that takes many hours.

At least listing the column family in this message would save us COUNTLESS 
hours of troubleshooting.

We're on 2.0.8, JDK 1.7, RHEL 6

  was:
When we get a Thrift error like this which causes a server crash:
```
ERROR [Thrift:33] 2014-11-05 17:36:07,486 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:33,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
        at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
        at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
        at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
        at org.apache.cassandra.thrift.Column.write(Column.java:538)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
        at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14559)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14463)
        at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.write(Cassandra.java:14393)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
 INFO [StorageServiceShutdownHook] 2014-11-05 17:36:07,488 ThriftServer.java (line 141) Stop listening to thrift clients
```

we have no clue as to which column family was being queried. That makes it 
extremely difficult to troubleshoot which query in a complex code base caused 
this error.

We have multiple servers and they all throw a NoAvailableHostException in 
Astyanax at the same time, all in different parts of the code...so figuring out 
the root cause is an exercise in frustration that takes many hours.

At least listing the colum family in this message would save us COUNTLESS hours 
of troubleshooting.

We're on 2.0.8, JDK 1.7, RHEL 6


> Add column family name when reporting OutOfMemory errors
> --------------------------------------------------------
>
>                 Key: CASSANDRA-8259
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8259
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jacek Furmankiewicz
>            Priority: Critical
>
> When we get a Thrift error like this which causes a server crash:
> {noformat}
> ERROR [Thrift:33] 2014-11-05 17:36:07,486 CassandraDaemon.java (line 196) Exception in thread Thread[Thrift:33,5,main]
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2271)
>         at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>         at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>         at org.apache.thrift.transport.TFramedTransport.write(TFramedTransport.java:146)
>         at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:183)
>         at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:678)
>         at org.apache.cassandra.thrift.Column$ColumnStandardScheme.write(Column.java:611)
>         at org.apache.cassandra.thrift.Column.write(Column.java:538)
>         at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:673)
>         at org.apache.cassandra.thrift.ColumnOrSuperColumn$ColumnOrSuperColumnStandardScheme.write(ColumnOrSuperColumn.java:607)
>         at org.apache.cassandra.thrift.ColumnOrSuperColumn.write(ColumnOrSuperColumn.java:517)
>         at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14559)
>         at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.write(Cassandra.java:14463)
>         at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.write(Cassandra.java:14393)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
>  INFO [StorageServiceShutdownHook] 2014-11-05 17:36:07,488 ThriftServer.java (line 141) Stop listening to thrift clients
> {noformat}
> we have no clue as to which column family was being queried. That makes it 
> extremely difficult to troubleshoot which query in a complex code base caused 
> this error.
> We have multiple servers and they all throw a NoAvailableHostException in
> Astyanax at the same time, all in different parts of the code, so figuring
> out the root cause is an exercise in frustration that takes many hours.
> At least listing the column family in this message would save us COUNTLESS 
> hours of troubleshooting.
> We're on 2.0.8, JDK 1.7, RHEL 6



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
