[
https://issues.apache.org/jira/browse/CASSANDRA-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442538#comment-13442538
]
Tyler Hobbs commented on CASSANDRA-4573:
----------------------------------------
Vijay, if I'm reading the logs correctly, I'm not seeing any long garbage
collections. Here are the relevant logs from a run with a 2GB heap and a
400MB young generation:
{noformat}
{Heap before GC invocations=0 (full 0):
par new generation total 368640K, used 327680K [0x2f200000, 0x48200000,
0x48200000)
eden space 327680K, 100% used [0x2f200000, 0x43200000, 0x43200000)
from space 40960K, 0% used [0x43200000, 0x43200000, 0x45a00000)
to space 40960K, 0% used [0x45a00000, 0x45a00000, 0x48200000)
concurrent mark-sweep generation total 1687552K, used 0K [0x48200000,
0xaf200000, 0xaf200000)
concurrent-mark-sweep perm gen total 16384K, used 14333K [0xaf200000,
0xb0200000, 0xb3200000)
2012-08-27T12:03:56.096-0500: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 432013312
Max Chunk Size: 432013312
Number of Blocks: 1
Av. Block Size: 432013312
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
[ParNew
Desired survivor size 20971520 bytes, new threshold 1 (max 1)
- age 1: 2692712 bytes, 2692712 total
: 327680K->2642K(368640K), 0.0564410 secs] 327680K->2642K(2056192K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 431996928
Max Chunk Size: 431996928
Number of Blocks: 1
Av. Block Size: 431996928
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
, 0.0567720 secs] [Times: user=0.03 sys=0.00, real=0.06 secs]
Heap after GC invocations=1 (full 0):
par new generation total 368640K, used 2642K [0x2f200000, 0x48200000,
0x48200000)
eden space 327680K, 0% used [0x2f200000, 0x2f200000, 0x43200000)
from space 40960K, 6% used [0x45a00000, 0x45c94998, 0x48200000)
to space 40960K, 0% used [0x43200000, 0x43200000, 0x45a00000)
concurrent mark-sweep generation total 1687552K, used 0K [0x48200000,
0xaf200000, 0xaf200000)
concurrent-mark-sweep perm gen total 16384K, used 14333K [0xaf200000,
0xb0200000, 0xb3200000)
}
Total time for which application threads were stopped: 0.0576140 seconds
Total time for which application threads were stopped: 0.0080490 seconds
Total time for which application threads were stopped: 0.0000810 seconds
Total time for which application threads were stopped: 0.0000410 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000370 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000360 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000340 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000320 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000760 seconds
Total time for which application threads were stopped: 0.0000490 seconds
Total time for which application threads were stopped: 0.0000330 seconds
Total time for which application threads were stopped: 0.0000370 seconds
Total time for which application threads were stopped: 0.0000460 seconds
Total time for which application threads were stopped: 0.0000350 seconds
Total time for which application threads were stopped: 0.0004150 seconds
Total time for which application threads were stopped: 0.0001230 seconds
Total time for which application threads were stopped: 0.0035150 seconds
{noformat}
The client-side socket timeout is set to 3 seconds and the longest pause
above is about 58 ms, so garbage collection isn't what's causing the client
to fail. I should also note that the client-side error is different when the
client socket actually times out (something like
{{TTransportException: timed out reading 4 bytes}}).
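For anyone who wants to check a longer GC log against the same reasoning, here is a minimal sketch that extracts the "application threads were stopped" pauses and compares the worst one to a timeout. The function names and the sample text are only illustrative; it assumes the log contains lines in the format shown above, which the JVM prints when -XX:+PrintGCApplicationStoppedTime is enabled.

```python
import re

# Matches the pause lines shown in the log above.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def stopped_pauses(log_text):
    """Return every reported pause in the log, in seconds, in log order."""
    return [float(m.group(1)) for m in STOPPED_RE.finditer(log_text)]


def exceeds_timeout(log_text, timeout_secs):
    """True if any single pause is at least as long as the timeout."""
    pauses = stopped_pauses(log_text)
    return bool(pauses) and max(pauses) >= timeout_secs


# The first three pause lines from the log above, used as sample input.
sample = """\
Total time for which application threads were stopped: 0.0576140 seconds
Total time for which application threads were stopped: 0.0080490 seconds
Total time for which application threads were stopped: 0.0000810 seconds
"""

pauses = stopped_pauses(sample)
print("longest pause (s):", max(pauses))
print("hits 3 s timeout:", exceeds_timeout(sample, 3.0))
```

This only looks at individual pauses. It would not catch a case where many short pauses add up within a single request, although nothing in the log above suggests that is happening.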
> HSHA doesn't handle large messages gracefully
> ---------------------------------------------
>
> Key: CASSANDRA-4573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4573
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Tyler Hobbs
> Assignee: Vijay
> Attachments: repro.py
>
>
> HSHA doesn't seem to enforce any kind of max message length, and when
> messages are too large, it doesn't fail gracefully.
> With debug logs enabled, you'll see this:
> {{DEBUG 13:13:31,805 Unexpected state 16}}
> Which seems to mean that there's a SelectionKey that's valid, but isn't ready
> for reading, writing, or accepting.
> Client-side, you'll get this Thrift error (while trying to read a frame as
> part of {{recv_batch_mutate}}):
> {{TTransportException: TSocket read 0 bytes}}