[
https://issues.apache.org/jira/browse/CASSANDRA-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036409#comment-15036409
]
Ariel Weisberg commented on CASSANDRA-10730:
--------------------------------------------
I am not so concerned at this point about the maximum heap size. The free
number for the old generation looks odd, doesn't it? I wonder if we are looking
at a corrupt JVM. We could also try switching to the parallel collector and see
whether that produces a different error, a better error, or no error at all.
Here is the output for my local Eclipse instance:
{code}
Ariels-MBP:java aweisberg$ jmap -heap 250
Attaching to process ID 250, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.60-b23
using thread-local object allocation.
Parallel GC with 8 thread(s)
Heap Configuration:
MinHeapFreeRatio = 0
MaxHeapFreeRatio = 100
MaxHeapSize = 1073741824 (1024.0MB)
NewSize = 89128960 (85.0MB)
MaxNewSize = 357564416 (341.0MB)
OldSize = 179306496 (171.0MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 225443840 (215.0MB)
used = 3482008 (3.3207015991210938MB)
free = 221961832 (211.6792984008789MB)
1.5445123716842297% used
From Space:
capacity = 11534336 (11.0MB)
used = 0 (0.0MB)
free = 11534336 (11.0MB)
0.0% used
To Space:
capacity = 12582912 (12.0MB)
used = 0 (0.0MB)
free = 12582912 (12.0MB)
0.0% used
PS Old Generation
capacity = 613941248 (585.5MB)
used = 168407608 (160.60601043701172MB)
free = 445533640 (424.8939895629883MB)
27.430573943127534% used
42936 interned Strings occupying 4320240 bytes.
{code}
Here is the output after I switched to CMS:
{code}
Ariels-MBP:Eclipse aweisberg$ jmap -heap 7220
Attaching to process ID 7220, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.60-b23
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 4294967296 (4096.0MB)
NewSize = 697892864 (665.5625MB)
MaxNewSize = 697892864 (665.5625MB)
OldSize = 375848960 (358.4375MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 628162560 (599.0625MB)
used = 476857008 (454.7662811279297MB)
free = 151305552 (144.2962188720703MB)
75.91299424149061% used
Eden Space:
capacity = 558432256 (532.5625MB)
used = 407126712 (388.2662887573242MB)
free = 151305544 (144.29621124267578MB)
72.90530008352526% used
From Space:
capacity = 69730304 (66.5MB)
used = 69730296 (66.49999237060547MB)
free = 8 (7.62939453125E-6MB)
99.99998852722626% used
To Space:
capacity = 69730304 (66.5MB)
used = 0 (0.0MB)
free = 69730304 (66.5MB)
0.0% used
concurrent mark-sweep generation:
capacity = 375848960 (358.4375MB)
used = 22865096 (21.80585479736328MB)
free = 352983864 (336.6316452026367MB)
6.0835863427691805% used
47785 interned Strings occupying 4807056 bytes.
{code}
Here it is with G1 GC:
{code}
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 4294967296 (4096.0MB)
NewSize = 1363144 (1.2999954223632812MB)
MaxNewSize = 2576351232 (2457.0MB)
OldSize = 5452592 (5.1999969482421875MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 1048576 (1.0MB)
Heap Usage:
G1 Heap:
regions = 4096
capacity = 4294967296 (4096.0MB)
used = 186122248 (177.50000762939453MB)
free = 4108845048 (3918.4999923706055MB)
4.333496280014515% used
G1 Young Generation:
Eden Space:
regions = 68
capacity = 328204288 (313.0MB)
used = 71303168 (68.0MB)
free = 256901120 (245.0MB)
21.72523961661342% used
Survivor Space:
regions = 74
capacity = 77594624 (74.0MB)
used = 77594624 (74.0MB)
free = 0 (0.0MB)
100.0% used
G1 Old Generation:
regions = 37
capacity = 667942912 (637.0MB)
used = 36175880 (34.50000762939453MB)
free = 631767032 (602.4999923706055MB)
5.41601375657685% used
47715 interned Strings occupying 4797216 bytes.
{code}
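One way to check whether a reported "free" number is odd is to confirm that used + free equals capacity for each space, which holds for every space in the three outputs above. The following is only a rough sketch, assuming the `jmap -heap` text has been saved to a string. The helper names `parse_heap_spaces` and `inconsistent_spaces` are hypothetical, not part of any JDK or Cassandra tool.

```python
import re

# Matches lines such as "capacity = 613941248 (585.5MB)" in `jmap -heap` output.
FIELD_RE = re.compile(r"^(capacity|used|free)\s*=\s*(\d+)\b")


def parse_heap_spaces(text):
    """Return a list of (space_name, {"capacity": n, "used": n, "free": n})."""
    spaces = []
    name = None
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        match = FIELD_RE.match(stripped)
        if match:
            fields[match.group(1)] = int(match.group(2))
            if len(fields) == 3:
                spaces.append((name, fields))
                fields = {}
        elif stripped and "=" not in stripped and "%" not in stripped:
            # Treat any other non-empty line as a heading, for example
            # "PS Old Generation" or "Eden Space:".
            name = stripped.lstrip(">").rstrip(":")
            fields = {}
    return spaces


def inconsistent_spaces(spaces):
    """Return the names of spaces where used + free does not equal capacity."""
    return [
        name
        for name, f in spaces
        if f["used"] + f["free"] != f["capacity"]
    ]


# Old generation figures copied from the parallel collector output above.
SAMPLE = """PS Old Generation
capacity = 613941248 (585.5MB)
used = 168407608 (160.60601043701172MB)
free = 445533640 (424.8939895629883MB)
27.430573943127534% used
"""

if __name__ == "__main__":
    print(inconsistent_spaces(parse_heap_spaces(SAMPLE)))
```

Running the same check against the output from the failing dtest node would show whether its old generation numbers are internally inconsistent. The check is only arithmetic on the printed values, so it cannot say why a JVM reports bad numbers.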
> periodic timeout errors in dtest
> --------------------------------
>
> Key: CASSANDRA-10730
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10730
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jim Witschey
> Assignee: Jim Witschey
>
> Dtests often fail with connection timeout errors. For example:
> http://cassci.datastax.com/job/cassandra-3.1_dtest/lastCompletedBuild/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3/deletion_test/
> {code}
> ('Unable to connect to any servers', {'127.0.0.1':
> OperationTimedOut('errors=Timed out creating connection (10 seconds),
> last_host=None',)})
> {code}
> We've merged a PR to increase timeouts:
> https://github.com/riptano/cassandra-dtest/pull/663
> It doesn't look like this has improved things:
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/363/testReport/
> Next steps here are
> * to scrape Jenkins history to see if and how the number of tests failing
> this way has increased (it feels like it has). From there we can bisect over
> the dtests, ccm, or C*, depending on what looks like the source of the
> problem.
> * to better instrument the dtest/ccm/C* startup process to see why the nodes
> start but don't successfully make the CQL port available.