[ https://issues.apache.org/jira/browse/CASSANDRA-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946624#comment-14946624 ]
Robbie Strickland commented on CASSANDRA-10449:
-----------------------------------------------
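I increased the max heap to 96GB and tried again. Now nodetool netstats shows
that progress has ground to a halt; the output at 9pm and at 6am the next
morning is identical apart from the completed Responses count: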
9pm:
{noformat}
ubuntu@eventcass4x024:~$ nodetool netstats | grep -v 100%
Mode: JOINING
Bootstrap 45d8dec0-6c12-11e5-90ef-f7a8e02e59c0
/52.1.155.147 (using /10.239.209.15)
Receiving 139 files, 36548040412 bytes total. Already received 139
files, 36548040412 bytes total
/52.2.9.34 (using /10.239.209.17)
Receiving 171 files, 60000431853 bytes total. Already received 171
files, 60000431853 bytes total
/52.0.152.88 (using /10.239.209.44)
Receiving 147 files, 78458709168 bytes total. Already received 79
files, 55003961646 bytes total
/var/lib/cassandra/xvdd/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-295-Data.db
955162267/4105438496 bytes(23%) received from idx:0/52.0.152.88
/52.2.0.164 (using /10.239.209.16)
Receiving 141 files, 36700837768 bytes total. Already received 141
files, 36700837768 bytes total
/54.152.177.161 (using /10.239.209.93)
/54.172.174.48 (using /10.239.209.49)
Receiving 176 files, 79676288976 bytes total. Already received 98
files, 55932809644 bytes total
/var/lib/cassandra/xvdb/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-329-Data.db
174070078/7326235809 bytes(2%) received from idx:0/54.172.174.48
/52.2.75.82 (using /10.239.208.88)
/54.165.111.69 (using /10.239.209.47)
Receiving 170 files, 85920995638 bytes total. Already received 94
files, 54985226700 bytes total
/var/lib/cassandra/xvdd/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-265-Data.db
4875660361/22821083384 bytes(21%) received from idx:0/54.165.111.69
/52.6.136.30 (using /10.239.209.45)
Receiving 174 files, 87064163973 bytes total. Already received 91
files, 53930233899 bytes total
/var/lib/cassandra/xvdb/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-157-Data.db
17064156850/25823860172 bytes(66%) received from idx:0/52.6.136.30
/52.7.14.201 (using /10.239.209.46)
Receiving 164 files, 46351636573 bytes total. Already received 164
files, 46351636573 bytes total
/52.2.30.66 (using /10.239.209.18)
Receiving 158 files, 62899520151 bytes total. Already received 158
files, 62899520151 bytes total
/54.175.138.33 (using /10.239.209.96)
/54.88.44.178 (using /10.239.209.91)
/52.2.109.194 (using /10.239.208.89)
/54.172.81.117 (using /10.239.209.95)
/54.172.103.46 (using /10.239.209.48)
Receiving 164 files, 48771232182 bytes total. Already received 164
files, 48771232182 bytes total
/54.164.172.164 (using /10.239.209.94)
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 19 56
Responses n/a 0 35515795
{noformat}
6am:
{noformat}
ubuntu@eventcass4x024:~$ nodetool netstats | grep -v 100%
Mode: JOINING
Bootstrap 45d8dec0-6c12-11e5-90ef-f7a8e02e59c0
/52.1.155.147 (using /10.239.209.15)
Receiving 139 files, 36548040412 bytes total. Already received 139
files, 36548040412 bytes total
/52.2.9.34 (using /10.239.209.17)
Receiving 171 files, 60000431853 bytes total. Already received 171
files, 60000431853 bytes total
/52.0.152.88 (using /10.239.209.44)
Receiving 147 files, 78458709168 bytes total. Already received 79
files, 55003961646 bytes total
/var/lib/cassandra/xvdd/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-295-Data.db
955162267/4105438496 bytes(23%) received from idx:0/52.0.152.88
/52.2.0.164 (using /10.239.209.16)
Receiving 141 files, 36700837768 bytes total. Already received 141
files, 36700837768 bytes total
/54.152.177.161 (using /10.239.209.93)
/54.172.174.48 (using /10.239.209.49)
Receiving 176 files, 79676288976 bytes total. Already received 98
files, 55932809644 bytes total
/var/lib/cassandra/xvdb/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-329-Data.db
174070078/7326235809 bytes(2%) received from idx:0/54.172.174.48
/52.2.75.82 (using /10.239.208.88)
/54.165.111.69 (using /10.239.209.47)
Receiving 170 files, 85920995638 bytes total. Already received 94
files, 54985226700 bytes total
/var/lib/cassandra/xvdd/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-265-Data.db
4875660361/22821083384 bytes(21%) received from idx:0/54.165.111.69
/52.6.136.30 (using /10.239.209.45)
Receiving 174 files, 87064163973 bytes total. Already received 91
files, 53930233899 bytes total
/var/lib/cassandra/xvdb/data/prod_analytics_events/wuevents-ffa99ad05af911e596f05987bbaaffad/prod_analytics_events-wuevents-tmp-ka-157-Data.db
17064156850/25823860172 bytes(66%) received from idx:0/52.6.136.30
/52.7.14.201 (using /10.239.209.46)
Receiving 164 files, 46351636573 bytes total. Already received 164
files, 46351636573 bytes total
/52.2.30.66 (using /10.239.209.18)
Receiving 158 files, 62899520151 bytes total. Already received 158
files, 62899520151 bytes total
/54.175.138.33 (using /10.239.209.96)
/54.88.44.178 (using /10.239.209.91)
/52.2.109.194 (using /10.239.208.89)
/54.172.81.117 (using /10.239.209.95)
/54.172.103.46 (using /10.239.209.48)
Receiving 164 files, 48771232182 bytes total. Already received 164
files, 48771232182 bytes total
/54.164.172.164 (using /10.239.209.94)
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 19 56
Responses n/a 0 51933813
{noformat}
I have seen no additional long GC pauses.
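
For anyone who wants to check a stall like this without eyeballing two dumps, the per-peer "Already received" byte counts can be compared mechanically. The following is only a minimal Python sketch, assuming the netstats text has the shape shown above (including the mail line wrapping). The names `received_bytes` and `stalled_peers` are hypothetical helpers, not part of nodetool, and the sample text is trimmed to two of the peers from the output above.

```python
import re

# One peer block of `nodetool netstats` output, for example:
#   /52.0.152.88 (using /10.239.209.44)
#   Receiving 147 files, 78458709168 bytes total. Already received 79 files, 55003961646 bytes total
# \s+ between tokens lets the pattern match even when the mail client wrapped the line.
PEER_RE = re.compile(
    r"^[ \t]*/(?P<peer>\d+\.\d+\.\d+\.\d+)(?:\s+\(using\s+/[\d.]+\))?\s+"
    r"Receiving\s+\d+\s+files,\s+(?P<total>\d+)\s+bytes\s+total\.\s+"
    r"Already\s+received\s+\d+\s+files,\s+(?P<got>\d+)\s+bytes\s+total",
    re.MULTILINE,
)


def received_bytes(snapshot):
    """Map each streaming peer to (bytes received, bytes expected)."""
    return {
        m.group("peer"): (int(m.group("got")), int(m.group("total")))
        for m in PEER_RE.finditer(snapshot)
    }


def stalled_peers(earlier, later):
    """Peers whose stream is incomplete and gained no bytes between snapshots."""
    before = received_bytes(earlier)
    after = received_bytes(later)
    return [
        peer
        for peer, (got, total) in after.items()
        if got < total and before.get(peer, (None, None))[0] == got
    ]


# Two peers copied from the 9pm output; the 6am output is identical for both.
nine_pm = """
/52.1.155.147 (using /10.239.209.15)
Receiving 139 files, 36548040412 bytes total. Already received 139
files, 36548040412 bytes total
/52.0.152.88 (using /10.239.209.44)
Receiving 147 files, 78458709168 bytes total. Already received 79
files, 55003961646 bytes total
"""
six_am = nine_pm

for peer in stalled_peers(nine_pm, six_am):
    got, total = received_bytes(six_am)[peer]
    print(f"{peer}: no progress, {total - got} bytes still outstanding")
```

Peers that appear with no "Receiving" line are skipped by the pattern, so only sessions that report byte counts are compared.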
> OOM on bootstrap due to long GC pause
> -------------------------------------
>
> Key: CASSANDRA-10449
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10449
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Ubuntu 14.04, AWS
> Reporter: Robbie Strickland
> Labels: gc
> Fix For: 2.1.x
>
> Attachments: system.log.10-05
>
>
> I have a 20-node cluster (i2.4xlarge) with vnodes (default of 256) and
> 500-700GB per node. SSTable counts are <10 per table. I am attempting to
> provision additional nodes, but bootstrapping OOMs every time after about 10
> hours with a sudden long GC pause:
> {noformat}
> INFO [Service Thread] 2015-10-05 23:33:33,373 GCInspector.java:252 - G1 Old
> Generation GC in 1586126ms. G1 Old Gen: 49213756976 -> 49072277176;
> ...
> ERROR [MemtableFlushWriter:454] 2015-10-05 23:33:33,380
> CassandraDaemon.java:223 - Exception in thread
> Thread[MemtableFlushWriter:454,5,main]
> java.lang.OutOfMemoryError: Java heap space
> {noformat}
> I have tried increasing max heap to 48G just to get through the bootstrap, to
> no avail.
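
To put the GCInspector line quoted above in perspective: the pause lasted 1586126 ms, roughly 26 minutes, and the old generation shrank by only 49213756976 - 49072277176 = 141479800 bytes, under 0.3% of its size before the collection. The arithmetic can be reproduced with a small Python sketch. It assumes the log line has the shape quoted above with the mail wrapping removed; `parse_gc_line` is a hypothetical helper name, not a Cassandra API.

```python
import re

# Matches the GCInspector message shape quoted in the issue description:
#   "... - <collector> GC in <ms>ms. <pool>: <before> -> <after>;"
GC_RE = re.compile(
    r"- (?P<collector>.+?) GC in (?P<ms>\d+)ms\.\s+"
    r"(?P<pool>.+?): (?P<before>\d+) -> (?P<after>\d+);"
)


def parse_gc_line(line):
    """Return pause length and reclaimed bytes for one GCInspector line, or None."""
    m = GC_RE.search(line)
    if m is None:
        return None
    before = int(m.group("before"))
    after = int(m.group("after"))
    return {
        "collector": m.group("collector"),
        "pool": m.group("pool"),
        "pause_seconds": int(m.group("ms")) / 1000.0,
        "reclaimed_bytes": before - after,
        "reclaimed_fraction": (before - after) / before,
    }


# The log line from the issue description, joined back onto a single line.
line = (
    "INFO [Service Thread] 2015-10-05 23:33:33,373 GCInspector.java:252 - "
    "G1 Old Generation GC in 1586126ms. G1 Old Gen: 49213756976 -> 49072277176;"
)

gc = parse_gc_line(line)
print(
    f"{gc['collector']}: {gc['pause_seconds'] / 60:.1f} min pause, "
    f"{gc['reclaimed_bytes']} bytes reclaimed from {gc['pool']} "
    f"({gc['reclaimed_fraction']:.2%})"
)
```

A collection that runs this long and frees this little suggests the old generation was almost entirely live data at that moment, which matches the OutOfMemoryError logged immediately afterwards.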