[
https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740087#comment-16740087
]
HUANG DUICAN commented on CASSANDRA-14953:
------------------------------------------
Do you need cassandra.yaml?
I have uploaded it.
Here are the JVM startup parameters:
/opt/huawei/apps/common/jdk/bin/java
-Xloggc:/opt/huawei/logs/cassandra/gc/gc.log -XX:+UseG1GC
-XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=70
-XX:ParallelGCThreads=16 -XX:ConcGCThreads=16 -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
-Xms49152M -Xmx49152M -ea -Xss256k -XX:+AlwaysPreTouch -XX:-UseBiasedLocking
-XX:StringTableSize=1000003 -XX:+UseTLAB -XX:+ResizeTLAB
-XX:+PerfDisableSharedMem -XX:CompileCommandFile=./../conf/hotspot_compiler
-javaagent:./../lib/jamm-0.3.0.jar -XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError
-Djava.net.preferIPv4Stack=true -Dcassandra.jmx.remote.port=7199
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=true
-Dcom.sun.management.jmxremote.password.file=./../conf/jmxremote.password
-Dcom.sun.management.jmxremote.access.file=./../conf/jmxremote.access
-Djava.library.path=./../lib/sigar-bin
-Dcassandra.config.loader=com.huawei.sds.plugin.config.YamlExtendConfigurationLoader
-Dcassandra.libjemalloc=./../lib/jemalloc/lib/libjemalloc.so
-Dlogback.configurationFile=logback.xml
-javaagent:./../lib/sds-cassandra-jmx-plugin-1.0.0.jar
-Dcassandra.logdir=/opt/huawei/logs/cassandra/logs
-Dcassandra.storagedir=./../data -cp
./../conf:./../build/classes/main:./../build/classes/thrift:./../lib/airline-0.6.jar:./../lib/antlr-runtime-3.5.2.jar:./../lib/apache-cassandra-3.0.15.jar:./../lib/apache-cassandra-clientutil-3.0.15.jar:./../lib/apache-cassandra-thrift-3.0.15.jar:./../lib/asm-5.0.4.jar:./../lib/cassandra-driver-core-3.0.1-shaded.jar:./../lib/cassandra-lucene-index-builder-3.0.14.0.jar:./../lib/cassandra-lucene-index-plugin-3.0.14.0.jar:./../lib/commons-cli-1.1.jar:./../lib/commons-codec-1.2.jar:./../lib/commons-lang3-3.1.jar:./../lib/commons-math3-3.2.jar:./../lib/compress-lzf-0.8.4.jar:./../lib/concurrentlinkedhashmap-lru-1.4.jar:./../lib/disruptor-3.0.1.jar:./../lib/ecj-4.4.2.jar:./../lib/guava-18.0.jar:./../lib/high-scale-lib-1.0.6.jar:./../lib/jackson-core-asl-1.9.2.jar:./../lib/jackson-mapper-asl-1.9.2.jar:./../lib/jamm-0.3.0.jar:./../lib/javax.inject.jar:./../lib/jbcrypt-0.3m.jar:./../lib/jcl-over-slf4j-1.7.7.jar:./../lib/jna-4.2.2.jar:./../lib/joda-time-2.4.jar:./../lib/json-simple-1.1.jar:./../lib/jstackjunit-0.0.1.jar:./../lib/libthrift-0.9.2.jar:./../lib/log4j-over-slf4j-1.7.7.jar:./../lib/logback-classic-1.1.3.jar:./../lib/logback-core-1.1.3.jar:./../lib/lz4-1.3.0.jar:./../lib/metrics-core-3.1.0.jar:./../lib/metrics-jvm-3.1.0.jar:./../lib/metrics-logback-3.1.0.jar:./../lib/netty-all-4.0.44.Final.jar:./../lib/ohc-core-0.4.3.jar:./../lib/ohc-core-j8-0.4.3.jar:./../lib/reporter-config3-3.0.0.jar:./../lib/reporter-config-base-3.0.0.jar:./../lib/sds-cassandra-auth-plugin-1.0.0.jar:./../lib/sds-cassandra-jmx-plugin-1.0.0.jar:./../lib/sds-secondary-index-plugin-2.0.5.101.jar:./../lib/sigar-1.6.4.jar:./../lib/slf4j-api-1.7.7.jar:./../lib/snakeyaml-1.11.jar:./../lib/snappy-java-1.1.1.7.jar:./../lib/ST4-4.0.8.jar:./../lib/stream-2.5.2.jar:./../lib/thrift-server-0.3.7.jar:./../lib/jsr223/*/*.jar
org.apache.cassandra.service.CassandraDaemon
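
For reference, here is roughly what the heap and G1 settings above work out to. This is only a sketch: the helper names (heap_bytes, ihop_percent) are made up for illustration, only the few flags it needs are copied in, and it assumes that InitiatingHeapOccupancyPercent is applied against the whole heap, as I understand it is on JDK 8.

```python
import re

# Subset of the startup flags listed above; only these are needed here.
JVM_FLAGS = (
    "-XX:+UseG1GC -XX:MaxGCPauseMillis=500 "
    "-XX:InitiatingHeapOccupancyPercent=70 -Xms49152M -Xmx49152M"
)

# JVM size suffixes are binary multiples (K = 1024 bytes, and so on).
UNIT_BYTES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def heap_bytes(flags, option="Xmx"):
    """Return the size given to -Xmx (or -Xms) in bytes."""
    match = re.search(rf"-{option}(\d+)([KMGkmg]?)", flags)
    if match is None:
        raise ValueError(f"-{option} not found")
    return int(match.group(1)) * UNIT_BYTES[match.group(2).upper()]


def ihop_percent(flags):
    """Return the -XX:InitiatingHeapOccupancyPercent value as an int."""
    match = re.search(r"-XX:InitiatingHeapOccupancyPercent=(\d+)", flags)
    if match is None:
        raise ValueError("InitiatingHeapOccupancyPercent not found")
    return int(match.group(1))


max_heap = heap_bytes(JVM_FLAGS)
# Assumption: the percentage is taken against the full heap size.
marking_threshold = max_heap * ihop_percent(JVM_FLAGS) // 100

print(f"max heap: {max_heap} bytes ({max_heap / 1024**3:.1f} GiB)")
print(f"concurrent marking should start near {marking_threshold} bytes")
```

The byte value for the maximum heap is useful for comparing against the "G1 Old Gen" figures in the log excerpt of the issue description, which are also in bytes.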
> Failed to reclaim the memory and too many MemtableReclaimMemory pending task
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-14953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14953
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Memtable
> Environment: version : cassandra 2.1.15
> jdk: 8
> os:suse
> Reporter: HUANG DUICAN
> Priority: Major
> Attachments: 1.PNG, 2.PNG, cassandra.yaml, cassandra_20190105.zip
>
>
> We found a large backlog of writes accumulating in Cassandra in our
> production environment, and our business experienced many write failures.
> According to system.log, MemtableReclaimMemory tasks were pending at first,
> and then at a certain moment a large number of MutationStage tasks piled up.
> Eventually the heap filled up, GC times reached tens of seconds, and
> nodetool reported the node status as DN, although the Cassandra process was
> still running. After we killed and restarted the node, the problem
> disappeared.
>
> Also, the number of active MemtableReclaimMemory threads seems to stay at 1
> (see 1.PNG), and a large number of MutationStage tasks piled up at a certain
> moment (see 2.PNG).
>
> Log excerpt showing the long GC times:
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Old Generation GC in 87121ms. G1 Old Gen: 51175946656 -> 50082999760;
> - MutationStage 128 11931622 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Young Generation GC in 969ms. G1 Eden Space:
> 1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584;
> - MutationStage 128 11953653 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Old Generation GC in 84785ms. G1 Old Gen:
> 51173518800 -> 50180911432;
> - MutationStage 128 11967484 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Young Generation GC in 611ms. G1 Eden Space: 989855744 -> 0; G1 Old
> Gen: 50180911432 -> 51153989960;
> - MutationStage 128 11975849 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Old Generation GC in 85845ms. G1 Old Gen:
> 51170767176 -> 50238295416;
> - MutationStage 128 11978192 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Young Generation GC in 602ms. G1 Eden Space: 939524096 -> 0; G1 Old
> Gen: 50238295416 -> 51161042296;
> - MutationStage 128 11994295 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Old Generation GC in 85307ms. G1 Old Gen:
> 51177819512 -> 50288829624; Metaspace: 36544536 -> 36525696
> - MutationStage 128 12001932 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MutationStage 128 12004395 1983820772 0 0
> - CounterMutationStage 0 0 0 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - MemtableReclaimMemory 1 156 24565 0 0
> - G1 Young Generation GC in 610ms. G1 Eden Space: 889192448 -> 0; G1 Old
> Gen: 50288829624 -> 51178022072;
> - MutationStage 128 12023677 1983820772 0 0
> Why is this happening?
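
To make the excerpt above easier to read, the lines can be parsed into numbers. The following is a minimal sketch, not a tool from the Cassandra project: the function names are hypothetical, the sample lines are copied from the excerpt with the wrapped GC lines joined, and it assumes the five thread pool columns are Active, Pending, Completed, Blocked and All Time Blocked, which is how I understand the StatusLogger output.

```python
import re

GC_RE = re.compile(r"(G1 \w+ Generation) GC in (\d+)ms\.")
OLD_GEN_RE = re.compile(r"G1 Old Gen: (\d+) -> (\d+)")
# Assumed column order: Active, Pending, Completed, Blocked, All Time Blocked.
POOL_RE = re.compile(r"(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$")


def parse_gc(line):
    """Return pause time and old-gen change for a GC line, else None."""
    gc = GC_RE.search(line)
    old = OLD_GEN_RE.search(line)
    if gc is None or old is None:
        return None
    before, after = int(old.group(1)), int(old.group(2))
    return {
        "collector": gc.group(1),
        "pause_ms": int(gc.group(2)),
        "old_gen_delta": after - before,  # negative means memory was freed
    }


def parse_pool(line):
    """Return the counters of a thread pool line, else None."""
    match = POOL_RE.search(line.strip())
    if match is None:
        return None
    keys = ("active", "pending", "completed", "blocked", "all_time_blocked")
    stats = {key: int(value) for key, value in zip(keys, match.groups()[1:])}
    stats["pool"] = match.group(1)
    return stats


old_gc = parse_gc(
    "G1 Old Generation GC in 87121ms. "
    "G1 Old Gen: 51175946656 -> 50082999760;"
)
young_gc = parse_gc(
    "G1 Young Generation GC in 969ms. G1 Eden Space: 1090519040 -> 0; "
    "G1 Old Gen: 50082999760 -> 51156741584;"
)
first = parse_pool("- MutationStage 128 11931622 1983820772 0 0")
last = parse_pool("- MutationStage 128 12023677 1983820772 0 0")

print(old_gc)
print(young_gc)
print("pending growth:", last["pending"] - first["pending"])
print("completed growth:", last["completed"] - first["completed"])
```

If the column assumption holds, the numbers show that each old generation collection of about 85 seconds frees only about 1 GB, which the next young collection promotes straight back into the old generation. Over the same period the pending MutationStage count keeps growing while the completed count does not change.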