[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment
[ https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293886#comment-15293886 ]

Rajesh Babu commented on CASSANDRA-11847:
-----------------------------------------

Thanks, Jeff, for your input. I'll ask my customer to replace their hardware.

Rajesh

> Cassandra dies on a specific node in a multi-DC environment
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-11847
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11847
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Core
>         Environment: Cassandra 2.0.11, JDK build 1.7.0_79-b15
>            Reporter: Rajesh Babu
>         Attachments: java_error19030.log, java_error2912.log,
>                      java_error4571.log, java_error7539.log, java_error9552.log
>
>
> We have a customer who runs a 16-node, 2-DC (8 nodes each) environment where
> the Cassandra process dies randomly, but always on one specific node.
> Whenever Cassandra dies, the admin has to manually restart Cassandra on that
> node only.
> I tried upgrading their environment from Java 1.7 (patch 60) to Java 1.7
> (patch 79), but it still seems to be an issue.
> Is this a known hardware-related bug, or is this issue fixed in later
> Cassandra versions?
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x7f4542d5a27f, pid=19030, tid=139933154096896
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C [libjava.so+0xe027f] _fini+0xbd5f7
> #
> # Core dump written. Default location: /tmp/core or core.19030
> #
> # If you would like to submit a bug report, please visit:
> # http://bugreport.java.com/bugreport/crash.jsp
> #
> ---------------  T H R E A D  ---------------
> Current thread (0x7f453c89f000): JavaThread "COMMIT-LOG-WRITER" [_thread_in_vm, id=19115, stack(0x7f44b9ed3000,0x7f44b9f14000)]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), si_addr=0x7f4542d5a27f
> Registers:
> RAX=0x, RBX=0x7f453c564ad0, RCX=0x0001, RDX=0x0020
> RSP=0x7f44b9f125a0, RBP=0x7f44b9f125b0, RSI=0x, RDI=0x0001
> R8 =0x7f453c564ad8, R9 =0x4aab, R10=0x7f453917a52c, R11=0x0006fae57068
> R12=0x7f453c564ad8, R13=0x7f44b9f125d0, R14=0x, R15=0x7f453c89f000
> RIP=0x7f4542d5a27f, EFLAGS=0x00010246, CSGSFS=0x0033, ERR=0x0014
> TRAPNO=0x000e
> ---------------------------------------------
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x7f28e08787a4, pid=2912, tid=139798767699712
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C 0x7f28e08787a4
> #
> # Core dump written. Default location: /tmp/core or core.2912
> #
> # If you would like to submit a bug report, please visit:
> # http://bugreport.java.com/bugreport/crash.jsp
> #
> ---------------  T H R E A D  ---------------
> Current thread (0x7f2640008000): JavaThread "ValidationExecutor:15" daemon [_thread_in_Java, id=7393, stack(0x7f256fdf8000,0x7f256fe39000)]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), si_addr=0x7f28e08787a4
> Registers:
> RAX=0x, RBX=0x3f8bb878, RCX=0xc77040d6, RDX=0xc770409a
> RSP=0x7f256fe37430, RBP=0x00063b820710, RSI=0x00063b820530, RDI=0x
> R8 =0x3f8bb888, R9 =0x, R10=0x3f8bb888, R11=0x3f8bb878
> R12=0x, R13=0x00063b820530, R14=0x000b, R15=0x7f2640008000
> RIP=0x7f28e08787a4, EFLAGS=0x00010246, CSGSFS=0x0033, ERR=0x0015
> TRAPNO=0x000e

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
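For anyone comparing the attached crash reports side by side, here is a minimal Python sketch that pulls the signal, pc, pid, problematic frame and thread name out of an hs_err-style header. The regular expressions and the `summarize` helper are my own assumptions, based only on the two excerpts quoted above; this is not an official parser, and other JVM versions may lay out the header differently.

```python
import re

# Header lines copied from the two crash reports quoted above.
REPORT_19030 = """\
# SIGSEGV (0xb) at pc=0x7f4542d5a27f, pid=19030, tid=139933154096896
# Problematic frame:
# C [libjava.so+0xe027f] _fini+0xbd5f7
Current thread (0x7f453c89f000): JavaThread "COMMIT-LOG-WRITER" [_thread_in_vm, id=19115, stack(0x7f44b9ed3000,0x7f44b9f14000)]
"""

REPORT_2912 = """\
# SIGSEGV (0xb) at pc=0x7f28e08787a4, pid=2912, tid=139798767699712
# Problematic frame:
# C 0x7f28e08787a4
Current thread (0x7f2640008000): JavaThread "ValidationExecutor:15" daemon [_thread_in_Java, id=7393, stack(0x7f256fdf8000,0x7f256fe39000)]
"""

# Hypothetical patterns, written against the excerpts above only.
SIGNAL_RE = re.compile(r"#\s+(SIG\w+) \(0x[0-9a-f]+\) at pc=(0x[0-9a-f]+), pid=(\d+)")
FRAME_RE = re.compile(r"# Problematic frame:\s*\n#\s*(.+)")
THREAD_RE = re.compile(r'JavaThread "([^"]+)"')


def summarize(report):
    """Return the key header fields of one crash report as a dict."""
    sig = SIGNAL_RE.search(report)
    frame = FRAME_RE.search(report)
    thread = THREAD_RE.search(report)
    return {
        "signal": sig.group(1) if sig else None,
        "pc": sig.group(2) if sig else None,
        "pid": int(sig.group(3)) if sig else None,
        "frame": frame.group(1).strip() if frame else None,
        "thread": thread.group(1) if thread else None,
    }


for report in (REPORT_19030, REPORT_2912):
    print(summarize(report))
```

Run over all five attachments, a summary like this makes it easy to see whether the crashes share a thread or frame. In the two excerpts quoted here they do not: one is in COMMIT-LOG-WRITER inside libjava.so, the other in ValidationExecutor:15 at an address with no symbol.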
[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment
[ https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291882#comment-15291882 ]

Rajesh Babu commented on CASSANDRA-11847:
-----------------------------------------

It is physical hardware (private cloud):

Manufacturer: Quanta Computer Inc
Product Name: QuantaPlex T41S-2U

I did initially think it was a RAM-related issue, and I swapped the RAM on that node with "SAMSUNG 16GB 288-Pin DDR4 SDRAM ECC Registered DDR4 2133 (PC4 17000) Server Memory Model M393A2G40DB0-CPB", but that didn't help either. The server was stable for 3 days or so, and then Cassandra died again.

I just wanted to see whether this issue is caused by the Cassandra software (and possibly fixed in a later version, maybe 2.0.17?).
[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment
[ https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291687#comment-15291687 ]

Rajesh Babu commented on CASSANDRA-11847:
-----------------------------------------

The Cassandra system log shows the following just before the Cassandra process dies:

INFO [CompactionExecutor:49] 2016-05-10 14:06:37,074 CompactionTask.java (line 115) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-266-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-267-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-268-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-265-Data.db')]
INFO [CompactionExecutor:49] 2016-05-10 14:06:37,191 CompactionTask.java (line 287) Compacted 4 sstables to [/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-269,]. 742,551 bytes to 256,142 (~34% of original) in 116ms = 2.105828MB/s. 7,348 total partitions merged to 2,845. Partition merge counts were {1:7348, }
INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,693 ThriftServer.java (line 141) Stop listening to thrift clients
INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,749 Server.java (line 182) Stop listening for CQL clients
INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,749 Gossiper.java (line 1307) Announcing shutdown
INFO [main] 2016-05-10 14:24:30,997 CassandraDaemon.java (line 135) Logging initialized
INFO [main] 2016-05-10 14:24:31,028 YamlConfigurationLoader.java (line 80) Loading settings from file:/opt/cloudian-packages/apache-cassandra-2.0.11/conf/cassandra.yaml
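As a quick sanity check on the figures in the system log from this comment, the compaction ratio, the throughput and the gap between shutdown and restart can be recomputed from the logged values. This is only a sketch: the variable names are mine, and reading "MB" as 1024 * 1024 bytes is an assumption that happens to reproduce the logged number.

```python
from datetime import datetime

# Figures copied from the "Compacted 4 sstables" log line.
bytes_in = 742_551
bytes_out = 256_142
elapsed_ms = 116

# Size of the output relative to the input (logged as ~34%).
ratio = bytes_out / bytes_in

# Throughput over the output bytes, assuming 1 MB = 1024 * 1024 bytes.
throughput_mb_s = bytes_out / (elapsed_ms / 1000) / (1024 * 1024)

# Timestamps of "Announcing shutdown" and the next "Logging initialized".
fmt = "%Y-%m-%d %H:%M:%S,%f"
shutdown = datetime.strptime("2016-05-10 14:11:16,749", fmt)
restart = datetime.strptime("2016-05-10 14:24:30,997", fmt)
downtime = restart - shutdown

print(f"ratio      = {ratio:.2%}")
print(f"throughput = {throughput_mb_s:.6f} MB/s")
print(f"downtime   = {downtime}")
```

The compaction itself looks unremarkable, and the node was away for a little over 13 minutes between the last shutdown message and the manual restart.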
[jira] [Created] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment
Rajesh Babu created CASSANDRA-11847:
---------------------------------------

             Summary: Cassandra dies on a specific node in a multi-DC environment
                 Key: CASSANDRA-11847
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11847
             Project: Cassandra
          Issue Type: Bug
          Components: Compaction, Core
         Environment: Cassandra 2.0.11, JDK build 1.7.0_79-b15
            Reporter: Rajesh Babu
         Attachments: java_error19030.log, java_error2912.log, java_error4571.log, java_error7539.log, java_error9552.log


We have a customer who runs a 16-node, 2-DC (8 nodes each) environment where the Cassandra process dies randomly, but always on one specific node.
Whenever Cassandra dies, the admin has to manually restart Cassandra on that node only.
I tried upgrading their environment from Java 1.7 (patch 60) to Java 1.7 (patch 79), but it still seems to be an issue.
Is this a known hardware-related bug, or is this issue fixed in later Cassandra versions?

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x7f4542d5a27f, pid=19030, tid=139933154096896
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libjava.so+0xe027f] _fini+0xbd5f7
#
# Core dump written. Default location: /tmp/core or core.19030
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  T H R E A D  ---------------
Current thread (0x7f453c89f000): JavaThread "COMMIT-LOG-WRITER" [_thread_in_vm, id=19115, stack(0x7f44b9ed3000,0x7f44b9f14000)]
siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), si_addr=0x7f4542d5a27f
Registers:
RAX=0x, RBX=0x7f453c564ad0, RCX=0x0001, RDX=0x0020
RSP=0x7f44b9f125a0, RBP=0x7f44b9f125b0, RSI=0x, RDI=0x0001
R8 =0x7f453c564ad8, R9 =0x4aab, R10=0x7f453917a52c, R11=0x0006fae57068
R12=0x7f453c564ad8, R13=0x7f44b9f125d0, R14=0x, R15=0x7f453c89f000
RIP=0x7f4542d5a27f, EFLAGS=0x00010246, CSGSFS=0x0033, ERR=0x0014
TRAPNO=0x000e
---------------------------------------------
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x7f28e08787a4, pid=2912, tid=139798767699712
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C 0x7f28e08787a4
#
# Core dump written. Default location: /tmp/core or core.2912
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
---------------  T H R E A D  ---------------
Current thread (0x7f2640008000): JavaThread "ValidationExecutor:15" daemon [_thread_in_Java, id=7393, stack(0x7f256fdf8000,0x7f256fe39000)]
siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), si_addr=0x7f28e08787a4
Registers:
RAX=0x, RBX=0x3f8bb878, RCX=0xc77040d6, RDX=0xc770409a
RSP=0x7f256fe37430, RBP=0x00063b820710, RSI=0x00063b820530, RDI=0x
R8 =0x3f8bb888, R9 =0x, R10=0x3f8bb888, R11=0x3f8bb878
R12=0x, R13=0x00063b820530, R14=0x000b, R15=0x7f2640008000
RIP=0x7f28e08787a4, EFLAGS=0x00010246, CSGSFS=0x0033, ERR=0x0015
TRAPNO=0x000e
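The register dumps in the report above end with TRAPNO and ERR. On x86-64, trap number 0xe (14) is the page-fault exception, and ERR is then the page-fault error code. The following is a rough sketch that decodes the low bits of that code; the bit meanings follow the Intel SDM as I understand it, the `decode_pf_error` helper is a hypothetical name of mine, and it assumes the ERR values in the dumps can be read as plain hex numbers.

```python
from typing import List

# Low bits of the x86 page-fault error code:
# (mask, meaning when the bit is set, meaning when it is clear or None).
PF_FLAGS = [
    (0x01, "protection violation on a present page", "page not present"),
    (0x02, "access was a write", "access was not a write"),
    (0x04, "fault raised in user mode", "fault raised in supervisor mode"),
    (0x08, "reserved bit set in a paging entry", None),
    (0x10, "fault was an instruction fetch", None),
]


def decode_pf_error(err: int) -> List[str]:
    """Describe the low five bits of a page-fault error code."""
    described = []
    for mask, when_set, when_clear in PF_FLAGS:
        if err & mask:
            described.append(when_set)
        elif when_clear is not None:
            described.append(when_clear)
    return described


# ERR values taken from the two register dumps above.
for err in (0x0014, 0x0015):
    print(hex(err), decode_pf_error(err))
```

Under that reading, both faults are user-mode instruction fetches, which fits pc and si_addr being identical in each report. By itself that does not say whether the cause is hardware or software.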