[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment

2016-05-20 Thread Rajesh Babu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293886#comment-15293886
 ] 

Rajesh Babu commented on CASSANDRA-11847:
-

Thanks, Jeff, for your input. I'll ask my customer to replace their hardware.

Rajesh

> Cassandra dies on a specific node in a multi-DC environment
> ---
>
>         Key: CASSANDRA-11847
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-11847
>     Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction, Core
> Environment: Cassandra 2.0.11, JDK build 1.7.0_79-b15
>    Reporter: Rajesh Babu
> Attachments: java_error19030.log, java_error2912.log, java_error4571.log,
>              java_error7539.log, java_error9552.log
>
>
> We have a customer who runs a 16-node, 2-DC (8 nodes each) environment where 
> the Cassandra process dies randomly, but always on one specific node.
> Whenever Cassandra dies, the admin has to manually restart Cassandra on that 
> node.
> I tried upgrading their environment from Java 1.7 (patch 60) to Java 1.7 
> (patch 79), but the problem persists. 
> Is this a known hardware-related bug, or is this issue fixed in later 
> Cassandra versions? 
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f4542d5a27f, pid=19030, tid=139933154096896
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 
> 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libjava.so+0xe027f]  _fini+0xbd5f7
> #
> # Core dump written. Default location: /tmp/core or core.19030
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> ---  T H R E A D  ---
> Current thread (0x7f453c89f000):  JavaThread "COMMIT-LOG-WRITER" 
> [_thread_in_vm, id=19115, stack(0x7f44b9ed3000,0x7f44b9f14000)]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), 
> si_addr=0x7f4542d5a27f
> Registers:
> RAX=0x, RBX=0x7f453c564ad0, RCX=0x0001, 
> RDX=0x0020
> RSP=0x7f44b9f125a0, RBP=0x7f44b9f125b0, RSI=0x, 
> RDI=0x0001
> R8 =0x7f453c564ad8, R9 =0x4aab, R10=0x7f453917a52c, 
> R11=0x0006fae57068
> R12=0x7f453c564ad8, R13=0x7f44b9f125d0, R14=0x, 
> R15=0x7f453c89f000
> RIP=0x7f4542d5a27f, EFLAGS=0x00010246, CSGSFS=0x0033, 
> ERR=0x0014
>   TRAPNO=0x000e
> -
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f28e08787a4, pid=2912, tid=139798767699712
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 
> 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  0x7f28e08787a4
> #
> # Core dump written. Default location: /tmp/core or core.2912
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> ---  T H R E A D  ---
> Current thread (0x7f2640008000):  JavaThread "ValidationExecutor:15" 
> daemon [_thread_in_Java, id=7393, 
> stack(0x7f256fdf8000,0x7f256fe39000)]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), 
> si_addr=0x7f28e08787a4
> Registers:
> RAX=0x, RBX=0x3f8bb878, RCX=0xc77040d6, 
> RDX=0xc770409a
> RSP=0x7f256fe37430, RBP=0x00063b820710, RSI=0x00063b820530, 
> RDI=0x
> R8 =0x3f8bb888, R9 =0x, R10=0x3f8bb888, 
> R11=0x3f8bb878
> R12=0x, R13=0x00063b820530, R14=0x000b, 
> R15=0x7f2640008000
> RIP=0x7f28e08787a4, EFLAGS=0x00010246, CSGSFS=0x0033, 
> ERR=0x0015
>   TRAPNO=0x000e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment

2016-05-19 Thread Rajesh Babu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291882#comment-15291882
 ] 

Rajesh Babu commented on CASSANDRA-11847:
-

It is physical hardware (private cloud).

Manufacturer: Quanta Computer Inc
Product Name: QuantaPlex T41S-2U

I did initially think it was a RAM-related issue, and I swapped the RAM on 
that node with "SAMSUNG 16GB 288-Pin DDR4 SDRAM ECC Registered DDR4 2133 (PC4 
17000) Server Memory Model M393A2G40DB0-CPB", but that didn't help either. 
The server was stable for 3 days or so, and then Cassandra died again.

I just wanted to see whether this issue is caused by the Cassandra software 
(and maybe fixed in a later version, perhaps 2.0.17?).




[jira] [Commented] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment

2016-05-19 Thread Rajesh Babu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291687#comment-15291687
 ] 

Rajesh Babu commented on CASSANDRA-11847:
-

The Cassandra system log shows the following before the Cassandra process dies:

 INFO [CompactionExecutor:49] 2016-05-10 14:06:37,074 CompactionTask.java (line 115) Compacting
   [SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-266-Data.db'),
    SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-267-Data.db'),
    SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-268-Data.db'),
    SSTableReader(path='/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-265-Data.db')]
 INFO [CompactionExecutor:49] 2016-05-10 14:06:37,191 CompactionTask.java (line 287) Compacted 4 sstables to [/var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-269,].  742,551 bytes to 256,142 (~34% of original) in 116ms = 2.105828MB/s.  7,348 total partitions merged to 2,845.  Partition merge counts were {1:7348, }
 INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,693 ThriftServer.java (line 141) Stop listening to thrift clients
 INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,749 Server.java (line 182) Stop listening for CQL clients
 INFO [StorageServiceShutdownHook] 2016-05-10 14:11:16,749 Gossiper.java (line 1307) Announcing shutdown
 INFO [main] 2016-05-10 14:24:30,997 CassandraDaemon.java (line 135) Logging initialized
 INFO [main] 2016-05-10 14:24:31,028 YamlConfigurationLoader.java (line 80) Loading settings from file:/opt/cloudian-packages/apache-cassandra-2.0.11/conf/cassandra.yaml
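
For anyone reading the "Compacted" entry in the log above, the reported ratio and throughput can be checked by hand. The following is only a sketch in Python, assuming (this is not confirmed anywhere in this thread) that the throughput is the output size divided by the elapsed time, in units of 1024*1024 bytes per second. The function name compaction_stats is made up for illustration.

```python
def compaction_stats(bytes_in, bytes_out, elapsed_ms):
    """Return (output size as a percentage of input, throughput in MiB/s).

    Assumption: throughput is computed from the *output* size, which is
    what makes the numbers in the log entry line up.
    """
    percent_of_original = 100.0 * bytes_out / bytes_in
    mib_per_second = (bytes_out / (1024.0 * 1024.0)) / (elapsed_ms / 1000.0)
    return percent_of_original, mib_per_second


# Values copied from the "Compacted 4 sstables" log entry.
percent, rate = compaction_stats(742551, 256142, 116)
print("%.2f%% of original, %.6f MB/s" % (percent, rate))
```

Under that assumption the result agrees with the logged "~34% of original" and "2.105828MB/s", so the last compaction before the shutdown looks like a small, routine one on a system table.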



[jira] [Created] (CASSANDRA-11847) Cassandra dies on a specific node in a multi-DC environment

2016-05-19 Thread Rajesh Babu (JIRA)
Rajesh Babu created CASSANDRA-11847:
---

     Summary: Cassandra dies on a specific node in a multi-DC environment
         Key: CASSANDRA-11847
         URL: https://issues.apache.org/jira/browse/CASSANDRA-11847
     Project: Cassandra
  Issue Type: Bug
  Components: Compaction, Core
 Environment: Cassandra 2.0.11, JDK build 1.7.0_79-b15
    Reporter: Rajesh Babu
 Attachments: java_error19030.log, java_error2912.log, java_error4571.log,
              java_error7539.log, java_error9552.log

We have a customer who runs a 16-node, 2-DC (8 nodes each) environment where 
the Cassandra process dies randomly, but always on one specific node.

Whenever Cassandra dies, the admin has to manually restart Cassandra on that 
node.

I tried upgrading their environment from Java 1.7 (patch 60) to Java 1.7 (patch 
79), but the problem persists. 

Is this a known hardware-related bug, or is this issue fixed in later 
Cassandra versions? 

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f4542d5a27f, pid=19030, tid=139933154096896
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  [libjava.so+0xe027f]  _fini+0xbd5f7
#
# Core dump written. Default location: /tmp/core or core.19030
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---  T H R E A D  ---

Current thread (0x7f453c89f000):  JavaThread "COMMIT-LOG-WRITER" 
[_thread_in_vm, id=19115, stack(0x7f44b9ed3000,0x7f44b9f14000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), 
si_addr=0x7f4542d5a27f

Registers:
RAX=0x, RBX=0x7f453c564ad0, RCX=0x0001, 
RDX=0x0020
RSP=0x7f44b9f125a0, RBP=0x7f44b9f125b0, RSI=0x, 
RDI=0x0001
R8 =0x7f453c564ad8, R9 =0x4aab, R10=0x7f453917a52c, 
R11=0x0006fae57068
R12=0x7f453c564ad8, R13=0x7f44b9f125d0, R14=0x, 
R15=0x7f453c89f000
RIP=0x7f4542d5a27f, EFLAGS=0x00010246, CSGSFS=0x0033, 
ERR=0x0014
  TRAPNO=0x000e


-

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f28e08787a4, pid=2912, tid=139798767699712
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  0x7f28e08787a4
#
# Core dump written. Default location: /tmp/core or core.2912
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---  T H R E A D  ---

Current thread (0x7f2640008000):  JavaThread "ValidationExecutor:15" daemon 
[_thread_in_Java, id=7393, stack(0x7f256fdf8000,0x7f256fe39000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=2 (SEGV_ACCERR), 
si_addr=0x7f28e08787a4

Registers:
RAX=0x, RBX=0x3f8bb878, RCX=0xc77040d6, 
RDX=0xc770409a
RSP=0x7f256fe37430, RBP=0x00063b820710, RSI=0x00063b820530, 
RDI=0x
R8 =0x3f8bb888, R9 =0x, R10=0x3f8bb888, 
R11=0x3f8bb878
R12=0x, R13=0x00063b820530, R14=0x000b, 
R15=0x7f2640008000
RIP=0x7f28e08787a4, EFLAGS=0x00010246, CSGSFS=0x0033, 
ERR=0x0015
  TRAPNO=0x000e


