[ 
https://issues.apache.org/jira/browse/TRAFODION-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696091#comment-14696091
 ] 

Atanu Mishra commented on TRAFODION-325:
----------------------------------------

Matt Brown (mattbrown-2) wrote on 2014-05-29:   #1
Looks like we’re hitting the bug below. The affected Java version is listed as 
7u10, and the bug is fixed in 8.

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8009460

See the “Problematic frame” line and the stack trace below (shown in bold in the original report).

/opt/hp/squser2/gselva140525/dcs-0.7.0-beta> cat hs_err_pid28720.log
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ffff6d875cd, pid=28720, tid=140737068099328
#
# JRE version: 7.0_09-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x6995cd] MachNode::in_RegMask(unsigned int) const+0x3d
#
# Core dump written. Default location: /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/core or core.28720
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x0000000000896800): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=28758, stack(0x00007fffe6e36000,0x00007fffe6f37000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000

Registers:
RAX=0x00007ffff68d0db0, RBX=0x0000000000000001, RCX=0x00007fffe6f32980, RDX=0x0000000000000e4f
RSP=0x00007fffe6f32780, RBP=0x00007fffe6f327b0, RSI=0x0000000000000005, RDI=0x00007ffff733ce70
R8 =0x0000000000000060, R9 =0x00000000000000ff, R10=0x00000000000000ff, R11=0x00007ffff6a2c620
R12=0x0000000001c11740, R13=0x0000000000000001, R14=0x0000000000000005, R15=0x0000000000000005
RIP=0x00007ffff6d875cd, EFLAGS=0x0000000000010297, CSGSFS=0x0000000000000033, ERR=0x0000000000000000
  TRAPNO=0x000000000000000d

Top of Stack: (sp=0x00007fffe6f32780)
0x00007fffe6f32780: 00007fffe6f327b0 00000000018a7538
0x00007fffe6f32790: 0000000000fa7948 0000000000000005
0x00007fffe6f327a0: 0000000000000005 0000000001c11740
0x00007fffe6f327b0: 00007fffe6f32820 00007ffff6a33325
0x00007fffe6f327c0: 0000000000009850 00007fffe6f33f90
0x00007fffe6f327d0: 00000086000097f0 00000000010dd880
0x00007fffe6f327e0: 00000008e6f32820 ffffffff00000006
0x00007fffe6f327f0: 0000000001a9c8f0 00007fffe6f329e0
0x00007fffe6f32800: 0000000000000008 0000000000000040
0x00007fffe6f32810: 00007ffff73e08c0 00000000000001bb
0x00007fffe6f32820: 00007fffe6f32a30 00007ffff6a35b9f
0x00007fffe6f32830: 0000000002bfaf80 00007fffe6f32920
0x00007fffe6f32840: 00007fffe6f329e0 00007fffe6f32980
0x00007fffe6f32850: 00007fffe6f33f90 00000008000000d1
0x00007fffe6f32860: 000000000111e748 00007fffe6f329a0
0x00007fffe6f32870: 00007fffe6f329e0 00007fffe6f32a40
0x00007fffe6f32880: 00007fffe6f32930 00007ffff6d3f5e3
0x00007fffe6f32890: 0000000000000048 0000000700000004
0x00007fffe6f328a0: 00007fffe6f340f8 000000000111e738
0x00007fffe6f328b0: 000000070102b170 00007fff00000007
0x00007fffe6f328c0: 0000000000897270 00007ffff6f56496
0x00007fffe6f328d0: 000000000102b150 00007ffff6f563e7
0x00007fffe6f328e0: 00007fffe6f32930 00007ffff6918c61
0x00007fffe6f328f0: 0000000001d687f0 0000000000000000
0x00007fffe6f32900: 000000000000000c 00007ffff6f56496
0x00007fffe6f32910: 00007fffe6f3...
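
For triage, the two header fields that matter here are the JRE version and the problematic frame. The following is a minimal sketch, not part of the original report, of pulling both out of an hs_err header. The function name summarize_hs_err is hypothetical, and the sketch assumes the frame sits on the line right after "# Problematic frame:" and that lines have not been re-wrapped by a mail client.

```python
import re

def summarize_hs_err(text):
    """Return (jre_version, problematic_frame) from an hs_err header, or None for a missing field."""
    jre = re.search(r"^#\s*JRE version:\s*(\S+)", text, re.MULTILINE)
    # Assumption: the frame is the single line following "# Problematic frame:".
    frame = re.search(r"^#\s*Problematic frame:\s*\n#\s*(.+)$", text, re.MULTILINE)
    return (
        jre.group(1) if jre else None,
        # Collapse runs of spaces so differently padded copies compare equal.
        " ".join(frame.group(1).split()) if frame else None,
    )

# Header lines copied from the log above.
header = """\
#
# SIGSEGV (0xb) at pc=0x00007ffff6d875cd, pid=28720, tid=140737068099328
#
# JRE version: 7.0_09-b05
# Problematic frame:
# V [libjvm.so+0x6995cd] MachNode::in_RegMask(unsigned int) const+0x3d
#
"""

jre_version, frame = summarize_hs_err(header)
print(jre_version)
print(frame)
```

A frame inside libjvm.so on a "C2 CompilerThread" points at the JIT compiler rather than at application code, which is why the JRE version is the first thing to check.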

Changed in trafodion:
status: New → Confirmed
assignee: nobody → Matt Brown (mattbrown-2)

Matt Brown (mattbrown-2) wrote on 2014-05-29:   #2
Looks like the fix was backported to Java 7u40 (that is, Java 7 update 40). 
Spinel appears to be on Java 7 update 9.
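
A quick way to check a node against that threshold is to compare its JRE version string with 7u40. This is a rough sketch only: the helper names parse_jre_version and has_fix are hypothetical, the (7, 40) threshold is taken from the comment above, and the "1.7.0_NN" form is assumed to be what java.version reports on Java 7.

```python
import re

# Per the comment above: fix backported to Java 7 update 40 (and present in 8).
FIXED_IN = (7, 40)

def parse_jre_version(version):
    """Parse '7.0_09-b05' (hs_err style) or '1.7.0_09' (java.version style) into (major, update)."""
    m = re.match(r"^(?:1\.)?(\d+)\.\d+(?:_(\d+))?", version)
    if not m:
        raise ValueError("unrecognized JRE version: %r" % version)
    # A missing update suffix is treated as update 0.
    return int(m.group(1)), int(m.group(2) or 0)

def has_fix(version):
    """True if the version is at or past 7u40. Only meaningful for Java 7 and later."""
    return parse_jre_version(version) >= FIXED_IN

print(has_fix("7.0_09-b05"))  # version string from the crash log
print(has_fix("1.7.0_60"))    # assumed string form for Java 7 update 60
```

The tuple comparison orders first by major version and then by update number, so any Java 8 build also counts as fixed.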

Guy Groulx (guy-groulx) wrote on 2014-05-30:    #3
We've installed JVM 7 update 60.
Will see if problem returns. If not, will close.

Stacey Johnson (sjohnson-w) on 2014-06-10
information type: Proprietary → Public

Guy Groulx (guy-groulx) wrote on 2014-06-10:    #4
Problem has not happened since switching to Java 7u40.

Changed in trafodion:
status: Confirmed → Fix Released


> LP Bug: 1324573 - DCS master died after many connections
> --------------------------------------------------------
>
>                 Key: TRAFODION-325
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-325
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: connectivity-dcs
>            Reporter: Guy Groulx
>            Assignee: Matt Brown
>            Priority: Blocker
>             Fix For: 1.0 (pre-incubation)
>
>
> On spinel, we've been pushing DCS connectivity. We're now running up to 
> 1024 connections and may go higher.
> In a series of tests, each doing 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 
> 1024 connections, the DCS master disappeared on us.
> core:  2014-05-29 07:11:33 /local/cores/1008/core.1401347493.n001.28720.java
> hs_err_pid:
> /opt/hp/squser2/gselva140525/dcs-0.7.0-beta> ls
> bin  conf  dcs-0.7.0-beta.jar  dcs-webapps  docs  hs_err_pid28720.log  lib  LICENSE.txt  logs  NOTICE.txt
> /opt/hp/squser2/gselva140525/dcs-0.7.0-beta> 
> logs:
> /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/logs> cat dcs-squser2-1-master-n001.out
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007ffff6d875cd, pid=28720, tid=140737068099328
> #
> # JRE version: 7.0_09-b05
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x6995cd]  MachNode::in_RegMask(unsigned int) const+0x3d
> #
> # Core dump written. Default location: /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/core or core.28720
> #
> # An error report file with more information is saved as:
> # /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/hs_err_pid28720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> #
> /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/logs> 
> /opt/hp/squser2/gselva140525/dcs-0.7.0-beta/logs> ls -lt | head
> total 72252
> -rw-r----- 1 squser2 seaquest    38961 May 29 11:37 dcs-squser2-815-server-n001.log
> -rw-r----- 1 squser2 seaquest      703 May 29 07:11 dcs-squser2-1-master-n001.out
> -rw-r----- 1 squser2 seaquest  4000199 May 29 07:11 dcs-squser2-1-master-n001.log  <== Does not have anything interesting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
