Any update on it?
Jack Huang
Dell EMC | CTD MRES Cyclone Group
mobile +86-13880577652
jack.hu...@dell.com

From: Huang, Jack [mailto:jack.hu...@dell.com]
Sent: Monday, September 25, 2017 9:39 AM
To: user@trafodion.incubator.apache.org
Subject: RE: trafodion mxosrvr down

Hi,

Please see the output below and the attached dump file.

[root@trafodion apache-trafodion-2.1.0]# ls -lt core.*
-rw-------. 1 trafodion trafodion 2134945792 Sep 24 00:46 core.54809
-rw-------. 1 trafodion trafodion 2100211712 Sep 24 00:43 core.43521
-rw-------. 1 trafodion trafodion 2109403136 Sep 24 00:43 core.52966
-rw-------. 1 trafodion trafodion 2096242688 Sep 24 00:43 core.38905
-rw-------. 1 trafodion trafodion 2102181888 Sep 24 00:43 core.45648
-rw-r--r--. 1 trafodion trafodion  271522520 Jun 21 09:48 core.2017-06-21_09-48-29.ZSM000.16632.mxssmp
-rw-r--r--. 1 trafodion trafodion 4030914880 Jun 21 09:36 core.2017-06-21_09-36-27.Z000T2R.34396.tdm_udrserv

[trafodion@trafodion ~]$ sqvers -u
TRAF_HOME=/home/trafodion/apache-trafodion-2.1.0
who@host=trafodion@trafodion
JAVA_HOME=/usr/jdk64/jdk1.8.0_60
linux=2.6.32-642.el6.x86_64
redhat=6.8
NO patches
Most common Apache_Trafodion Release 2.1.0 (Build release [2.1.0-0-gdc3d97f], branch 20170406-no_branch, date 06Apr17)
UTT count is 2
[4] Apache Trafodion Release 2.1.0 (Build release [2.1.0-0-gdc3d97f], branch release2.1, date 06Apr17)
export/lib/jdbcT2.jar
export/lib/jdbcT4-2.1.0.jar
export/lib/jdbcT4.jar
export/lib/lib_mgmt.jar
[15] Apache_Trafodion Release 2.1.0 (Build release [2.1.0-0-gdc3d97f], branch release2.1, date 06Apr17)
export/lib/hbase-trx-apache1_0-2.1.0.jar
export/lib/hbase-trx-apache1_1-2.1.0.jar
export/lib/hbase-trx-apache1_2-2.1.0.jar
export/lib/hbase-trx-cdh5_4-2.1.0.jar
export/lib/hbase-trx-cdh5_5-2.1.0.jar
export/lib/hbase-trx-cdh5_7-2.1.0.jar
export/lib/hbase-trx-hdp2_3-2.1.0.jar
export/lib/sqmanvers.jar
export/lib/trafodion-dtm-apache-2.1.0.jar
export/lib/trafodion-dtm-cdh-2.1.0.jar
export/lib/trafodion-dtm-hdp-2.1.0.jar
export/lib/trafodion-sql-apache-2.1.0.jar
export/lib/trafodion-sql-cdh-2.1.0.jar
export/lib/trafodion-sql-hdp-2.1.0.jar
export/lib/trafodion-utility-2.1.0.jar

[trafodion@trafodion ~]$ sqcheck
*** Checking Trafodion Environment ***

Checking if processes are up.
Checking attempt: 1; user specified max: 2. Execution time in seconds: 1.

The Trafodion environment is up!

Process         Configured      Actual      Down
-------         ----------      ------      ----
DTM             2               2
RMS             4               4
DcsMaster       1               1
DcsServer       1               0           1
mxosrvr         256             2           254
RestServer      1               1

From: Selva Govindarajan [mailto:selva.govindara...@esgyn.com]
Sent: Friday, September 22, 2017 1:58 AM
To: user@trafodion.incubator.apache.org
Subject: RE: trafodion mxosrvr down

Can you please get the stack traces of some of the core files?

In the directory where the core files are found, issue

ls -lt core.*

The cores with earlier timestamps will be displayed at the end.

gdb mxosrvr <core_file>
thread apply all bt

Then send the stack traces of a few of these core files.

Please issue the following at the shell prompt to get the version of Trafodion installed:

sqvers -u

and send the output of this command too.

Selva

From: Liu, Yuan (Yuan) [mailto:yuan....@esgyn.cn]
Sent: Thursday, September 21, 2017 12:05 AM
To: user@trafodion.incubator.apache.org
Subject: RE: trafodion mxosrvr down

Mxosrvr is managed by dcsserver. You should run dcsstop and then dcsstart to restart DCS, or just run dcsstart.

Best regards,
Yuan

From: Huang, Jack [mailto:jack.hu...@dell.com]
Sent: Thursday, September 21, 2017 2:26 PM
To: user@trafodion.incubator.apache.org
Subject: RE: trafodion mxosrvr down

Several core dumps were found. Does anyone know how to restart the mxosrvr?
[root@trafodion apache-trafodion-2.1.0]# file core.40973
core.40973: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'mxosrvr -ZKHOST trafodion:2181 -RZ trafodion:1:24 -ZKPNODE /trafodion -CNGTO 60', real uid: 1003, effective uid: 1003, real gid: 502, effective gid: 502, execfn: '/home/trafodion/apache-trafodion-2.1.0/export/bin64/mxosrvr', platform: 'x86_64'

[root@trafodion apache-trafodion-2.1.0]# gdb /home/trafodion/apache-trafodion-2.1.0/export/bin64/mxosrvr core.40973
Core was generated by `mxosrvr -ZKHOST trafodion:2181 -RZ trafodion:1:24 -ZKPNODE /trafodion -CNGTO 60'.
Program terminated with signal 6, Aborted.
#0  0x0000003b30a325e5 in raise () from /lib64/libc.so.6

From: Huang, Jack [mailto:jack.hu...@dell.com]
Sent: Thursday, September 21, 2017 2:18 PM
To: user@trafodion.incubator.apache.org
Subject: trafodion mxosrvr down

Hi trafodioner,

I ran the Trafodion database workload with HammerDB for about 40 hours. Initially the actual mxosrvr count was 254, but now they are all down. Would you help to triage it? The database can no longer accept any connections.

[trafodion@trafodion ~]$ sqcheck
*** Checking Trafodion Environment ***

Checking if processes are up.
Checking attempt: 1; user specified max: 2. Execution time in seconds: 0.

The Trafodion environment is up!

Process         Configured      Actual      Down
-------         ----------      ------      ----
DTM             2               2
RMS             4               4
DcsMaster       1               1
DcsServer       1               0           1
mxosrvr         256             2           254
RestServer      1               1
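
The stack-trace steps Selva describes in the thread (list the cores, open each one in gdb against mxosrvr, run thread apply all bt) could be scripted roughly as follows. This is only a sketch under assumptions: the bt.<core>.txt output file names, the default of three cores, and the function names are made up for illustration. The binary path is the one reported by `file` in the thread, expressed relative to TRAF_HOME.

```shell
#!/bin/sh
# Sketch only: dump "thread apply all bt" for the newest core files.
# Assumptions (not from the thread): the bt.<core>.txt output names, the
# default of 3 cores, and that every selected core.* file came from mxosrvr.

# Binary path as reported by `file` in the thread, relative to TRAF_HOME.
MXOSRVR_BIN="${MXOSRVR_BIN:-$TRAF_HOME/export/bin64/mxosrvr}"

# Print up to $2 core files from directory $1, newest first (ls -t order).
newest_cores() {
    ls -t "$1"/core.* 2>/dev/null | head -n "$2"
}

# Run gdb in batch mode on each selected core and save the stack traces.
collect_stacks() {
    newest_cores "$1" "$2" | while IFS= read -r core; do
        out="$1/bt.${core##*/}.txt"
        echo "writing $out"
        gdb -batch -ex "thread apply all bt" "$MXOSRVR_BIN" "$core" > "$out" 2>&1
    done
}

if [ "$#" -ge 1 ]; then
    collect_stacks "$1" "${2:-3}"
else
    echo "usage: $0 <core_dir> [max_cores]" >&2
fi
```

Saved under a name of your choice and run with the core directory as its first argument, it writes one trace file per core. Cores produced by other processes (for example the mxssmp and tdm_udrserv cores in the listing) would need their own binary, which `file <core>` reports as execfn.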