[ https://issues.apache.org/jira/browse/TRAFODION-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669026#comment-16669026 ]
Selvaganesan Govindarajan commented on TRAFODION-3225:
------------------------------------------------------

Logger infrastructure in Trafodion is implemented via an internal class, QRLogger. QRLogger is a singleton within a process. It was suspected that the singleton instance was somehow getting clobbered, leading to all kinds of issues with the logger.

> Obscure cores seen in RMS and logger related code when Trafodion is stressed
> ----------------------------------------------------------------------------
>
> Key: TRAFODION-3225
> URL: https://issues.apache.org/jira/browse/TRAFODION-3225
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Reporter: Selvaganesan Govindarajan
> Assignee: Selvaganesan Govindarajan
> Priority: Major
>
> During stress testing of the enterprise edition of Trafodion, the following problems are seen.
>
> Thread 1 (Thread 0x7efee4046700 (LWP 26304)):
> #0 0x00007eff1ad045f7 in raise () from /lib64/libc.so.6
> #1 0x00007eff1ad05e28 in abort () from /lib64/libc.so.6
> #2 0x00007eff1609b94e in assert_botch_abend (f=0x7eff1a7eac75 "../cli/Statement.cpp", l=6178, m=0x7eff1a7eb9e0 "StmtStats_ is null after addQuery", c=0x0) at ../export/NAAbort.cpp:285
> #3 0x00007eff1a777df9 in Statement::setStmtStats (this=0x7efee2bd94d0, autoRetry=0) at ../cli/Statement.cpp:6178
> #4 0x00007eff1a6d08ed in SQLCLI_ExecDirect2(CliGlobals *, SQLSTMT_ID *, SQLDESC_ID *, Int32, SQLDESC_ID *, Lng32, typedef __va_list_tag __va_list_tag *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x22691d0, statement_id=0x4d45068, sql_source=0x7efee40420f0, prepFlags=0, input_descriptor=0x0, num_ptr_pairs=0, ap=0x7efee4041e90, ptr_pairs=0x0) at ../cli/Cli.cpp:3317
> #5 0x00007eff1a789c88 in SQL_EXEC_ExecDirect2 (statement_id=0x4d45068, sql_source=0x7efee40420f0, prep_flags=0, input_descriptor=0x0, num_ptr_pairs=0) at ../cli/CliExtern.cpp:2090
> #6 0x00007eff1d4c30ab in SRVR::WSQL_EXEC_ExecDirect (statement_id=0x4d45068, sql_source=0x7efee40420f0, input_descriptor=0x0, num_ptr_pairs=0) at SQLWrapper.cpp:364
> #7 0x00007eff1d4aaa8b in SRVR::EXECDIRECT (pSrvrStmt=0x4d44a50) at sqlinterface.cpp:4700
> #8 0x00007eff1d438280 in SRVR::ControlProc (pParam=0x4d44a50) at csrvrstmt.cpp:768
> #9 0x00007eff1d4378b7 in SRVR_STMT_HDL::ExecDirect (this=0x4d44a50, inCursorName=0x0, inSqlString=0x5f64aa8 "update Trafodion.\"_REPOS_\".metric_query_aggr_table set AGGREGATION_LAST_UPDATE_UTC_TS = CONVERTTIMESTAMP(212406077134960312),AGGREGATION_LAST_ELAPSED_TIME = 60000,TOTAL_EST_ROWS_ACCESSED = 0,TOTAL_EST"..., inStmtType=1, inSqlStmtType=0, inSqlAsyncEnable=0, inQueryTimeout=0) at csrvrstmt.cpp:450
> #10 0x000000000056f059 in SessionWatchDog (arg=0x0) at SrvrConnect.cpp:1194
> #11 0x00007eff1dbe1dc5 in start_thread () from /lib64/libpthread.so.0
> #12 0x00007eff1adc5ced in clone () from /lib64/libc.so.6
>
> And other obscure cores related to ExStatisticsArea.
>
> The logger infrastructure fails with the following stack trace, or with variations of it in the logger code.
>
> #0 0x00007f4afb8bb495 in raise () from /lib64/libc.so.6
> #1 0x00007f4afb8bcc75 in abort () from /lib64/libc.so.6
> #2 0x00007f4afa705a8d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
> #3 0x00007f4afa703be6 in ?? () from /usr/lib64/libstdc++.so.6
> #4 0x00007f4afa703c13 in std::terminate() () from /usr/lib64/libstdc++.so.6
> #5 0x00007f4afa70456f in __cxa_pure_virtual () from /usr/lib64/libstdc++.so.6
> #6 0x00007f4afb203f60 in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #7 0x00007f4afb38ba9f in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #8 0x00007f4afb38d47f in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #9 0x00007f4afb200ef2 in JVM_handle_linux_signal () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #10 0x00007f4afb1f6753 in ?? () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-3.b14.el6_9.x86_64/jre/lib/amd64/server/libjvm.so
> #11 <signal handler called>
> #12 0x00007f4ad6561e2c in log4cxx::helpers::Transcoder::decode (src="SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\200yh\002\000\000\000\000\b\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.HDFS`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\020eh\002\000\000\000\000\016\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.EXE.0\005\000\000\000\000\000\000@\000\000\000\000\000\000\000\200\210\210\004\000\000\000\000\a\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.Qmp\000\376\377\377\377\066\300\n\360org.apacp\005\000\000\000\000\000\000\060\000\000\000\000\000\000\000\017\000\000\000\000\000\000\000\017", '\000' <repeats 15 times>, "orc_proto.proto\000A", '\000' <repeats 11 times>..., dst="SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000") at transcoder.cpp:261
> #13 0x00007f4ad6517ca1 in log4cxx::LogManager::getLogger (name=<value optimized out>) at logmanager.cpp:120
> #14 0x00007f4ad6510c49 in log4cxx::Logger::getLogger (name=<value optimized out>) at logger.cpp:490
> #15 0x00007f4ad30fc9a1 in QRLogger::log (cat="SQL.HBas0\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\200yh\002\000\000\000\000\b\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.HDFS`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\020eh\002\000\000\000\000\016\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.EXE.0\005\000\000\000\000\000\000@\000\000\000\000\000\000\000\200\210\210\004\000\000\000\000\a\000\000\000\000\000\000\000\377\377\377\377\000\000\000\000SQL.Qmp\000\376\377\377\377\066\300\n\360org.apacp\005\000\000\000\000\000\000\060\000\000\000\000\000\000\000\017\000\000\000\000\000\000\000\017", '\000' <repeats 15 times>, "orc_proto.proto\000A", '\000' <repeats 11 times>..., level=LL_DEBUG, logMsgTemplate=0x7f4ad96d2430 "ExpHbaseInterface_JNI::init() creating new client.") at ../qmscommon/QRLogger.cpp:567
> #16 0x00007f4ad8e1e6b4 in ExpHbaseInterface_JNI::init (this=0x7f4ac7ef4870, hbs=0x0) at ../exp/ExpHbaseInterface.cpp:488

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)