[ 
https://issues.apache.org/jira/browse/TRAFODION-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213070#comment-16213070
 ] 

Selvaganesan Govindarajan commented on TRAFODION-2780:
------------------------------------------------------

Thanks, Rao. Currently, BreakDialogue is called in the listener thread, 
which is also the SQL thread. But, reading the BreakDialogue code, it doesn't 
seem to switch to this thread. (It also can't, because the listener thread is 
stuck either doing SQL work or waiting for a message from the client.)

The other solution could be to do a select with a timeout equal to the 
connection idle timer. If the select times out, it is possible to call 
BreakDialogue, or whatever method the connection idle timer calls.

Let me make a change and try it out.

Selva


> Mxosrvr dumps core when connection idle timer expires at times
> --------------------------------------------------------------
>
>                 Key: TRAFODION-2780
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2780
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: connectivity-mxosrvr
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>
> Mxosrvr dumps core at times when the connection idle timer expires with the 
> following stack trace. This core is accompanied by mxssmp core.
> Thread 1 (Thread 0x7f95e8cd4a00 (LWP 47052)):
> #0  0x00007f95e4b135f7 in raise () from /lib64/libc.so.6
> #1  0x00007f95e4b14e28 in abort () from /lib64/libc.so.6
> #2  0x00007f95e1158bef in assert_botch_abend (f=f@entry=0x7f95e2ee50d7 
> "../executor/ex_root.cpp", l=l@entry=3055, 
>     m=m@entry=0x7f95e2ee5338 "Timeout waiting for control broker.", 
> c=c@entry=0x0) at ../export/NAAbort.cpp:277
> #3  0x00007f95e2da911b in ex_root_tcb::cbMessageWait (this=0x7f95af1d4d78, 
>     waitStartTime=waitStartTime@entry=212373626065770962) at 
> ../executor/ex_root.cpp:3055
> #4  0x00007f95e44515c4 in CliStatement::releaseTransaction 
> (this=this@entry=0x7f95e8b33d70, 
>     allWorkRequests=allWorkRequests@entry=1, 
> alwaysSendReleaseMsg=alwaysSendReleaseMsg@entry=0, 
>     statementRemainsOpen=statementRemainsOpen@entry=0) at 
> ../cli/Statement.cpp:965
> #5  0x00007f95e4451990 in CliStatement::releaseTcbs 
> (this=this@entry=0x7f95e8b33d70, 
>     closeAllOpens=closeAllOpens@entry=0) at ../cli/Statement.cpp:4306
> #6  0x00007f95e4451b33 in CliStatement::dealloc 
> (this=this@entry=0x7f95e8b33d70, 
>     closeAllOpens=closeAllOpens@entry=0) at ../cli/Statement.cpp:4394
> #7  0x00007f95e445269a in CliStatement::close 
> (this=this@entry=0x7f95e8b33d70, diagsArea=..., 
>     inRollback=inRollback@entry=0) at ../cli/Statement.cpp:1140
> #8  0x00007f95e4411704 in SQLCLI_PerformTasks(CliGlobals *, ULng32, 
> SQLSTMT_ID *, SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef __va_list_tag 
> __va_list_tag *, SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) 
> (cliGlobals=<optimized out>, 
>     tasks=1800, statement_id=<optimized out>, 
> input_descriptor=input_descriptor@entry=0x0, 
>     output_descriptor=output_descriptor@entry=0x0, 
> num_input_ptr_pairs=num_input_ptr_pairs@entry=0, 
>     num_output_ptr_pairs=num_output_ptr_pairs@entry=0, ap=ap@entry=0x0, 
> input_ptr_pairs=input_ptr_pairs@entry=0x0, 
>     output_ptr_pairs=output_ptr_pairs@entry=0x0) at ../cli/Cli.cpp:3465
> #9  0x00007f95e4411d04 in SQLCLI_CloseStmt (cliGlobals=<optimized out>, 
> statement_id=<optimized out>)
>     at ../cli/Cli.cpp:3518
> #10 0x00007f95e445c83f in SQL_EXEC_CloseStmt (statement_id=0x10eed778) at 
> ../cli/CliExtern.cpp:1432
> #11 0x00007f95e7379757 in SRVR::releaseCachedObject 
> (internalStmt=internalStmt@entry=0, 
>     mxsrvr_substate=mxsrvr_substate@entry=NDCS_CONN_IDLE) at 
> srvrcommon.cpp:764
> #12 0x00000000004bdf45 in SRVR::connIdleTimerExpired (timer_tag=<optimized 
> out>) at SrvrConnect.cpp:4648
> #13 0x0000000000490362 in BUILD_TIMER_MSG_CALL (call_id_=<optimized out>, 
> request=<optimized out>, 
>     countRead=<optimized out>, receive_info=<optimized out>) at 
> ../Common/FileSystemSrvr.cpp:601
> #14 0x0000000000492075 in CNSKListener::CheckReceiveMessage (this=0x263bea0, 
> cc=@0x7ffc86677a84: 6, countRead=16, 
>     call_id=<optimized out>) at ../Common/Listener.cpp:272
> #15 0x000000000049b14e in CNSKListenerSrvr::runProgram (this=0x263bea0, 
> TcpProcessName=<optimized out>, 
>     port=<optimized out>, TransportTrace=<optimized out>) at 
> Interface/linux/Listener_srvr_ps.cpp:508
> #16 0x0000000000483cc6 in runCEE (TransportTrace=0, portNumber=<optimized 
> out>, 
>     TcpProcessName=0x7ffc866789d0 "$ZTC0") at SrvrMain.cpp:167
> #17 main (argc=39, argv=0x7ffc8667a948, envp=<optimized out>) at 
> SrvrMain.cpp:864



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
