Andrew Sherman created IMPALA-10592:
---------------------------------------

             Summary: Exhaustive tests timeout after 20 hours
                 Key: IMPALA-10592
                 URL: https://issues.apache.org/jira/browse/IMPALA-10592
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 4.0
            Reporter: Andrew Sherman
         Attachments: catalogd_8661_20210317-045247.txt, 
hms_16762_jstack_20210317-045247.txt, impalad_8744_20210317-045247.txt.gz, 
impalad_8744_jstack_20210317-045312.txt, impalad_8747_20210317-045247.txt.gz, 
impalad_8754_20210317-045247.txt, namenode_10515_jstack_20210317-045247.txt, 
statestored_8645_20210317-045247.txt

The tests appear to make progress for nearly 10 hours, but after 20 hours they 
time out:
{code}
**** run-all-tests.sh TIMED OUT! ****
{code}
The timeout stack traces are attached.

Impala logs show a long period of inactivity between 03/16 16:58 and 03/17 
04:53.
For example:
{code}
I0316 16:56:33.555305  9911 impala-server.cc:1996] Catalog topic update applied with version: 65701 new min catalog object version: 36078
I0316 16:58:00.504211  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 6 items, to 5, eviction took: 0
I0316 16:58:10.504297  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 5 items, to 4, eviction took: 0
I0316 16:58:20.504348  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 4 items, to 3, eviction took: 0
I0316 16:58:30.504386  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 3 items, to 2, eviction took: 0
I0316 16:58:40.504467  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 2 items, to 1, eviction took: 0
I0316 16:58:50.504545  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 1 items, to 0, eviction took: 0
I0317 04:53:06.368000  9905 TAcceptQueueServer.cpp:340] New connection to server StatestoreSubscriber from client <Host: ::ffff:127.0.0.1 Port: 32818>
I0317 04:53:06.368041  9910 thrift-util.cc:96] TSocket::write_partial() send() <Host: ::ffff:127.0.0.1 Port: 36850>Broken pipe
W0317 04:53:06.369092  8780 init.cc:214] A process pause was detected for approximately 18s920ms
I0317 04:53:06.369904  9905 TAcceptQueueServer.cpp:340] New connection to server StatestoreSubscriber from client <Host: ::ffff:127.0.0.1 Port: 32822>
I0317 04:53:06.369961  9910 thrift-util.cc:96] TAcceptQueueServer client died: write() send(): Broken pipe
W0317 04:53:06.369966  8929 JvmPauseMonitor.java:205] Detected pause in JVM or host machine (eg GC): pause of approximately 18338ms
No GCs detected
I0317 04:53:06.370081 27248 thrift-util.cc:96] TSocket::write_partial() send() <Host: ::ffff:127.0.0.1 Port: 32818>Broken pipe
I0317 04:53:06.370126 27248 thrift-util.cc:96] TAcceptQueueServer client died: write() send(): Broken pipe
{code}
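
For anyone who wants to look for the same kind of gap in the other attached 
logs, here is a rough Python sketch that scans glog-style lines and reports 
consecutive timestamps that are far apart. It is not part of the Impala test 
tooling; the names parse_timestamp and find_gaps, the one-hour threshold, and 
the year 2021 (glog lines carry no year; it is taken from the attachment 
names) are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# glog prefix: severity letter, MMDD, HH:MM:SS.uuuuuu, thread id.
GLOG_PREFIX = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ ")


def parse_timestamp(line, year=2021):
    """Return the timestamp of a glog line, or None for continuation lines.

    The year is an assumption: glog does not log it, so logs that cross a
    year boundary would need extra handling.
    """
    match = GLOG_PREFIX.match(line)
    if match is None:
        return None
    stamp = f"{year}{match.group(1)} {match.group(2)}"
    return datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f")


def find_gaps(lines, threshold=timedelta(hours=1)):
    """Return (previous, current, delta) for consecutive log timestamps
    that are at least `threshold` apart."""
    gaps = []
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        if current is None:
            continue  # wrapped or multi-line message, e.g. "No GCs detected"
        if previous is not None and current - previous >= threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


# Three lines taken from the excerpt above.
sample = [
    "I0316 16:58:50.504545  9041 krpc-data-stream-mgr.cc:427] Reduced stream ID cache from 1 items, to 0, eviction took: 0",
    "No GCs detected",
    "I0317 04:53:06.368000  9905 TAcceptQueueServer.cpp:340] New connection to server StatestoreSubscriber from client <Host: ::ffff:127.0.0.1 Port: 32818>",
]

for start, end, delta in find_gaps(sample):
    print(f"{start} -> {end}: gap of {delta}")
```

On the sample lines this reports a single gap of just under 12 hours, which 
matches the inactivity window described above. To scan a whole log, pass an 
open file object as the lines argument.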




