[ 
https://issues.apache.org/jira/browse/TEZ-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492682#comment-17492682
 ] 

László Bodor edited comment on TEZ-4388 at 2/17/22, 10:44 AM:
--------------------------------------------------------------

Looks like this thread runs forever in the JVM:
{code}
"AsyncHttpClient-3-1" #38 prio=5 os_prio=31 tid=0x00007ff3cf981800 nid=0x9603 runnable [0x0000700007f8b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000007b6f85bd0> (a io.netty.channel.nio.SelectedSelectionKeySet)
        - locked <0x00000007b6f85be8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000007b6f85b80> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:813)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

"AsyncHttpClient-timer-1-1" #37 prio=5 os_prio=31 tid=0x00007ff3cebad800 nid=0x6903 waiting on condition [0x0000700007e88000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at io.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:600)
        at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:496)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)
{code}
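
Neither thread header in the dump carries the "daemon" marker, so both are non-daemon threads, and a JVM does not exit while a non-daemon thread is alive. To check this from inside a process without jstack, something like the following could be used. It is only a sketch: the class name NonDaemonThreads and the helper liveNonDaemonThreadNames are made up for illustration and are not part of Tez, and only the standard Thread API is used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Tez or Netty.
final class NonDaemonThreads {

    // Returns the names of all live threads that are not daemon threads,
    // i.e. the threads that can prevent the JVM from exiting.
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints at least the "main" thread; in a stuck TezChild one would
        // expect to also see names like the ones in the dump above.
        for (String name : liveNonDaemonThreadNames()) {
            System.out.println(name);
        }
    }
}
```

Calling this at the end of a task (for example from a debug log statement) would show which threads are still holding the process open.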

I think HashedWheelTimer ignores InterruptedException unless its worker state 
is already WORKER_STATE_SHUTDOWN, so it expects an explicit stop through the 
async http client:
{code}
        private long waitForNextTick() {
...
                try {
                    Thread.sleep(sleepTimeMs);
                } catch (InterruptedException ignored) {
                    if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                        return Long.MIN_VALUE;
                    }
                }
{code}
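
To illustrate why a plain interrupt would not be enough here, below is a minimal stdlib-only sketch of the same pattern. It is not Netty's code: the class InterruptIgnoringWorker, its STARTED/SHUTDOWN constants and the 50 ms tick are hypothetical names and values chosen for the example. The worker swallows InterruptedException and only leaves its loop once the state has been set to SHUTDOWN.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for a timer worker that ignores interrupts
// unless it has been explicitly stopped. Not Netty's implementation.
final class InterruptIgnoringWorker {
    private static final int STARTED = 0;
    private static final int SHUTDOWN = 1;

    private final AtomicInteger state = new AtomicInteger(STARTED);
    // Non-daemon by default, so a live worker keeps the JVM from exiting.
    private final Thread thread = new Thread(this::run, "demo-timer");

    private void run() {
        while (state.get() != SHUTDOWN) {
            try {
                Thread.sleep(50); // one "tick"
            } catch (InterruptedException ignored) {
                // Swallowed on purpose: the loop condition decides whether
                // to exit, so an interrupt alone changes nothing.
            }
        }
    }

    void start() { thread.start(); }

    // What a generic shutdown path might do: interrupt only.
    void interrupt() { thread.interrupt(); }

    // Explicit stop: flip the state first, then wake the sleeper.
    void stop() {
        state.set(SHUTDOWN);
        thread.interrupt();
    }

    boolean isAlive() { return thread.isAlive(); }

    void join(long millis) throws InterruptedException { thread.join(millis); }

    public static void main(String[] args) throws InterruptedException {
        InterruptIgnoringWorker worker = new InterruptIgnoringWorker();
        worker.start();

        worker.interrupt();
        worker.join(200);
        System.out.println("alive after interrupt: " + worker.isAlive());

        worker.stop();
        worker.join(2000);
        System.out.println("alive after stop: " + worker.isAlive());
    }
}
```

If HashedWheelTimer behaves the same way, interrupting the timer thread will not end it, and whoever owns the async http client has to stop it explicitly.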


was (Author: abstractdog):
the same comment with only the two thread dumps shown above, without the 
HashedWheelTimer note.

> TestSecureShuffle: TezChild processes keep running after test
> -------------------------------------------------------------
>
>                 Key: TEZ-4388
>                 URL: https://issues.apache.org/jira/browse/TEZ-4388
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: jstack.log
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> laszlobodor      96935   0.0  1.6  8398064 553604 s001  S     2:14PM   
> 0:06.21 
> /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java 
> -Xmx819m -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc 
> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC 
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withoutssl/yarn-466119913/org.apache.tez.test.TestSecureShuffle-withoutssl-logDir-nm-0_0/application_1644930832286_0002/container_1644930832286_0002_01_000002
>  -Dtez.root.logger=INFO,CLA 
> -Djava.io.tmpdir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withoutssl/yarn-466119913/org.apache.tez.test.TestSecureShuffle-withoutssl-localDir-nm-0_0/usercache/laszlobodor/appcache/application_1644930832286_0002/container_1644930832286_0002_01_000002/tmp
>  org.apache.tez.runtime.task.TezChild 192.168.0.52 55541 
> container_1644930832286_0002_01_000002 application_1644930832286_0002 1
> laszlobodor      96789   0.0  1.5  8136944 487980 s001  S     2:13PM   
> 0:06.06 
> /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java 
> -Xmx819m -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc 
> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC 
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withoutssl/yarn-466119913/org.apache.tez.test.TestSecureShuffle-withoutssl-logDir-nm-0_0/application_1644930832286_0001/container_1644930832286_0001_01_000002
>  -Dtez.root.logger=INFO,CLA 
> -Djava.io.tmpdir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withoutssl/yarn-466119913/org.apache.tez.test.TestSecureShuffle-withoutssl-localDir-nm-0_0/usercache/laszlobodor/appcache/application_1644930832286_0001/container_1644930832286_0001_01_000002/tmp
>  org.apache.tez.runtime.task.TezChild 192.168.0.52 55519 
> container_1644930832286_0001_01_000002 application_1644930832286_0001 1
> laszlobodor      96282   0.0  1.4  8258788 474648 s001  S     2:12PM   
> 0:06.69 
> /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java 
> -Xmx819m -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc 
> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC 
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withssl/yarn-466021610/org.apache.tez.test.TestSecureShuffle-withssl-logDir-nm-0_0/application_1644930734009_0002/container_1644930734009_0002_01_000002
>  -Dtez.root.logger=INFO,CLA 
> -Djava.io.tmpdir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withssl/yarn-466021610/org.apache.tez.test.TestSecureShuffle-withssl-localDir-nm-0_0/usercache/laszlobodor/appcache/application_1644930734009_0002/container_1644930734009_0002_01_000002/tmp
>  org.apache.tez.runtime.task.TezChild 192.168.0.52 55452 
> container_1644930734009_0002_01_000002 application_1644930734009_0002 1
> laszlobodor      96129   0.0  1.5  8402248 500904 s001  S     2:12PM   
> 0:06.89 
> /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java 
> -Xmx819m -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc 
> -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC 
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withssl/yarn-466021610/org.apache.tez.test.TestSecureShuffle-withssl-logDir-nm-0_0/application_1644930734009_0001/container_1644930734009_0001_01_000002
>  -Dtez.root.logger=INFO,CLA 
> -Djava.io.tmpdir=/Users/laszlobodor/apache/tez/tez-tests/target/tmp/org.apache.tez.test.TestSecureShuffle-withssl/yarn-466021610/org.apache.tez.test.TestSecureShuffle-withssl-localDir-nm-0_0/usercache/laszlobodor/appcache/application_1644930734009_0001/container_1644930734009_0001_01_000002/tmp
>  org.apache.tez.runtime.task.TezChild 192.168.0.52 55410 
> container_1644930734009_0001_01_000002 application_1644930734009_0001 1
> {code}
> After some investigation, I can see that the issue occurs in the async-http 
> client cases, regardless of SSL/non-SSL or positive/negative test cases.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
