[
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890515#comment-16890515
]
Siyao Meng commented on HDDS-1830:
----------------------------------
I tested calling join() right after interrupt() in
OzoneManagerDoubleBuffer#stop(). The calling thread then waits indefinitely
in join():
{code:title=jstack}
"Thread-2" #14 prio=5 os_prio=31 tid=0x00007fee3e997800 nid=0x6003 in Object.wait() [0x00007000068fe000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x000000079591b858> (a org.apache.hadoop.util.Daemon)
	at java.lang.Thread.join(Thread.java:1252)
	- locked <0x000000079591b858> (a org.apache.hadoop.util.Daemon)
	at java.lang.Thread.join(Thread.java:1326)
	at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.stop(OzoneManagerDoubleBuffer.java:208)
	- locked <0x0000000795914f98> (a org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer)
	at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.stop(TestOzoneManagerDoubleBufferWithOMResponse.java:90)
	at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:364)
	at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:104)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
Note: I took two jstack dumps 2 minutes apart and got exactly the same result.
{code:title=Calling join() in OzoneManagerDoubleBuffer#stop()}
  /**
   * Stop OM DoubleBuffer flush thread.
   */
  public synchronized void stop() {
    if (isRunning) {
      LOG.info("Stopping OMDoubleBuffer flush thread");
      isRunning = false;
      daemon.interrupt();
      try {
        daemon.join();
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
      System.out.println("!!! RETURNED");
      // stop metrics.
      ozoneManagerDoubleBufferMetrics.unRegister();
    } else {
      LOG.info("OMDoubleBuffer flush thread is not running.");
    }
  }
{code}
I ran the unit test TestOzoneManagerDoubleBufferWithOMResponse#testDoubleBuffer
locally to reproduce this.
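One possible explanation, which the trace alone does not confirm because it shows only the joining thread: stop() holds the OzoneManagerDoubleBuffer monitor while it sits in join(). If the flush thread is blocked in wait() on that same monitor, it has to re-acquire the monitor before the InterruptedException can propagate, so it never exits. Below is a minimal sketch of a stop() that releases the monitor before joining and bounds the wait. StopJoinSketch, flushLoop(), waitForWork() and the timeout parameter are hypothetical names for illustration, not the real OzoneManagerDoubleBuffer code.

```java
/** Hypothetical stand-in for a class that owns a flush thread. */
final class StopJoinSketch {
  private volatile boolean isRunning = true;
  private final Thread daemon;

  StopJoinSketch() {
    daemon = new Thread(this::flushLoop, "flush-sketch");
    daemon.setDaemon(true);
    daemon.start();
  }

  // Assumed worker shape: loop until stopped, exit on interrupt.
  private void flushLoop() {
    while (isRunning) {
      try {
        waitForWork();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        return;
      }
    }
  }

  // Blocks in wait() on this object's monitor. After an interrupt, the
  // thread must re-acquire the monitor before wait() can throw.
  private synchronized void waitForWork() throws InterruptedException {
    while (isRunning) {
      wait();
    }
  }

  /** Returns true if the worker thread terminated within timeoutMillis. */
  boolean stop(long timeoutMillis) throws InterruptedException {
    synchronized (this) {
      if (isRunning) {
        isRunning = false;
        daemon.interrupt();
      }
    }
    // Join outside the synchronized block so the worker can re-acquire
    // the monitor and exit; the timeout keeps stop() from hanging forever.
    daemon.join(timeoutMillis);
    return !daemon.isAlive();
  }
}
```

Calling new StopJoinSketch().stop(5000) should return true once the worker has exited. If the flush thread in the real class does not touch the OzoneManagerDoubleBuffer monitor, this is not the cause, and a jstack of the daemon thread itself would be the next thing to check.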
> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> ------------------------------------------------------------------
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Hanisha Koneru
> Assignee: Siyao Meng
> Priority: Major
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls
> interrupt() on daemon thread but not join(). The thread might still be
> running when the call returns.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)