[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890515#comment-16890515
 ] 

Siyao Meng commented on HDDS-1830:
----------------------------------

I tested calling join() right after interrupt() in 
OzoneManagerDoubleBuffer#stop(). The calling thread then blocks in join() 
indefinitely:

{code:title=jstack}
"Thread-2" #14 prio=5 os_prio=31 tid=0x00007fee3e997800 nid=0x6003 in Object.wait() [0x00007000068fe000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000079591b858> (a org.apache.hadoop.util.Daemon)
        at java.lang.Thread.join(Thread.java:1252)
        - locked <0x000000079591b858> (a org.apache.hadoop.util.Daemon)
        at java.lang.Thread.join(Thread.java:1326)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.stop(OzoneManagerDoubleBuffer.java:208)
        - locked <0x0000000795914f98> (a org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.stop(TestOzoneManagerDoubleBufferWithOMResponse.java:90)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:364)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:104)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
Note: I took two jstack dumps 2 minutes apart and got exactly the same result.

{code:title=Calling join() in OzoneManagerDoubleBuffer#stop()}
  /**
   * Stop OM DoubleBuffer flush thread.
   */
  public synchronized void stop() {
    if (isRunning) {
      LOG.info("Stopping OMDoubleBuffer flush thread");
      isRunning = false;
      daemon.interrupt();

      try {
        daemon.join();
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
      System.out.println("!!! RETURNED");

      // stop metrics.
      ozoneManagerDoubleBufferMetrics.unRegister();
    } else {
      LOG.info("OMDoubleBuffer flush thread is not running.");
    }

  }
{code}

I ran the unit test 
TestOzoneManagerDoubleBufferWithOMResponse#testDoubleBuffer locally.
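
The jstack output shows stop() still holding the OzoneManagerDoubleBuffer monitor while it waits in join(). One possible explanation, which is an assumption here and has not been checked against the flush thread's code, is that the flush thread needs that same monitor before it can react to the interrupt. The sketch below illustrates that general pattern with hypothetical names (Buffer, flushLoop, stopHoldingLock, stopThenJoin); none of them come from Ozone, and the join is given a timeout only so that the example terminates.

```java
final class JoinUnderLockSketch {

  /** Hypothetical stand-in for a double buffer; not the real Ozone class. */
  static final class Buffer {
    private boolean running = true; // guarded by the Buffer monitor
    private final Thread worker = new Thread(this::flushLoop, "flush-sketch");

    Buffer() {
      worker.setDaemon(true);
      worker.start();
    }

    // Assumption: the worker parks in wait() on the Buffer's own monitor.
    private synchronized void flushLoop() {
      while (running) {
        try {
          wait(); // releases the monitor while parked
        } catch (InterruptedException e) {
          return; // reached only after the monitor has been re-acquired
        }
      }
    }

    // Mirrors the tested pattern: join() while still holding the monitor.
    synchronized boolean stopHoldingLock(long timeoutMs) {
      running = false;
      worker.interrupt();
      return joinAndCheck(timeoutMs);
    }

    // Alternative: signal under the lock, then join after releasing it.
    boolean stopThenJoin(long timeoutMs) {
      synchronized (this) {
        running = false;
        worker.interrupt();
      }
      return joinAndCheck(timeoutMs);
    }

    // Returns true if the worker has terminated within the timeout.
    private boolean joinAndCheck(long timeoutMs) {
      try {
        worker.join(timeoutMs); // bounded so the sketch always terminates
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return !worker.isAlive();
    }
  }

  /** Starts a fresh Buffer, stops it, and reports whether the worker exited. */
  static boolean workerExits(boolean joinWhileHoldingLock) {
    Buffer buffer = new Buffer();
    return joinWhileHoldingLock
        ? buffer.stopHoldingLock(500)
        : buffer.stopThenJoin(2000);
  }

  public static void main(String[] args) {
    System.out.println("join under lock, worker exited: " + workerExits(true));
    System.out.println("join after unlock, worker exited: " + workerExits(false));
  }
}
```

In this sketch the worker cannot leave wait() until it re-acquires the monitor, so a join() made while holding that monitor can only time out, whereas joining after the synchronized block lets the worker finish. Whether the real flush thread behaves this way would need to be confirmed from a jstack of the daemon thread itself.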

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> ------------------------------------------------------------------
>
>                 Key: HDDS-1830
>                 URL: https://issues.apache.org/jira/browse/HDDS-1830
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Hanisha Koneru
>            Assignee: Siyao Meng
>            Priority: Major
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 


