[jira] [Commented] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893218#comment-16893218
 ] 

Hudson commented on HDDS-1830:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16985 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16985/])
HDDS-1830 OzoneManagerDoubleBuffer#stop should wait for daemon thread to (arp7: 
rev b7fba78fb63a0971835db87292822fd8cd4aa7ad)
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java


> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Hanisha Koneru
> Assignee: Siyao Meng
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-24 Thread Siyao Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892225#comment-16892225
 ] 

Siyao Meng commented on HDDS-1830:
--

Thanks [~bharatviswa]. In a short discussion with [~arp] we further decided to 
make isRunning atomic (volatile should in theory already be fine, but it 
shouldn't hurt to use an atomic). I just posted a PR.

I'm not entirely sure about using a try-catch block in stop(), though. But I 
think it is too much hassle to add throws InterruptedException to every parent 
call (and someone might want to catch it eventually).

I ran TestOzoneManagerDoubleBufferWithOMResponse#testDoubleBuffer locally. It 
is no longer stuck.
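
For readers following along, the shape of stop() discussed above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from the PR: the class name StopSketch, its sleeping worker loop, and the use of compareAndSet are all hypothetical, and the thread does not say how the actual patch handles the caught InterruptedException. Restoring the interrupt flag in the catch block is one common way to avoid adding throws InterruptedException to every caller.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in, not the real OzoneManagerDoubleBuffer.
final class StopSketch {
  private final AtomicBoolean isRunning = new AtomicBoolean(false);
  private final Thread daemon;

  StopSketch() {
    // Placeholder worker: loops until the flag is cleared or it is interrupted.
    daemon = new Thread(() -> {
      try {
        while (isRunning.get()) {
          Thread.sleep(50);
        }
      } catch (InterruptedException e) {
        // Interrupted by stop(): fall through and let the thread end.
      }
    });
    daemon.setDaemon(true);
  }

  void start() {
    if (isRunning.compareAndSet(false, true)) {
      daemon.start();
    }
  }

  // Not synchronized. Returns true only for the call that actually stops the thread.
  boolean stop() {
    if (!isRunning.compareAndSet(true, false)) {
      return false; // never started, or already stopped
    }
    daemon.interrupt();
    try {
      daemon.join(); // wait for the worker thread to die
    } catch (InterruptedException e) {
      // Keep the interrupt visible to callers instead of declaring it.
      Thread.currentThread().interrupt();
    }
    return true;
  }

  boolean isDaemonAlive() {
    return daemon.isAlive();
  }

  public static void main(String[] args) {
    StopSketch sketch = new StopSketch();
    sketch.start();
    System.out.println("stopped: " + sketch.stop());
    System.out.println("daemon alive: " + sketch.isDaemonAlive());
  }
}
```

With compareAndSet, a second call to stop() in this sketch returns false without touching the thread again, which is the main thing an atomic flag adds over a volatile one here.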

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Hanisha Koneru
> Assignee: Siyao Meng
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 






[jira] [Commented] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-22 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890605#comment-16890605
 ] 

Bharat Viswanadham commented on HDDS-1830:
--

Hi [~smeng]

Thanks for the jstack. I now understand the root cause of this issue. We do 
call interrupt(), but to come out of wait() the other thread needs to reacquire 
the lock. Because stop() is a synchronized method, it holds that lock for as 
long as it runs, so even after the interrupt the other thread cannot come out 
of wait(). I think a simple solution is to remove synchronized from the stop 
method. I have verified that this works.
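
The lock interaction described above can be reproduced outside Ozone. Below is a minimal sketch, assuming a flush loop that parks in wait() on the buffer object's own monitor. The class FlushThreadSketch and its methods stopHoldingLock and stopWithoutLock are hypothetical names made up for this illustration, and the bounded join(timeoutMs) is used only so that the demonstration terminates instead of hanging.

```java
// Hypothetical stand-in, not the real OzoneManagerDoubleBuffer.
final class FlushThreadSketch {
  private volatile boolean running = true;
  private final Thread daemon = new Thread(this::flushLoop, "flush-sketch");

  FlushThreadSketch() {
    daemon.setDaemon(true);
  }

  void start() {
    daemon.start();
  }

  boolean isAlive() {
    return daemon.isAlive();
  }

  // Parks in wait() on this object's monitor, like a flush loop with nothing to flush.
  private synchronized void flushLoop() {
    while (running) {
      try {
        wait();
      } catch (InterruptedException e) {
        // Only reached after this thread has reacquired the monitor.
        return;
      }
    }
  }

  // Mirrors a synchronized stop(): the caller holds the monitor while joining.
  // Returns true if the thread died within timeoutMs.
  synchronized boolean stopHoldingLock(long timeoutMs) {
    running = false;
    daemon.interrupt();
    return joinQuietly(timeoutMs);
  }

  // Same steps without holding the monitor, so the thread can leave wait().
  boolean stopWithoutLock() {
    running = false;
    daemon.interrupt();
    return joinQuietly(0); // a timeout of 0 means wait until the thread dies
  }

  private boolean joinQuietly(long timeoutMs) {
    try {
      daemon.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !daemon.isAlive();
  }

  public static void main(String[] args) {
    FlushThreadSketch sketch = new FlushThreadSketch();
    sketch.start();
    System.out.println("died while stop held the lock: " + sketch.stopHoldingLock(200));
    System.out.println("died after unlocked stop: " + sketch.stopWithoutLock());
  }
}
```

In this sketch the first call reports that the thread is still alive, because an interrupted wait() cannot throw InterruptedException until the waiting thread gets the monitor back, and the caller is holding it. Once the caller no longer holds the monitor, the same interrupt-then-join sequence completes.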

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Hanisha Koneru
> Assignee: Siyao Meng
> Priority: Major
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 






[jira] [Commented] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-22 Thread Siyao Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890515#comment-16890515
 ] 

Siyao Meng commented on HDDS-1830:
--

I tested calling join() right after interrupt() in 
OzoneManagerDoubleBuffer#stop(). The calling thread then waits indefinitely:

{code:title=jstack}
"Thread-2" #14 prio=5 os_prio=31 tid=0x7fee3e997800 nid=0x6003 in Object.wait() [0x768fe000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00079591b858> (a org.apache.hadoop.util.Daemon)
        at java.lang.Thread.join(Thread.java:1252)
        - locked <0x00079591b858> (a org.apache.hadoop.util.Daemon)
        at java.lang.Thread.join(Thread.java:1326)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.stop(OzoneManagerDoubleBuffer.java:208)
        - locked <0x000795914f98> (a org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.stop(TestOzoneManagerDoubleBufferWithOMResponse.java:90)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:364)
        at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:104)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}
Note that I took two jstacks 2 minutes apart and got the exact same result.

{code:title=Calling join() in OzoneManagerDoubleBuffer#stop()}
  /**
   * Stop OM DoubleBuffer flush thread.
   */
  public synchronized void stop() {
    if (isRunning) {
      LOG.info("Stopping OMDoubleBuffer flush thread");
      isRunning = false;
      daemon.interrupt();

      try {
        daemon.join();
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
      System.out.println("!!! RETURNED");

      // stop metrics.
      ozoneManagerDoubleBufferMetrics.unRegister();
    } else {
      LOG.info("OMDoubleBuffer flush thread is not running.");
    }
  }
{code}

I ran the unit test 
TestOzoneManagerDoubleBufferWithOMResponse#testDoubleBuffer locally.

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Hanisha Koneru
> Assignee: Siyao Meng
> Priority: Major
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on daemon thread but not join(). The thread might still be 
> running when the call returns. 


