[
https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097207#comment-15097207
]
Yongjun Zhang commented on HDFS-9612:
-------------------------------------
Thanks to [~jojochuang] for the work here and to [~3opan] for the review.
Overall the patch looks good. I have the following comments:
# About the following code
{code}
executor.shutdown();
executor.shutdownNow();
{code}
it looks a bit weird to me. Instead of calling two methods, why not just call
{{executor.shutdownNow()}}?
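For reference, a minimal sketch of the shutdown idiom I have in mind (the {{awaitTermination}} call and the timeout value are just my assumptions, not something taken from the patch):
{code}
// Sketch only: request immediate shutdown, then wait briefly for the
// worker threads to exit; the 10-second timeout is an arbitrary choice.
executor.shutdownNow();
try {
  if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
    LOG.warn("Worker threads did not terminate in time.");
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
}
{code}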
# Agree with Zoran that separating the log4j change into a different jira would
be better.
# About
{code}
try {
  work = inputQueue.take();
} catch (InterruptedException e) {
  LOG.debug("Interrupted while waiting for request from inputQueue.");
  // if interrupt is triggered by shutdown(), terminate the thread
  // otherwise, attempt to take again
  Thread.currentThread().interrupt();
  return;
}
boolean isDone = false;
while (!isDone) {
  try {
    // assume processor.processItem() is stateless
    WorkReport<R> result = processor.processItem(work);
    outputQueue.put(result);
    isDone = true;
  } catch (InterruptedException ie) {
    LOG.debug("Could not put report into outputQueue. Retrying...");
  }
}
{code}
## The call to {{Thread.currentThread().interrupt()}} can be dropped.
## If I understand it correctly, the comment "if interrupt is triggered by
shutdown(), terminate the thread; otherwise, attempt to take again" could be
improved, for example to: "If the interrupt happens while taking work out of
the queue, it is most likely triggered by the shutdown() call, so exit the
thread; if the interrupt happens while the work is being processed, go back
and process the same work again."
## The message "Could not put report into outputQueue" is not accurate, since
the interrupt can be triggered either from within {{processItem}} or from the
{{put}} operation.
## Add javadoc to this method, and probably to the class itself, to state that
it assumes {{processor.processItem()}} is stateless.
# About the test, would you please add a comment to indicate how the test
would fail without the fix, and why it passes with the fix?
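To make the intent explicit, the test could, for example, enumerate live threads after the ProducerConsumer is shut down and assert that no worker threads remain; without the fix the workers keep running and such an assertion fails, while with the fix it passes. A hypothetical helper (the thread-name prefix is a placeholder of my own, not the real name used by the pool):
{code}
// Hypothetical helper for the test: returns true if any live thread whose
// name starts with the given prefix is still around.
private static boolean hasLiveThreadWithPrefix(String namePrefix) {
  Thread[] threads = new Thread[Thread.activeCount() * 2];
  int n = Thread.enumerate(threads);
  for (int i = 0; i < n; i++) {
    if (threads[i] != null && threads[i].isAlive()
        && threads[i].getName().startsWith(namePrefix)) {
      return true;
    }
  }
  return false;
}
{code}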
Thanks.
> DistCp worker threads are not terminated after jobs are done.
> -------------------------------------------------------------
>
> Key: HDFS-9612
> URL: https://issues.apache.org/jira/browse/HDFS-9612
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.8.0
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch,
> HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch,
> HDFS-9612.006.patch
>
>
> In HADOOP-11827, a producer-consumer style thread pool was introduced to
> parallelize the task of listing files/directories.
> We have a use case where a distcp job is run during the commit phase of a MR2
> job. However, it was found that distcp does not terminate ProducerConsumer thread
> pools properly. Because threads are not terminated, those MR2 jobs never
> finish.
> In a more typical use case where distcp is run as a standalone job, those
> threads are terminated forcefully when the Java process exits, so these
> leaked threads do not become a problem.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)