[
https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097207#comment-15097207
]
Yongjun Zhang commented on HDFS-9612:
-------------------------------------
Thanks to [~jojochuang] for the work here and to [~3opan] for the review.
Overall the patch looks good. I have the following comments:
# About the following code
{code}
executor.shutdown();
executor.shutdownNow();
{code}
it looks a bit weird to me. Instead of calling two methods, why not just call
{{executor.shutdownNow()}}?
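For reference, a minimal sketch of the shutdown idiom I have in mind (the {{awaitTermination}} call and the timeout value are just my assumptions, not something taken from the patch):
{code}
// Sketch only: request immediate shutdown, then wait briefly for the
// worker threads to exit; the 10-second timeout is an arbitrary choice.
executor.shutdownNow();
try {
  if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
    LOG.warn("Worker threads did not terminate in time.");
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
}
{code}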
# Agree with Zoran that separating the log4j change into a different jira would
be better.
# About
{code}
try {
  work = inputQueue.take();
} catch (InterruptedException e) {
  LOG.debug("Interrupted while waiting for request from inputQueue.");
  // if interrupt is triggered by shutdown(), terminate the thread
  // otherwise, attempt to take again
  Thread.currentThread().interrupt();
  return;
}
boolean isDone = false;
while (!isDone) {
  try {
    // assume processor.processItem() is stateless
    WorkReport<R> result = processor.processItem(work);
    outputQueue.put(result);
    isDone = true;
  } catch (InterruptedException ie) {
    LOG.debug("Could not put report into outputQueue. Retrying...");
  }
}
{code}
## The call to {{Thread.currentThread().interrupt()}} can be dropped.
## If I understand it correctly, the comment "if interrupt is triggered by
shutdown(), terminate the thread; otherwise, attempt to take again" could be
improved, for example to: "If the interrupt happens while taking work out of
the queue, it is most likely triggered by the shutdown() call, so exit the
thread; if the interrupt happens while the work is being processed, go back
and process the same work again."
## The message "Could not put report into outputQueue" is not accurate, since
the interrupt can be triggered either from within {{processItem}} or from the
{{put}} operation.
## Add javadoc to this method, and probably to the class itself, to state that
it assumes {{processor.processItem()}} is stateless.
# About the test, would you please add a comment to indicate how the test
would fail without the fix, and why it passes with the fix?
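To make the intent explicit, the test could, for example, enumerate live threads after the ProducerConsumer is shut down and assert that no worker threads remain; without the fix the workers keep running and such an assertion fails, while with the fix it passes. A hypothetical helper (the thread-name prefix is a placeholder of my own, not the real name used by the pool):
{code}
// Hypothetical helper for the test: returns true if any live thread whose
// name starts with the given prefix is still around.
private static boolean hasLiveThreadWithPrefix(String namePrefix) {
  Thread[] threads = new Thread[Thread.activeCount() * 2];
  int n = Thread.enumerate(threads);
  for (int i = 0; i < n; i++) {
    if (threads[i] != null && threads[i].isAlive()
        && threads[i].getName().startsWith(namePrefix)) {
      return true;
    }
  }
  return false;
}
{code}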
Thanks.
> DistCp worker threads are not terminated after jobs are done.
> -------------------------------------------------------------
>
> Key: HDFS-9612
> URL: https://issues.apache.org/jira/browse/HDFS-9612
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.8.0
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch,
> HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch,
> HDFS-9612.006.patch
>
>
> In HADOOP-11827, a producer-consumer style thread pool was introduced to
> parallelize the task of listing files/directories.
> We have a use case where a distcp job is run during the commit phase of a MR2
> job. However, it was found that distcp does not terminate ProducerConsumer thread
> pools properly. Because threads are not terminated, those MR2 jobs never
> finish.
> In a more typical use case where distcp is run as a standalone job, those
> threads are terminated forcefully when the Java process exits, so these
> leaked threads do not become a problem.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)