[
https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120426#comment-14120426
]
Colin Patrick McCabe commented on HDFS-6867:
--------------------------------------------
A few comments on the first version (pre-rebase):
{code}
+ DFSClient.LOG.debug("RecoveryWorker for block " + block + " running");
{code}
This should be INFO, and I would also include the client name and file name.
{code}
+// private boolean atRisk = false;
...
+// this.atRisk = true;
{code}
I think you meant to remove this.
Please don't use Thread.interrupt on the RecoveryWorker when closing. It gets
very messy (Thread.interrupt can close sockets and FDs that the thread is
listening on, etc.) Instead, use some kind of Atomic or condition variable to
communicate with this thread.
{code}
+ underBackgroundRecovery = true;
+ addDatanode2ExistingPipeline();
+ pipelineRecovered = true;
+ underBackgroundRecovery = false;
{code}
This will leave you in a bad state if there is an exception or other unexpected
control flow issue.
We should think about how to manage the communication between the writer thread
and recovery thread. There are a few canonical ways:
* locks plus condition variables
* AtomicReference with a state, implementing a state machine
* locks plus polling
volatile booleans are almost always a mistake.
> For DFSOutputStream, do pipeline recovery for a single block in the background
> ------------------------------------------------------------------------------
>
> Key: HDFS-6867
> URL: https://issues.apache.org/jira/browse/HDFS-6867
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Reporter: Colin Patrick McCabe
> Assignee: Zhe Zhang
> Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch,
> HDFS-6867-20140827.patch, HDFS-6867-20140828-1.patch,
> HDFS-6867-20140828-2.patch, HDFS-6867-design-20140820.pdf,
> HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf,
> HDFS-6867-design-20140827.pdf
>
>
> For DFSOutputStream, we should be able to do pipeline recovery in the
> background, while the user is continuing to write to the file. This is
> especially useful for long-lived clients that write to an HDFS file slowly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)