[
https://issues.apache.org/jira/browse/HADOOP-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602222#action_12602222
]
Hadoop QA commented on HADOOP-3113:
-----------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12383353/tmpFile.patch
against trunk revision 662976.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac
compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of
release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2566/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2566/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2566/artifact/trunk/build/test/checkstyle-errors.html
Console output:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2566/console
This message is automatically generated.
> DFSOutputStream.flush() should flush data to real block file on DataNode.
> -------------------------------------------------------------------------
>
> Key: HADOOP-3113
> URL: https://issues.apache.org/jira/browse/HADOOP-3113
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: noTmpFile.patch, noTmpFile.patch, tmpFile.patch,
> tmpFile.patch, tmpFile.patch, tmpFile.patch
>
>
> DFSOutputStream has a method called flush() that persists block locations on
> the namenode and sends all outstanding data to all datanodes in the pipeline.
> However, this data goes to a tmp file on the datanode(s). When the block is
> closed, the tmp file is renamed to be the real block file. If the
> datanode(s) die before the block is complete, the entire block is lost. This
> behaviour will be fixed in HADOOP-1700.
> However, in the short term, a configuration parameter can be used to allow
> datanodes to write to the real block file directly, thereby avoiding writing
> to the tmp file. This means that data that is flushed successfully by a
> client does not get lost even if the datanode(s) or client dies.
> The Namenode already has code to pick the largest replica (if multiple
> datanodes have different sizes of this block). Also, the namenode has code to
> not trigger replication request if the file is still being written to.
> The only caveat that I can think of is that the block report periodicity
> should be much, much smaller than the lease timeout period. A block report
> adds the being-written-to blocks to the blocksMap, thereby avoiding any
> cleanup that lease-expiry processing might otherwise have done.
> Not all requirements specified by HADOOP-1700 are supported by this approach,
> but it could still be helpful (in the short term) for a wide range of
> applications.
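The two write strategies the description contrasts can be illustrated with a small stdlib-only Java sketch. This is not Hadoop code; the class name, method names, and file layout are hypothetical, standing in for the datanode's tmp-file staging versus the proposed direct write to the real block file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of the tradeoff described in this issue.
public class BlockWriteSketch {

    // Current behavior: flushed bytes land in a tmp file; only the rename
    // on close makes them visible at the real block path. If the process
    // dies before close, everything in the tmp file is lost.
    static void writeViaTmp(Path block, byte[] data) throws IOException {
        Path tmp = block.resolveSibling(block.getFileName() + ".tmp");
        Files.write(tmp, data); // flushed data sits only in the tmp file
        // the "close" step: atomically promote tmp to the real block file
        Files.move(tmp, block, StandardCopyOption.ATOMIC_MOVE);
    }

    // Proposed option: append straight to the real block file, so each
    // flush is durable at the final path even if the writer dies early.
    static void writeDirect(Path block, byte[] data) throws IOException {
        Files.write(block, data,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blocks");
        writeViaTmp(dir.resolve("blk_1"), "hello".getBytes());
        writeDirect(dir.resolve("blk_2"), "world".getBytes());
        System.out.println(Files.exists(dir.resolve("blk_1"))); // true
        System.out.println(Files.exists(dir.resolve("blk_2"))); // true
    }
}
```

Under the direct-write scheme, replicas may differ in length after a crash, which is why the description leans on the existing namenode logic that picks the largest replica.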