[ 
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287275#comment-15287275
 ] 

Kai Zheng commented on HDFS-9833:
---------------------------------

The big patch looks pretty good. Thanks, Rakesh! Here are some minor comments 
from a quick look.

* Unexpected change in PBHelperClient?
{code}
-    case ENTERING_MAINTENANCE:
-      return DatanodeInfoProto.AdminState.ENTERING_MAINTENANCE;
-    case IN_MAINTENANCE:
-      return DatanodeInfoProto.AdminState.IN_MAINTENANCE;
{code}

* Good idea to have {{StripedBlockReconstructor}} and 
{{StripedBlockChecksumReconstructor}} extend {{StripedReconstructor}}. 
In {{StripedBlockChecksumReconstructor}}, could {{md5Writer}} be renamed to 
something more general, like {{checksumWriter}}? And {{reconstructAndTransfer}} 
could be {{reconstruct}} or {{reconstructChecksum}}, as no transfer happens 
here.

* I think the original main comments in {{StripedReconstructor}} would be better 
left in place, because the rough idea still applies to the common base class and 
is shared by both subclasses.
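
To illustrate the naming suggestion above, here is a rough sketch of the shape I have in mind. It is not the patch code: the class names with the {{Sketch}} suffix, {{decode}}, {{finishChecksum}} and the {{transferred}} field are hypothetical, and XOR parity stands in for the real erasure decoding only to keep the example short.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical, simplified stand-ins for illustration; not the real HDFS classes.
abstract class StripedReconstructorSketch {
  // Shared step for both subclasses: rebuild the missing block from survivors.
  // Assumption: single XOR parity; a real codec (e.g. Reed-Solomon) differs.
  protected byte[] decode(byte[][] survivors) {
    byte[] out = new byte[survivors[0].length];
    for (byte[] s : survivors) {
      for (int i = 0; i < out.length; i++) {
        out[i] ^= s[i];
      }
    }
    return out;
  }

  abstract void reconstruct(byte[][] survivors);
}

// Rebuilds the block and hands it off (a field stands in for the remote write).
class BlockReconstructorSketch extends StripedReconstructorSketch {
  byte[] transferred;

  @Override
  void reconstruct(byte[][] survivors) {
    transferred = decode(survivors);
  }
}

// Rebuilds the block only to feed the checksum; nothing is transferred.
class BlockChecksumReconstructorSketch extends StripedReconstructorSketch {
  private final MessageDigest checksumWriter; // general name, not tied to MD5

  BlockChecksumReconstructorSketch(String algorithm)
      throws NoSuchAlgorithmException {
    checksumWriter = MessageDigest.getInstance(algorithm);
  }

  @Override
  void reconstruct(byte[][] survivors) {
    checksumWriter.update(decode(survivors));
  }

  // Completes the digest and returns the checksum of the reconstructed data.
  byte[] finishChecksum() {
    return checksumWriter.digest();
  }
}
```

With this shape, the checksum subclass only differs in what it does with the decoded bytes, so the general comments can stay on the base class.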

Looking forward to the formal patch!

> Erasure coding: recomputing block checksum on the fly by reconstructing the 
> missed/corrupt block data
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9833
>                 URL: https://issues.apache.org/jira/browse/HDFS-9833
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Rakesh R
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-9833-00-draft.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute a striped file checksum 
> even when some of the striped blocks are missing, we need to consider 
> recomputing the block checksum on the fly for the missing/corrupt blocks. To 
> recompute the block checksum, the block data needs to be reconstructed by 
> erasure decoding, and most of the code needed for block reconstruction could 
> be borrowed from HDFS-9719, the refactoring of the existing 
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be 
> written out to target datanodes, but in this case the remote write isn't 
> necessary, as the reconstructed block data is only used to recompute the 
> checksum.



