[ 
https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494594#comment-13494594
 ] 

LiuLei commented on HDFS-3429:
------------------------------

Here is my understanding of this problem. There are two cases in which the DN 
needs to read checksums from the meta file:
1. The server needs to verify the checksum, for example in the block scanner.
2. The DFSClient needs to verify the checksum. In this case the DN reads the 
checksum but does not verify it; instead, the DN sends the checksum to the 
DFSClient, and the DFSClient verifies it.
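
To make case 2 concrete, here is a minimal sketch of client-side verification. 
It is not HDFS code: the class ClientSideVerify and its methods are 
hypothetical names, and java.util.zip.CRC32 stands in for whatever checksum 
type the block actually uses. The point is only that the DN ships the stored 
checksum untouched and the comparison happens on the client.

```java
import java.util.zip.CRC32;

// Hypothetical illustration only; these names do not exist in HDFS.
final class ClientSideVerify {

    // Computes a CRC32 over one chunk of data, the way a writer would have
    // done before storing the value in the meta file.
    static long crcOf(byte[] chunk) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return crc.getValue();
    }

    // Client side: recompute the checksum over the received bytes and compare
    // it with the value the DN sent. The DN itself never verifies anything.
    static boolean verify(byte[] receivedChunk, long checksumFromDn) {
        return crcOf(receivedChunk) == checksumFromDn;
    }
}
```

If the client has disabled verification, nothing ever calls verify(), so the 
checksum the DN read and sent is wasted work.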

So we need two parameters to indicate the two purposes.
1. The BlockSender constructor already has a verifyChecksum parameter, which 
indicates whether the server verifies the checksum.
2. FileSystem.setVerifyChecksum(boolean verifyChecksum) indicates whether the 
DFSClient verifies the checksum, so we need to send that value to the DN and 
add an isClientVerifyChecksum parameter to the BlockSender constructor.

If both verifyChecksum and isClientVerifyChecksum are false, the DN does not 
need to read the checksum and only needs to send data to the client. In that 
case we only need to create a DataChecksum.CHECKSUM_NULL instance, which 
guarantees that the DN does not read the checksum from the meta file (because 
the checksumSize of a DataChecksum.CHECKSUM_NULL instance is 0).
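
The decision can be sketched roughly as below. This is a simplified model and 
not the patch itself: ChecksumChoice, Kind, choose() and checksumBytesToRead() 
are hypothetical names, and Kind.CRC32 is an assumed stand-in for whatever 
checksum type the meta file header declares. Only the two flag names and the 
"NULL checksum has size 0" property come from the description above.

```java
// Hypothetical model of the proposal; not the real BlockSender or DataChecksum.
final class ChecksumChoice {

    // Stand-in for the checksum type. NULL mirrors DataChecksum.CHECKSUM_NULL,
    // whose checksumSize is 0; a CRC32 checksum is 4 bytes per chunk.
    enum Kind {
        NULL(0),
        CRC32(4);

        final int checksumSize;

        Kind(int checksumSize) {
            this.checksumSize = checksumSize;
        }
    }

    // Checksums are needed if either the server or the client will verify.
    // Only when both flags are false does the sender fall back to NULL.
    static Kind choose(boolean verifyChecksum, boolean isClientVerifyChecksum) {
        return (verifyChecksum || isClientVerifyChecksum) ? Kind.CRC32 : Kind.NULL;
    }

    // Bytes that would have to be read from the meta file for numChunks chunks.
    // With Kind.NULL this is 0, so the meta file is never touched.
    static int checksumBytesToRead(Kind kind, int numChunks) {
        return kind.checksumSize * numChunks;
    }
}
```

With both flags false, checksumBytesToRead(choose(false, false), n) is 0 for 
any n, which is the property that lets the DN skip the seek into the checksum 
file.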


The patch I submitted contains these changes.

> DataNode reads checksums even if client does not need them
> ----------------------------------------------------------
>
>                 Key: HDFS-3429
>                 URL: https://issues.apache.org/jira/browse/HDFS-3429
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3429-0.20.2.patch, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode 
> reads them anyway and sends them over the wire. This means that performance 
> improvements like HBase's application-level checksums don't have much benefit 
> when reading through the datanode, since the DN is still causing seeks into 
> the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
