[ https://issues.apache.org/jira/browse/HADOOP-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3328:
-------------------------------------

     Description: 
Currently all the datanodes in the DFS write pipeline verify the checksum. Since the 
current protocol includes acks from the datanodes, an ack from the last node 
could also serve as verification that the checksum is OK. In that sense, only the last 
datanode needs to verify the checksum. Based on [this 
comment|http://issues.apache.org/jira/browse/HADOOP-1702?focusedCommentId=12575553#action_12575553]
 from HADOOP-1702, CPU consumption might go down by another 25-30% (4/14) after 
HADOOP-1702. 

This would also make it easier to use transferTo() and transferFrom() on 
intermediate datanodes, since they would not need to look at the data.
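The idea above can be sketched as follows. This is a hypothetical illustration, not Hadoop's actual DataNode code: the class and method names (PipelineNodeSketch, receivePacket) are invented for the example, and a plain CRC32 stands in for HDFS's block checksums. Intermediate nodes forward the packet without touching the data (which is what makes transferTo()/transferFrom() usable there); only the last node recomputes and compares the checksum, and its ack flowing back through the pipeline vouches for the data.

```java
import java.util.zip.CRC32;

// Hypothetical sketch of the proposed pipeline behavior; names are
// illustrative, not Hadoop's real API.
public class PipelineNodeSketch {

    // Returns whether the packet is acceptable at this pipeline position.
    static boolean receivePacket(byte[] data, long expectedCrc, boolean isLastNode) {
        if (!isLastNode) {
            // Intermediate node: forward the bytes verbatim, no checksum
            // work, so the data never needs to enter user space.
            return true;
        }
        // Last node: actually recompute and compare the checksum. Its ack,
        // relayed back up the pipeline, tells the client the data is good.
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue() == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        long goodCrc = crc.getValue();

        // Intermediate nodes accept without verifying, even a bad CRC.
        System.out.println(receivePacket(data, goodCrc + 1, false));
        // The last node accepts a matching CRC...
        System.out.println(receivePacket(data, goodCrc, true));
        // ...and rejects a corrupted one.
        System.out.println(receivePacket(data, goodCrc + 1, true));
    }
}
```

In the real protocol a failed check at the last node would surface as a failed ack rather than a boolean, but the division of labor is the same: one verifier per pipeline instead of one per node.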


    Hadoop Flags: [Reviewed]

+1 Looks good. 

> DFS write pipeline : only the last datanode needs to verify checksum
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3328
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3328
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>         Attachments: HADOOP-3328.patch, HADOOP-3328.patch
>
>
> Currently all the datanodes in the DFS write pipeline verify the checksum. Since 
> the current protocol includes acks from the datanodes, an ack from the last node 
> could also serve as verification that the checksum is OK. In that sense, only the 
> last datanode needs to verify the checksum. Based on [this 
> comment|http://issues.apache.org/jira/browse/HADOOP-1702?focusedCommentId=12575553#action_12575553]
>  from HADOOP-1702, CPU consumption might go down by another 25-30% (4/14) 
> after HADOOP-1702. 
> This would also make it easier to use transferTo() and transferFrom() on 
> intermediate datanodes, since they would not need to look at the data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
