DFS write pipeline : only the last datanode needs to verify checksum
--------------------------------------------------------------------
Key: HADOOP-3328
URL: https://issues.apache.org/jira/browse/HADOOP-3328
Project: Hadoop Core
Issue Type: Improvement
Components: dfs
Affects Versions: 0.16.0
Reporter: Raghu Angadi
Currently, every datanode in the DFS write pipeline verifies the checksum. Since
the current protocol already includes acks from the datanodes, an ack from the
last node could also serve as verification that the checksum is OK. In that
sense, only the last datanode needs to verify the checksum. Based on
[this comment|http://issues.apache.org/jira/browse/HADOOP-1702?focusedCommentId=12575553#action_12575553]
from HADOOP-1702, CPU consumption might go down by another 25-30% (4/14) after
HADOOP-1702.
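
To make the idea concrete, here is a minimal sketch of a pipeline in which only the last node verifies. It is not the DataNode code: the class PipelineSketch, its methods, and the use of a single CRC32 over the whole packet are all assumptions made for illustration.

```java
import java.util.zip.CRC32;

/** Hypothetical illustration only; not Hadoop's DataNode implementation. */
final class PipelineSketch {

    /** Computes the CRC32 that the client would send along with a packet. */
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /**
     * Simulates passing one packet through nodeCount datanodes.
     * corruptAt is the index of the node that receives a corrupted copy
     * (one flipped bit), or -1 for a clean transfer.
     * Returns the ack of the last node: true if its checksum check passed.
     */
    static boolean sendThroughPipeline(byte[] data, long expected,
                                       int nodeCount, int corruptAt) {
        byte[] inFlight = data.clone();
        for (int node = 0; node < nodeCount; node++) {
            if (node == corruptAt && inFlight.length > 0) {
                inFlight[0] ^= 0x01; // simulate corruption on the hop into this node
            }
            boolean isLast = (node == nodeCount - 1);
            if (isLast) {
                // Only the last node pays the CPU cost of verification.
                return checksum(inFlight) == expected;
            }
            // Intermediate nodes store and forward without verifying.
        }
        return false; // an empty pipeline cannot ack anything
    }
}
```

Because corruption introduced at any earlier hop is still present when the data reaches the end of the pipeline, the last node's check covers the upstream hops as well. In this sketch the ack says only that something went wrong, not at which node.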
Also, this would make it easier to use transferTo() and transferFrom() on
intermediate datanodes, since they would no longer need to look at the data.
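
As a rough sketch of what that could look like, the snippet below forwards a file's bytes to a downstream channel with FileChannel.transferTo(), so the data is never read into a buffer in this code. The class ForwardSketch and the method forward are hypothetical names, and using a local file as the source is an assumption made to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; not Hadoop's DataNode implementation. */
final class ForwardSketch {

    /**
     * Copies all bytes of source to downstream without inspecting them.
     * Returns the number of bytes transferred.
     */
    static long forward(Path source, WritableByteChannel downstream) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop.
                long sent = in.transferTo(position, size - position, downstream);
                if (sent <= 0) {
                    break; // no progress; give up rather than spin
                }
                position += sent;
            }
            return position;
        }
    }
}
```

Nothing in this path touches the bytes themselves, so it only works on a node that does not have to compute a checksum over them.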