[ https://issues.apache.org/jira/browse/HADOOP-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595354#action_12595354 ]

Raghu Angadi commented on HADOOP-1702:
--------------------------------------

Thanks for the review Hairong.

- #1: Interesting suggestion. It would be a protocol change that affects other 
datanode transfers as well, such as reads. We rarely send partial packets 
(mainly for fsync), and checksum data is less than one percent of the packet, 
so I hope it is acceptable for this patch. Note that this is not a memory copy 
that did not exist before (previously all the data was copied).
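For a sense of scale, the "less than one percent" figure follows from the default chunk and checksum sizes; a rough calculation (the 64 KB packet payload is an assumption, not taken from the patch):

```java
public class ChecksumOverhead {
    public static void main(String[] args) {
        int bytesPerChecksum = 512;    // io.bytes.per.checksum default
        int checksumSize = 4;          // a CRC32 checksum is 4 bytes per chunk
        int packetDataLen = 64 * 1024; // assumed packet payload size

        int chunks = packetDataLen / bytesPerChecksum; // 128 chunks per packet
        int checksumBytes = chunks * checksumSize;     // 512 bytes of checksums
        double fraction = (double) checksumBytes / (packetDataLen + checksumBytes);
        System.out.printf("checksum overhead: %.2f%%%n", fraction * 100);
    }
}
```

With these sizes the checksums are 512 bytes out of a 66,048-byte packet, well under one percent.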

- #2: Yes, it is a change from the previous behavior. Before this patch it 
didn't matter, since we handled 512 bytes at a time. The receiving datanode 
verifies the checksum anyway. Checking the checksum after forwarding the data 
downstream (theoretically) reduces latency; this is the same reason a datanode 
first sends the data to the mirror and only then stores it locally.
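The forward-first ordering can be sketched as below. This is a simplified illustration, not the actual BlockReceiver code; the class and method names are made up for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Sketch of the forward-first ordering: a packet goes to the downstream
// mirror before the local checksum check and disk write, so downstream
// nodes are not stalled behind local work. Names are illustrative.
public class PacketForwarder {
    private final OutputStream mirrorOut; // stream to the next datanode
    private final OutputStream diskOut;   // local block file

    public PacketForwarder(OutputStream mirrorOut, OutputStream diskOut) {
        this.mirrorOut = mirrorOut;
        this.diskOut = diskOut;
    }

    public void receivePacket(byte[] data, long expectedCrc) throws IOException {
        // 1. Forward downstream first: keeps pipeline latency low.
        mirrorOut.write(data);
        mirrorOut.flush();

        // 2. Only then verify the checksum and write locally. A corrupt
        // packet is still caught here, and each downstream receiver
        // verifies independently as well.
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != expectedCrc) {
            throw new IOException("checksum mismatch");
        }
        diskOut.write(data);
    }

    public static void main(String[] args) throws IOException {
        byte[] packet = "hello".getBytes();
        CRC32 crc = new CRC32();
        crc.update(packet, 0, packet.length);

        ByteArrayOutputStream mirror = new ByteArrayOutputStream();
        ByteArrayOutputStream disk = new ByteArrayOutputStream();
        new PacketForwarder(mirror, disk).receivePacket(packet, crc.getValue());
        System.out.println(mirror.size() + " " + disk.size());
    }
}
```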

- #3: Sure.
- #4: Yes.

> Reduce buffer copies when data is written to DFS
> ------------------------------------------------
>
>                 Key: HADOOP-1702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1702
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.18.0
>
>         Attachments: HADOOP-1702.patch, HADOOP-1702.patch, HADOOP-1702.patch, 
> HADOOP-1702.patch, HADOOP-1702.patch, HADOOP-1702.patch
>
>
> HADOOP-1649 adds extra buffering to improve write performance.  The following 
> diagram shows the buffers, marked with (numbers). Each extra buffer adds an 
> extra copy, since most of our read()/write()s match io.bytes.per.checksum, 
> which is much smaller than the buffer size.
> {noformat}
>        (1)                 (2)          (3)                 (5)
>    +---||----[ CLIENT ]---||----<>-----||---[ DATANODE ]---||--<>-> to Mirror 
>  
>    | (buffer)                  (socket)           |  (4)
>    |                                              +--||--+
>  =====                                                    |
>  =====                                                  =====
>  (disk)                                                 =====
> {noformat}
> Currently, the loops that read and write block data handle one checksum chunk 
> at a time. By reading multiple chunks at a time, we can remove buffers (1), 
> (2), (3), and (5). 
> Similarly, some copies can be reduced when clients read data from the DFS.
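The multi-chunk loop the description proposes can be sketched as follows. This is an illustrative sketch, not the patch's code; the buffer and chunk sizes are assumptions:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of reading many checksum chunks per read() call instead of one.
// Chunk-at-a-time I/O forces a read() per 512 bytes; a packet-sized
// buffer amortizes the call (and any intermediate buffered-stream copy)
// across many chunks. Sizes here are illustrative.
public class MultiChunkRead {
    static final int BYTES_PER_CHECKSUM = 512; // one checksum chunk
    static final int CHUNKS_PER_READ = 128;    // read 64 KB at once

    static int copyChunked(InputStream in) throws IOException {
        byte[] buf = new byte[BYTES_PER_CHECKSUM * CHUNKS_PER_READ];
        int reads = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            reads++;
            // verify/forward the n bytes as (n / BYTES_PER_CHECKSUM)
            // checksum chunks here, with no per-chunk read() call
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        byte[] block = new byte[256 * 1024]; // 256 KB of block data
        int reads = copyChunked(new ByteArrayInputStream(block));
        System.out.println(reads); // 4 reads instead of 512 chunk-sized reads
    }
}
```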

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.