[ https://issues.apache.org/jira/browse/HADOOP-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575553#action_12575553 ]

rangadi edited comment on HADOOP-1702 at 3/5/08 8:25 PM:
--------------------------------------------------------------

Test results show a *30%* improvement in DataNode CPU with the patch. I think it
makes sense. Based on the diagram in the issue description, before this patch with
replication of 3 the data is copied 6 + 6 + 4 times, and with this patch it is
3 + 3 + 2. Each of these datanodes verifies the CRC. Approximating the cost of
checksumming as twice that of a memory copy, we get (8+6)/(14+6) == 70%. If we
increase the size of the checksum chunk, the cost of the CRC goes down; it would
be 68% with a factor of 1.5 for the CRC.
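
Spelling out the arithmetic behind the 70% and 68% figures (my reading of the
second term in each sum: the CRC cost, i.e. 3 datanodes times the CRC factor):

{noformat}
CRC ~ 2.0 memory copies : (8 + 3*2.0) / (14 + 3*2.0) = 14.0 / 20.0  = 70%
CRC ~ 1.5 memory copies : (8 + 3*1.5) / (14 + 3*1.5) = 12.5 / 18.5 ~= 68%
{noformat}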

Test Setup : three instances of 'dd if=/dev/zero 4Gb | hadoop -put - 4Gb'. More
importantly, the DataNode was modified to write the data to '/dev/null' instead of
the block file; otherwise I could not isolate the test from disk activity. The
cluster has 3 datanodes. The clients, Namenode, and datanodes are all running
on the same node. The test was CPU bound.

CPU measurement : Linux reports a process's CPU in /proc/<pid>/stat; the 14th entry
is user CPU and the 15th is kernel CPU. I think these are specified in jiffies.
Like most things with the Linux kernel, these are approximations but reasonably
dependable in large numbers. The numbers reported are the sum of CPU for the
three datanodes.
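
For reference, a minimal sketch (not part of the patch or of the test harness;
field positions are per proc(5)) of pulling those two fields out of
/proc/<pid>/stat in Java:

{noformat}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Prints user and kernel CPU, in jiffies, for the pid given as args[0].
public class ProcStatCpu {
  public static void main(String[] args) throws IOException {
    BufferedReader in = new BufferedReader(
        new FileReader("/proc/" + args[0] + "/stat"));
    String stat = in.readLine();
    in.close();
    // Field 2 (the command name) is wrapped in parentheses and may contain
    // spaces, so split only the text after the closing ')'.
    String[] fields =
        stat.substring(stat.lastIndexOf(')') + 1).trim().split("\\s+");
    // fields[0] is field 3 (state), so utime (14th) and stime (15th) land at
    // offsets 11 and 12.
    long user = Long.parseLong(fields[11]);
    long kernel = Long.parseLong(fields[12]);
    System.out.println("user=" + user + " jiffies, kernel=" + kernel + " jiffies");
  }
}
{noformat}

Jiffies convert to seconds by dividing by the clock tick rate (USER_HZ, typically
100; see 'getconf CLK_TCK').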

Below, 'u' and 'k' are user and kernel CPU in thousands of jiffies.

|| Test || Run 1 || Run 2 || Run 3 || Avg Total Cpu || Avg Time ||
| Trunk* | 8.60u 2.52k 372s | 8.36u 2.48k 368s | 8.39u 2.40k 368s | *10.95* | 369s |
| Trunk + patch* | 5.61u 2.22k 289s | 5.38u 2.16k 296s | 5.57u 2.25k 289s | *7.73 (70%)* | 291s (79%) |
{{*}} : datanodes write data to /dev/null.

Currently, the DFSIO benchmark shows a dip in write bandwidth. I am still looking into it.



> Reduce buffer copies when data is written to DFS
> ------------------------------------------------
>
>                 Key: HADOOP-1702
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1702
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-1702.patch
>
>
> HADOOP-1649 adds extra buffering to improve write performance.  The following 
> diagram shows buffers as pointed to by (numbers). Each extra buffer adds an 
> extra copy since most of our read()/write()s match io.bytes.per.checksum, 
> which is much smaller than the buffer size.
> {noformat}
>        (1)                 (2)          (3)                 (5)
>    +---||----[ CLIENT ]---||----<>-----||---[ DATANODE ]---||--<>-> to Mirror
>    | (buffer)                  (socket)           |  (4)
>    |                                              +--||--+
>  =====                                                    |
>  =====                                                  =====
>  (disk)                                                 =====
> {noformat}
> Currently, the loops that read and write block data handle one checksum chunk at 
> a time. By reading multiple chunks at a time, we can remove buffers (1), (2), 
> (3), and (5). 
> Similarly some copies can be reduced when clients read data from the DFS.
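
As an illustration of the multi-chunk read described above, a hypothetical sketch
(not the actual DataNode code or wire protocol; the class, method, and
chunks-per-read value are made up): one large read spans many checksum chunks,
and each bytes-per-checksum slice is checksummed out of that buffer instead of
doing one read()/write() per chunk.

{noformat}
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class ChunkedCopy {
  // Reads 'len' bytes from 'in', checksumming every bytesPerChecksum-sized
  // slice, but issuing one read for many chunks instead of one per chunk.
  static void copyChunks(DataInputStream in, int bytesPerChecksum, long len)
      throws IOException {
    final int CHUNKS_PER_READ = 64;                  // illustrative value
    byte[] buf = new byte[bytesPerChecksum * CHUNKS_PER_READ];
    CRC32 crc = new CRC32();
    while (len > 0) {
      int n = (int) Math.min((long) buf.length, len);
      in.readFully(buf, 0, n);
      for (int off = 0; off < n; off += bytesPerChecksum) {
        int chunk = Math.min(bytesPerChecksum, n - off);
        crc.reset();
        crc.update(buf, off, chunk);
        long sum = crc.getValue();
        // ... compare 'sum' against / send it with this chunk downstream ...
      }
      len -= n;
    }
  }
}
{noformat}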

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
