[ https://issues.apache.org/jira/browse/HADOOP-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12587285#action_12587285 ]

Raghu Angadi commented on HADOOP-3164:
--------------------------------------


The following table shows 'dfs -cat' of 10 GB of data. This is a disk-bound test, 
and CPU is measured from /proc/pid/stat. io.file.buffer.size is 128k. This is a 
cluster with a single datanode, and the client and datanode are on the same 
machine. The three fields reported for each run are user CPU, kernel CPU, and 
wall-clock time. "Avg Total Cpu" is the sum of user and kernel CPU for the DataNode process.

|| Test || Bound by || Run 1 || Run 2 || Run 3 || Cpu % || Avg Total Cpu ||
| Trunk | Disk | 2589u 5208k 253s | 2659u 5162k 265s | 2827u 5341k 328s | 100% | *7929* |
| Trunk + patch | Disk | 474u 1038k 228s | 477u 1031k 232s | 611u 1189k 301s | 20% | *1607* |

This shows the DataNode takes about 80% less CPU. 

Also, since we don't allocate any user buffer, we could invoke transferTo() 
to send even larger amounts of data at a time. I haven't experimented with 
larger buffer sizes.
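For readers unfamiliar with the technique: transferTo() asks the kernel to copy file pages straight to the target channel, skipping the user-space read/write buffer that the old path paid for. A minimal sketch of such a send loop is below; this is an illustration, not the actual DataNode code. The class and method names are hypothetical, and in the DataNode the target would be the client's SocketChannel (any WritableByteChannel works).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical sketch of a transferTo()-based send loop; not the patch itself.
public class TransferToSketch {
    // Sends [offset, offset+length) of a block file to the target channel.
    // The kernel copies the pages directly, so no user buffer is allocated.
    static void sendBlock(FileChannel blockFile, WritableByteChannel out,
                          long offset, long length) throws IOException {
        long position = offset;
        long remaining = length;
        while (remaining > 0) {
            // transferTo() may transfer fewer bytes than requested,
            // so loop until the whole range has been sent.
            long sent = blockFile.transferTo(position, remaining, out);
            position += sent;
            remaining -= sent;
        }
    }
}
```

Because the copy happens in the kernel, the per-call chunk size is mostly a question of how much data we are willing to hand off at once, which is why larger chunks than io.file.buffer.size are worth trying.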

> Use FileChannel.transferTo() when data is read from DataNode.
> -------------------------------------------------------------
>
>                 Key: HADOOP-3164
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3164
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>         Attachments: HADOOP-3164.patch, HADOOP-3164.patch, HADOOP-3164.patch
>
>
> HADOOP-2312 talks about using FileChannel's 
> [{{transferTo()}}|http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)]
>  and 
> [{{transferFrom()}}|http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)]
>  in DataNode. 
> At the time, DataNode neither used NIO sockets nor wrote large chunks of 
> contiguous block data to the socket. Hadoop 0.17 does both when data is served 
> to clients (and other datanodes). I am planning to try using transferTo() in 
> trunk. This might reduce DataNode's CPU by another 50% or more.
> Once HADOOP-1702 is committed, we can look into using transferFrom().

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
