[ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855205#action_12855205
 ] 

bc Wong commented on HDFS-941:
------------------------------

I replaced the size-of-one cache with a more generic cache, which is also a 
global shared cache. There is a new TestParallelRead, which tests concurrent 
use of a single DFSInputStream by multiple readers. There's a clear speed 
difference between trunk and the patched build. Each thread does 1024 reads.
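
For readers without the patch in front of them, the idea of a "generic, global 
shared cache" can be sketched roughly as below. This is only an illustration 
and not the code in the patch: the class name ConnectionCacheSketch, the String 
address key, the capacity parameter, and the oldest-first eviction policy are 
all assumptions made for the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedList;

/**
 * Hypothetical sketch (not from the patch): a bounded, thread-safe cache of
 * idle connections shared by all readers and keyed by remote address.
 */
final class ConnectionCacheSketch<C extends Closeable> {
    private static final class Entry<C> {
        final String address;
        final C conn;
        Entry(String address, C conn) {
            this.address = address;
            this.conn = conn;
        }
    }

    private final int capacity;
    // Idle connections, oldest first.
    private final LinkedList<Entry<C>> idle = new LinkedList<>();

    ConnectionCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Removes and returns an idle connection to the address, or null if none is cached. */
    synchronized C take(String address) {
        Iterator<Entry<C>> it = idle.iterator();
        while (it.hasNext()) {
            Entry<C> e = it.next();
            if (e.address.equals(address)) {
                it.remove();
                return e.conn;
            }
        }
        return null;
    }

    /** Returns a connection to the cache; closes the oldest entries if over capacity. */
    synchronized void put(String address, C conn) {
        idle.addLast(new Entry<>(address, conn));
        while (idle.size() > capacity) {
            closeQuietly(idle.removeFirst().conn);
        }
    }

    synchronized int size() {
        return idle.size();
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful to do for an idle connection that fails to close.
        }
    }
}
```

A reader would call take() before opening a new connection and put() once an 
operation has left the stream in a reusable state. A size-of-one cache is the 
special case capacity = 1.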

Trunk:
{noformat}
Report: 4 threads read 236953 KB (across 1 file(s)) in 5.879s; average 
40304.98384078925 KB/s
Report: 4 threads read 238873 KB (across 1 file(s)) in 5.063s; average 
47180.13035749556 KB/s
Report: 4 threads read 236068 KB (across 1 file(s)) in 5.93s; average 
39809.10623946037 KB/s
Report: 16 threads read 942666 KB (across 1 file(s)) in 13.524s; average 
69703.19432120674 KB/s
Report: 16 threads read 947015 KB (across 1 file(s)) in 13.401s; average 
70667.48750093277 KB/s
Report: 16 threads read 948768 KB (across 1 file(s)) in 12.932s; average 
73365.91401175379 KB/s
Report: 8 threads read 469529 KB (across 2 file(s)) in 5.436s; average 
86373.98822663723 KB/s
Report: 8 threads read 455428 KB (across 2 file(s)) in 5.363s; average 
84920.38038411336 KB/s
Report: 8 threads read 469005 KB (across 2 file(s)) in 5.713s; average 
82094.34622790127 KB/s
{noformat}

Patched:
{noformat}
Report: 4 threads read 236845 KB (across 1 file(s)) in 3.612s; average 
65571.70542635658 KB/s
Report: 4 threads read 238803 KB (across 1 file(s)) in 4.371s; average 
54633.49347975291 KB/s
Report: 4 threads read 240241 KB (across 1 file(s)) in 4.395s; average 
54662.34357224119 KB/s
Report: 16 threads read 938652 KB (across 1 file(s)) in 9.044s; average 
103787.26227333037 KB/s
Report: 16 threads read 943999 KB (across 1 file(s)) in 8.59s; average 
109895.11059371362 KB/s
Report: 16 threads read 938546 KB (across 1 file(s)) in 9.081s; average 
103352.71445876005 KB/s
Report: 8 threads read 478534 KB (across 2 file(s)) in 3.376s; average 
141745.85308056872 KB/s
Report: 8 threads read 467412 KB (across 2 file(s)) in 3.623s; average 
129012.42064587357 KB/s
Report: 8 threads read 475349 KB (across 2 file(s)) in 3.49s; average 
136203.15186246418 KB/s
{noformat}
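
To put a number on the difference, the Report lines above can be reduced to a 
mean throughput per configuration and a patched/trunk ratio. The sketch below 
does that arithmetic using only the KB and seconds figures quoted above; the 
class and method names are made up for the example. The ratios come out at 
roughly 1.4x (4 threads), 1.5x (16 threads) and 1.6x (8 threads, 2 files).

```java
/** Hypothetical helper: recomputes throughput from the Report lines quoted above. */
final class ThroughputCheck {
    // KB read and elapsed seconds per run, copied from the reports.
    static final long[] TRUNK_4T_KB = {236953, 238873, 236068};
    static final double[] TRUNK_4T_SEC = {5.879, 5.063, 5.93};
    static final long[] PATCHED_4T_KB = {236845, 238803, 240241};
    static final double[] PATCHED_4T_SEC = {3.612, 4.371, 4.395};

    static final long[] TRUNK_16T_KB = {942666, 947015, 948768};
    static final double[] TRUNK_16T_SEC = {13.524, 13.401, 12.932};
    static final long[] PATCHED_16T_KB = {938652, 943999, 938546};
    static final double[] PATCHED_16T_SEC = {9.044, 8.59, 9.081};

    static final long[] TRUNK_8T_KB = {469529, 455428, 469005};
    static final double[] TRUNK_8T_SEC = {5.436, 5.363, 5.713};
    static final long[] PATCHED_8T_KB = {478534, 467412, 475349};
    static final double[] PATCHED_8T_SEC = {3.376, 3.623, 3.49};

    /** Mean of the per-run throughput in KB/s. */
    static double meanKbPerSec(long[] kb, double[] seconds) {
        double sum = 0.0;
        for (int i = 0; i < kb.length; i++) {
            sum += kb[i] / seconds[i];
        }
        return sum / kb.length;
    }

    /** Ratio of patched mean throughput to trunk mean throughput. */
    static double speedup(long[] trunkKb, double[] trunkSec,
                          long[] patchedKb, double[] patchedSec) {
        return meanKbPerSec(patchedKb, patchedSec) / meanKbPerSec(trunkKb, trunkSec);
    }

    public static void main(String[] args) {
        System.out.printf("4 threads, 1 file:  %.2fx%n",
            speedup(TRUNK_4T_KB, TRUNK_4T_SEC, PATCHED_4T_KB, PATCHED_4T_SEC));
        System.out.printf("16 threads, 1 file: %.2fx%n",
            speedup(TRUNK_16T_KB, TRUNK_16T_SEC, PATCHED_16T_KB, PATCHED_16T_SEC));
        System.out.printf("8 threads, 2 files: %.2fx%n",
            speedup(TRUNK_8T_KB, TRUNK_8T_SEC, PATCHED_8T_KB, PATCHED_8T_SEC));
    }
}
```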

bq. The edits to the docs in DataNode.java are good - if possible they should 
probably move into HDFS-1001 though, no?
The addition to the docs doesn't apply to HDFS-1001, in which the DataXceiver 
still actively closes all sockets after each use.

Todd, the new patch addresses the rest of your comments.


> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.

