[ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977544#action_12977544
 ] 

Jay Booth commented on HDFS-918:
--------------------------------

Hey all, sorry for the slow response, been swamped with the new year and all.

RE: unit tests: at one point the patch was passing all tests. I'm not sure 
whether the tests or the patch changed since then, but I can take a look.

RE: 0.23: I can look at forward-porting this again, but a lot of changes have 
gone in since the patch was last updated.

@stack, were you testing the "only pooling" patch or the "with full 
multiplexing" patch?  

"Only pooling" would be much simpler to forward port, although I do think that 
the full multiplexing patch is pretty worthwhile.  Aside from the 
small-but-significant performance gain, it was IMO much better factoring to 
have the DN-side logic all encapsulated in a Connection object which has 
sendPacket() repeatedly called, rather than a giant procedural loop that goes 
down and back up through several classes.  The architecture also made keepalive 
pretty straightforward: just throw that connection back into a listening pool 
when done, and make the corresponding changes on the client side.  But I guess 
that logic's been revised now anyway, so it'd be a significant piece of work to 
bring it all back up to date.
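
To make the factoring above concrete, here is a rough Java sketch of the idea, 
not code from any of the attached patches.  Everything in it is a stand-in 
chosen for illustration: the class name MultiplexSketch, the fields of 
Connection, the 4-byte PACKET_SIZE, the idle queue playing the role of the 
listening pool, and the use of java.nio.channels.Pipe in place of a real 
SocketChannel.  It also calls sendPacket() directly on the selector thread and 
leaves out the small thread pool, and it assumes payloads small enough to fit 
in the pipe buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

class MultiplexSketch {
    static final int PACKET_SIZE = 4; // tiny illustrative packet size (assumption)

    /** Hypothetical per-read state: one object per client connection. */
    static final class Connection {
        final Pipe.SinkChannel channel; // stands in for a SocketChannel
        private final ByteBuffer block; // unsent bytes of the requested block

        Connection(Pipe.SinkChannel channel, byte[] blockData) {
            this.channel = channel;
            this.block = ByteBuffer.wrap(blockData);
        }

        /** Writes at most one packet; returns true while data remains. */
        boolean sendPacket() throws IOException {
            ByteBuffer packet = block.duplicate();
            packet.limit(Math.min(block.position() + PACKET_SIZE, block.limit()));
            int written = channel.write(packet); // may be partial when non-blocking
            block.position(block.position() + written);
            return block.hasRemaining();
        }
    }

    /** One selector drives every connection; finished ones go to idlePool. */
    static void serve(List<Connection> active, Queue<Connection> idlePool)
            throws IOException {
        try (Selector selector = Selector.open()) {
            for (Connection c : active) {
                c.channel.configureBlocking(false);
                c.channel.register(selector, SelectionKey.OP_WRITE, c);
            }
            int remaining = active.size();
            while (remaining > 0) {
                selector.select(); // block until some channel is writable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    Connection c = (Connection) key.attachment();
                    if (!c.sendPacket()) {
                        key.cancel();    // stop selecting for writes
                        idlePool.add(c); // keepalive: park for possible reuse
                        remaining--;
                    }
                }
            }
        }
    }

    /** Demo helper: serves each payload over its own pipe, returns what was read. */
    static List<String> demo(String... payloads) throws IOException {
        List<Pipe> pipes = new ArrayList<>();
        List<Connection> active = new ArrayList<>();
        for (String p : payloads) {
            Pipe pipe = Pipe.open();
            pipes.add(pipe);
            active.add(new Connection(pipe.sink(), p.getBytes(StandardCharsets.UTF_8)));
        }
        Queue<Connection> idlePool = new ArrayDeque<>();
        serve(active, idlePool);
        if (idlePool.size() != payloads.length) {
            throw new IllegalStateException("not every connection was parked");
        }
        List<String> received = new ArrayList<>();
        for (int i = 0; i < pipes.size(); i++) {
            int expected = payloads[i].getBytes(StandardCharsets.UTF_8).length;
            ByteBuffer in = ByteBuffer.allocate(expected);
            while (in.hasRemaining() && pipes.get(i).source().read(in) >= 0) {
                // keep reading until the whole payload has arrived
            }
            received.add(new String(in.array(), StandardCharsets.UTF_8));
            pipes.get(i).sink().close();
            pipes.get(i).source().close();
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo("hello world", "abc"));
    }
}
```

The point of the shape is that all per-read state lives in Connection, so the 
loop in serve() never needs to know how a packet is built, and keepalive is 
just a matter of which queue a finished connection lands in.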

> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-918
>                 URL: https://issues.apache.org/jira/browse/HDFS-918
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Jay Booth
>            Assignee: Jay Booth
>             Fix For: 0.22.0
>
>         Attachments: hbase-hdfs-benchmarks.ods, hdfs-918-20100201.patch, 
> hdfs-918-20100203.patch, hdfs-918-20100211.patch, hdfs-918-20100228.patch, 
> hdfs-918-20100309.patch, hdfs-918-branch20-append.patch, 
> hdfs-918-branch20.2.patch, hdfs-918-pool.patch, hdfs-918-TRUNK.patch, 
> hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXceiverServer allocates a new thread 
> per request, which must allocate its own buffers and leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small threadpool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
