[ https://issues.apache.org/jira/browse/HDFS-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Caspole updated HDFS-2243:
-------------------------------
Fix Version/s: 0.24.0
Status: Patch Available (was: Open)
> DataXceiver per accept seems to be a bottleneck in HBase/YCSB test
> ------------------------------------------------------------------
>
> Key: HDFS-2243
> URL: https://issues.apache.org/jira/browse/HDFS-2243
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.23.0
> Environment: Using Fedora 14 on a quad core phenom system
> Reporter: Eric Caspole
> Priority: Minor
> Fix For: 0.24.0
>
> Attachments: HDFS-2234-branch-0.20-append.patch,
> HDFS-2243-0.23-110909.txt, datanode-perf-110808.gif
>
>
> I am running the YCSB benchmark against HBase, sometimes against a single
> node and sometimes against a cluster of 6 systems. As the load increases into
> the thousands of TPS, especially on the single node, the datanode shows very
> high system time and appears to be bottlenecked by how fast it can create
> threads to handle new connections in DataXceiverServer.run. "perf top" shows
> the process spending about 12% of its time in pthread_create, and hprof
> profiles show tens of thousands of threads created in just a few minutes of
> test execution.
> Does anyone else observe this bottleneck? Is there a major obstacle to using
> a thread pool of DataXceivers in this situation?
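
To make the thread pool question concrete, here is a minimal Java sketch that contrasts one new thread per accepted connection with a shared pool. It does not reproduce the actual DataXceiverServer or DataXceiver code. The class XceiverPoolSketch, its methods, and the empty handle() stand-in are all hypothetical names chosen for illustration, and the pool size of 8 is an arbitrary assumption. It only counts how many threads each strategy creates for the same number of simulated connections.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: not the HDFS DataXceiverServer code.
public class XceiverPoolSketch {

    // Counts every thread it is asked to create.
    static final class CountingFactory implements ThreadFactory {
        final AtomicInteger created = new AtomicInteger();

        @Override
        public Thread newThread(Runnable r) {
            created.incrementAndGet();
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        }
    }

    // Stand-in for the per-connection work (assumption: real work does block I/O).
    static void handle() {
    }

    // Thread per accept: each simulated connection pays for a new thread.
    public static int threadPerAccept(int connections) {
        CountingFactory factory = new CountingFactory();
        try {
            for (int i = 0; i < connections; i++) {
                Thread t = factory.newThread(XceiverPoolSketch::handle);
                t.start();
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return factory.created.get();
    }

    // Pooled: connections are handed to a fixed set of reusable worker threads.
    public static int pooled(int connections, int poolSize) {
        CountingFactory factory = new CountingFactory();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize, factory);
        try {
            for (int i = 0; i < connections; i++) {
                pool.execute(XceiverPoolSketch::handle);
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return factory.created.get();
    }

    public static void main(String[] args) {
        int connections = 1000;
        System.out.println("thread-per-accept created: " + threadPerAccept(connections));
        System.out.println("pooled (size 8) created:   " + pooled(connections, 8));
    }
}
```

In this sketch the first strategy creates one thread per connection, while the pool never creates more threads than its configured size. Whether a bounded pool suits the datanode is a separate question: if the handlers hold long-lived blocking connections, a fixed pool could starve new clients, so an unbounded or cached pool that reuses idle threads may be the safer variant to measure first.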