[
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon updated HDFS-941:
-----------------------------
Attachment: hdfs-941.txt
Updated patch to fix failing tests:
- TestDFSClientRetries works by setting the max xceiver count to something very
low, like 2, and then hammering the datanode with many clients to make sure the
randomized backoff lets them all eventually succeed. With even a short
keepalive on the datanode side, the xceivers were occupied for too long.
I set the DN keepalive config to 0 for this test case and modified the DN code
so that a setting of 0 disables the keepalive behavior.
- TestNameNodeMetrics was looking at the cluster "load" (read: xceiver count)
as one of the metrics. This was therefore sensitive to timing, since it
depended on whether the DN heartbeated during the keepalive window or after
it had expired. I removed this assert since the other metrics already provide
good coverage.
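As a rough illustration of the keepalive change described above (this is not
the code in the attached patch), the sketch below simulates one xceiver serving
a connection. The class KeepaliveSketch, the methods serve and twoOps, and the
use of a BlockingQueue as a stand-in for a socket are all hypothetical names
and simplifications. With a keepalive of 0 the xceiver slot is released after a
single operation; with a positive keepalive the slot stays occupied while
waiting for a follow-up operation, which is what starved the clients in
TestDFSClientRetries.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not the actual datanode code from the patch. */
class KeepaliveSketch {

    /**
     * Serves operations arriving on one simulated connection and returns
     * how many were processed before the connection was closed.
     *
     * @param connection  stand-in for a socket: each element is one pending op
     * @param keepaliveMs how long to wait for a follow-up op; 0 disables reuse
     */
    public static int serve(BlockingQueue<String> connection, long keepaliveMs) {
        int processed = 0;
        // The first op is assumed to be already queued when we are called.
        String op = connection.poll();
        while (op != null) {
            processed++; // "process" the op
            if (keepaliveMs <= 0) {
                break; // keepalive disabled: free the xceiver slot right away
            }
            try {
                // Keepalive enabled: hold the slot and wait for another op.
                op = connection.poll(keepaliveMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        return processed;
    }

    /** Builds a simulated connection with two pending operations. */
    public static BlockingQueue<String> twoOps() {
        BlockingQueue<String> conn = new LinkedBlockingQueue<>();
        conn.add("READ_BLOCK");
        conn.add("READ_BLOCK");
        return conn;
    }

    public static void main(String[] args) {
        System.out.println("keepalive=0:  " + serve(twoOps(), 0));
        System.out.println("keepalive=50: " + serve(twoOps(), 50));
    }
}
```

In this sketch a keepalive of 0 processes one operation per connection and a
keepalive of 50 ms processes both queued operations on the same connection, at
the cost of holding the slot for up to 50 ms after the last one.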
> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node, hdfs client
> Affects Versions: 0.22.0
> Reporter: Todd Lipcon
> Assignee: bc Wong
> Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch,
> HDFS-941-3.patch, HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch,
> HDFS-941-6.patch, HDFS-941-6.patch, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one
> operation.
> In the case that an operation leaves the stream in a well-defined state (e.g., a
> client reads to the end of a block successfully) the same connection could be
> reused for a second operation. This should improve random read performance
> significantly.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira