[
https://issues.apache.org/jira/browse/HDFS-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476435#comment-13476435
]
Aaron T. Myers commented on HDFS-4049:
--------------------------------------
Patch looks really good to me. A few little comments on the test code. +1 once
these are addressed:
# In TestMultiThreadedHflush, the repl 1 test is in fact also using repl 3:
{code}
+ public void testMultipleHflushersRepl1() throws Exception {
+ doTestMultipleHflushers(3);
+ }
+
+ @Test
+ public void testMultipleHflushersRepl3() throws Exception {
+ doTestMultipleHflushers(3);
+ }
{code}
# I recommend changing the formal parameter name to "replication" or
"numDatanodes" instead of "i" in the following:
{code}
+ private void doTestMultipleHflushers(int i) throws Exception {
{code}
# Not changed by your patch, but I think that the constants
TestMultiThreadedHflush.numBlocks and TestMultiThreadedHflush.fileSize are
unused. Mind removing them?
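For illustration, the first two points could be addressed roughly as follows. This is only a sketch: the class name MultiThreadedHflushSketch and the recording list are hypothetical stand-ins, the JUnit @Test annotations are omitted so the snippet compiles on its own, and the real body of doTestMultipleHflushers is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiThreadedHflushSketch {
    // Hypothetical stand-in for the real test body: records the replication
    // factor each test asked for, so the wiring can be checked in isolation.
    public static final List<Integer> requested = new ArrayList<>();

    // In the real test class this method would carry @Test.
    public void testMultipleHflushersRepl1() throws Exception {
        doTestMultipleHflushers(1); // was 3 in the patch under review
    }

    // In the real test class this method would carry @Test.
    public void testMultipleHflushersRepl3() throws Exception {
        doTestMultipleHflushers(3);
    }

    // Parameter renamed from "i" to "replication", as suggested in the review.
    private void doTestMultipleHflushers(int replication) throws Exception {
        requested.add(replication);
    }
}
```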
In addition to the code review, I also tested the patch on a 4-node physical
cluster and confirmed that it works as expected.
Prior to HDFS-3721:
{noformat}
Finished in 58757ms
Latency quantiles (in microseconds):
50.00 %ile +/- 5.00%: 847
75.00 %ile +/- 2.50%: 954
90.00 %ile +/- 1.00%: 1074
95.00 %ile +/- 0.50%: 1203
99.00 %ile +/- 0.10%: 2073
{noformat}
Post HDFS-3721:
{noformat}
Finished in 1032220ms
Latency quantiles (in microseconds):
50.00 %ile +/- 5.00%: 1756
75.00 %ile +/- 2.50%: 41004
90.00 %ile +/- 1.00%: 41231
95.00 %ile +/- 0.50%: 41400
99.00 %ile +/- 0.10%: 42684
{noformat}
With the latest patch included here:
{noformat}
Finished in 58531ms
Latency quantiles (in microseconds):
50.00 %ile +/- 5.00%: 864
75.00 %ile +/- 2.50%: 970
90.00 %ile +/- 1.00%: 1096
95.00 %ile +/- 0.50%: 1237
99.00 %ile +/- 0.10%: 2131
{noformat}
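As a reading aid for the quantile tables above, here is a minimal sketch of how a latency percentile can be computed from raw samples. It uses the exact nearest-rank method, which is an assumption made for illustration: the "+/-" columns in the output suggest the benchmark uses an approximate estimator, and that estimator is not shown here. The class name and the sample values are hypothetical.

```java
import java.util.Arrays;

public class LatencyQuantilesSketch {

    /**
     * Exact nearest-rank percentile: the smallest sample such that at least
     * `percentile` percent of all samples are less than or equal to it.
     */
    public static long nearestRank(long[] samplesMicros, double percentile) {
        if (samplesMicros.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile <= 0.0 || percentile > 100.0) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = samplesMicros.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Synthetic latencies in microseconds, not taken from the runs above.
        long[] samples = {900, 850, 1200, 41000, 870, 950, 1000, 880, 910, 930};
        for (double p : new double[] {50, 75, 90, 95, 99}) {
            System.out.println(p + " %ile: " + nearestRank(samples, p));
        }
    }
}
```

A single slow outlier only shows up in the upper percentiles with this method, which is why the post-HDFS-3721 run stands out: there the roughly 41 ms values already appear at the 75th percentile, so a large share of hflush() calls were affected rather than a rare few.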
> hflush performance regression due to nagling delays
> ---------------------------------------------------
>
> Key: HDFS-4049
> URL: https://issues.apache.org/jira/browse/HDFS-4049
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node, performance
> Affects Versions: 3.0.0, 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hdfs-4049.txt, hdfs-4049.txt
>
>
> HDFS-3721 reworked the way that packets are mirrored through the pipeline in
> the datanode. This caused two write() calls where there used to be one, which
> interacts badly with nagling so that there are 40ms bubbles on hflush()
> calls. We didn't notice this in the tests because the hflush perf test only
> uses a single datanode.
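To make the two-writes-versus-one point in the description concrete, here is a minimal Java sketch. It is not the change made in the attached patch: CoalescedWriteSketch, CountingStream, and both send methods are hypothetical names, and the counting stream merely stands in for a socket's output stream so that the number of write calls reaching it can be observed.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CoalescedWriteSketch {

    /** Stand-in for a socket stream: counts the write calls that reach it. */
    public static class CountingStream extends OutputStream {
        public int writeCalls = 0;
        public long bytes = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            bytes += len;
        }
    }

    /**
     * Header and payload go out as two separate writes. On a real TCP socket
     * with Nagle's algorithm enabled, the second small write may be held back
     * until the first segment has been acknowledged.
     */
    public static void sendSeparately(OutputStream out, byte[] header, byte[] payload)
            throws IOException {
        out.write(header);
        out.write(payload);
        out.flush();
    }

    /** Both pieces are buffered and flushed once, so the sink sees one write. */
    public static void sendCoalesced(OutputStream out, byte[] header, byte[] payload)
            throws IOException {
        // The buffer is larger than header + payload, so nothing is flushed early.
        BufferedOutputStream buffered = new BufferedOutputStream(out, 64 * 1024);
        buffered.write(header);
        buffered.write(payload);
        buffered.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[16];
        byte[] payload = new byte[512];

        CountingStream separate = new CountingStream();
        sendSeparately(separate, header, payload);
        System.out.println("separate writes reaching the sink:  " + separate.writeCalls);

        CountingStream coalesced = new CountingStream();
        sendCoalesced(coalesced, header, payload);
        System.out.println("coalesced writes reaching the sink: " + coalesced.writeCalls);
    }
}
```

Another option on a real connection is to disable Nagle's algorithm with java.net.Socket.setTcpNoDelay(true). Either approach avoids having a small trailing write wait on an acknowledgement.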