[ 
https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044929#comment-14044929
 ] 

Chris Nauroth commented on HDFS-6591:
-------------------------------------

[~liulei.cn], thank you for the bug report.

[~xieliang007], nice work on the patch.  This is a pretty tricky one!  I think 
the logic is easier to follow now using {{CompletionService}} instead of 
{{CountDownLatch}}.

I have a few comments and questions.

# {{DFSInputStream#hedgedFetchBlockByteRange}}: I have a question on the 
following code, which runs when there is no need for a hedged read.  (The 
initial read completes before the hedged read timeout.)
{code}
          future = hedgedService.poll(dfsClient.getHedgedReadTimeout(),
              TimeUnit.MILLISECONDS);
          if (future != null) {
            future.get();
            return;
          }
{code}
The return value from {{future.get()}} is discarded, so it appears that we 
never copy to {{buf}} in this code path.  I only see a copy to {{buf}} inside 
the else path, when a hedged read actually happens.  Did I miss something?
# {{DFSHedgedReadMetrics}}: Is it actually useful to add the new metric for 
HedgedReadOpsLoopNum, or was it just helpful for catching this bug?  If it's 
not going to be useful in general, then maybe we should look for a different 
way to accommodate the test rather than publishing a new metric.
# {{TestPread}}: Let's set {{isHedgedRead = false}} in a JUnit {{After}} method 
instead of at the end of individual methods.  That way, if one test throws an 
exception and leaves the flag on by mistake, it won't disrupt the other tests 
in the same suite.  Also, it seems this was done wrong for 
{{testMaxOutHedgedReadPool}}, which is setting it to {{true}} at the end.  
Using an {{After}} method would prevent those kinds of problems.
# {{TestPread#testHedgedReadLoopTooManyTimes}}: Let's guarantee that {{output}} 
and {{input}} get closed.  We can put the cleanup in a finally block and call 
{{IOUtils#cleanup}}.

> while loop is executed tens of thousands of times  in Hedged  Read
> ------------------------------------------------------------------
>
>                 Key: HDFS-6591
>                 URL: https://issues.apache.org/jira/browse/HDFS-6591
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.4.0
>            Reporter: LiuLei
>            Assignee: Liang Xie
>         Attachments: HDFS-6591-v2.txt, HDFS-6591.txt, 
> LoopTooManyTimesTestCase.patch
>
>
> I download hadoop-2.4.1-rc1 code from 
> http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/,  I test the  Hedged  
> Read. I find the while loop in hedgedFetchBlockByteRange method is executed 
> tens of thousands of times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to