[ 
https://issues.apache.org/jira/browse/HDFS-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047945#comment-16047945
 ] 

Wei-Chiu Chuang commented on HDFS-11303:
----------------------------------------

Hi [~zhangchen], thanks for filing the jira and posting the patch.
I'd like to help reviewing the patch. The fix itself looks good. Here's my 
quick comments, mostly cosmetic:

1.  I am not so sure about the test: you mentioned the client had timeouts 
connecting to DNs, but the test throws ChecksumException -- it is not clear to 
me if client exhibit the same symptom in both scenarios. Perhaps you can use 
{{DFSClientFaultInjector.readFromDatanodeDelay}} to insert a delay instead?

2. Could you add test timeout limit?  For example a 10 second timeout: 
{{\@Test(timeout=100000)}}

3. It appears to me that the test expects to throw BlockMissingException.  
Instead of the following:
{code}
            } catch (BlockMissingException e) {
              assertTrue(true);
            }
{code} Would you mind update the test and use ExpectedException to assert that 
the exception is expected?

4. 
{code}
                if (true) {
                  System.out.println("-------------- throw Checksum Exception");
                  throw new ChecksumException("ChecksumException test", 100);
                }
{code}
Please remove {{if(true)}} and please use DFSClient.LOG instead of 
System.out.println for log printing

5. There's a slight code conflict due to HDFS-11708. Please rebase the patch.

> Hedged read might hang infinitely if read data from all DN failed 
> ------------------------------------------------------------------
>
>                 Key: HDFS-11303
>                 URL: https://issues.apache.org/jira/browse/HDFS-11303
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>         Attachments: HDFS-11303-001.patch, HDFS-11303-001.patch, 
> HDFS-11303-002.patch, HDFS-11303-002.patch
>
>
> Hedged read will read from a DN first, if timeout, then read other DNs 
> simultaneously.
> If read all DN failed, this bug will cause the future-list not empty(the 
> first timeout request left in list), and hang in the loop infinitely



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to