[ 
https://issues.apache.org/jira/browse/HDFS-15125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018286#comment-17018286
 ] 

Ahmed Hussein commented on HDFS-15125:
--------------------------------------

*(non-binding)*
[~Jim_Brennan], can we log a message in DataNodeTestUtils.waitForDiskError() 
when the wait fails? That would make the cause easier to spot when a test 
fails. For example, catch the exception and log a message before rethrowing:

{code:java}
  public static void waitForDiskError(final DataNode dn, FsVolumeSpi volume)
      throws Exception {
    LOG.info("Starting to wait for datanode to detect disk failure.");
    final long lastDiskErrorCheck = dn.getLastDiskErrorCheck();
    dn.checkDiskErrorAsync(volume);
    // Wait 10 seconds for checkDiskError thread to finish and discover volume
    // failures.
    try {
      GenericTestUtils.waitFor(new Supplier<Boolean>() {
        @Override
        public Boolean get() {
          return dn.getLastDiskErrorCheck() != lastDiskErrorCheck;
        }
      }, 100, 10000);
    } catch (Exception ex) {
      LOG.error("Timeout waiting for disk error on " + dn
              + ", volume: " + volume.getStorageID());
      throw ex;
    }
  }
{code}
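
To try the same catch, log, and rethrow pattern outside Hadoop, here is a 
minimal stand-alone sketch. The class WaitWithContext and both of its methods 
are hypothetical names made up for illustration. They only approximate what 
GenericTestUtils.waitFor() does and are not part of any Hadoop API.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical stand-alone sketch; not part of Hadoop. */
final class WaitWithContext {

  /** Polls check every intervalMs until it is true; throws after timeoutMs. */
  static void waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.get()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Timed out after " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  /** On timeout, reports what was being waited on, then rethrows. */
  static void waitForWithContext(String context, Supplier<Boolean> check,
      long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    try {
      waitFor(check, intervalMs, timeoutMs);
    } catch (TimeoutException ex) {
      // Assumption: stderr stands in for the test class's LOG here.
      System.err.println("Timeout waiting for " + context);
      throw ex;
    }
  }
}
```

Rethrowing the original exception keeps the test failing exactly as before; 
the only change is the extra line identifying what the wait was for.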


> Pull back HDFS-11353, HDFS-13993, HDFS-13945, and HDFS-14324 to branch-2.10
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15125
>                 URL: https://issues.apache.org/jira/browse/HDFS-15125
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: hdfs
>    Affects Versions: 2.10.0
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Minor
>         Attachments: HDFS-15125-branch-2.10.001.patch, 
> HDFS-15125-branch-2.10.002.patch
>
>
> I would like to pull back some fixes for the DataNodeVolume* tests to resolve 
> some intermittent failures we are seeing on branch-2.10.
> The fixes are:
> HDFS-11353 Improve the unit tests relevant to DataNode volume failure testing
> HDFS-13993 TestDataNodeVolumeFailure#testTolerateVolumeFailuresAfterAddingMoreVolumes is flaky
> HDFS-14324 Fix TestDataNodeVolumeFailure
> HDFS-13945 TestDataNodeVolumeFailure is Flaky



--
This message was sent by Atlassian Jira
(v8.3.4#803005)