[
https://issues.apache.org/jira/browse/HDFS-17278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ayush Saxena resolved HDFS-17278.
---------------------------------
Fix Version/s: 3.4.0
Hadoop Flags: Reviewed
Resolution: Fixed
> Detect order dependent flakiness in TestViewfsWithNfs3.java under
> hadoop-hdfs-nfs module
> ----------------------------------------------------------------------------------------
>
> Key: HDFS-17278
> URL: https://issues.apache.org/jira/browse/HDFS-17278
> Project: Hadoop HDFS
> Issue Type: Bug
> Environment: openjdk version "17.0.9"
> Apache Maven 3.9.5
> Reporter: Ruby
> Assignee: Ruby
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: failed-1.png, failed-2.png, success.png
>
>
> The order dependent flakiness was detected if the test class
> TestDFSClientCache.java runs before TestRpcProgramNfs3.java.
> The error message looks like below:
> {code:java}
> [ERROR] Failures:
> [ERROR] TestRpcProgramNfs3.testAccess:279 Incorrect return code
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testCommit:764 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testCreate:493 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR]
> TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393
> Incorrect response: expected:<null> but
> was:<org.apache.hadoop.nfs.nfs3.response.WRITE3Response@42752a9b>
> [ERROR] TestRpcProgramNfs3.testFsinfo:714 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testFsstat:696 Incorrect return code:
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testGetattr:205 Incorrect return code
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testLookup:249 Incorrect return code
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testMkdir:517 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testPathconf:738 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRead:341 Incorrect return code: expected:<0>
> but was:<13>
> [ERROR] TestRpcProgramNfs3.testReaddir:642 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testReadlink:297 Incorrect return code:
> expected:<0> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRemove:570 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRename:618 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRmdir:594 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testSetattr:225 Incorrect return code
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testSymlink:546 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testWrite:468 Incorrect return code:
> expected:<13> but was:<5>
> [INFO]
> [ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0
> [INFO]
> [ERROR] There are test failures. {code}
> The polluter that led to this flakiness was the test method
> testGetUserGroupInformationSecure() in TestDFSClientCache.java. There was a
> line
> {code:java}
> UserGroupInformation.setLoginUser(currentUserUgi);{code}
> which modifies some shared state and resource, something like pre-setup the
> config. To fix this issue, I added the cleanup methods in
> TestDFSClientCache.java to reset the UserGroupInformation to ensure the
> isolation among each test class.
> {code:java}
> @AfterClass
> public static void cleanup() {
> UserGroupInformation.reset();
> }{code}
> Including setting
> {code:java}
> authenticationMethod = null;
> conf = null; // set configuration to null
> setLoginUser(null); // reset login user to default null{code}
> ..., and so on. The reset() methods can be referred to
> hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java.
> After the fix, the error was no longer exist and the succeed message was:
> {code:java}
> [INFO] -------------------------------------------------------
> [INFO] T E S T S
> [INFO] -------------------------------------------------------
> [INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO]
> [INFO] Results:
> [INFO]
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
> [INFO]
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO]
> ------------------------------------------------------------------------
> {code}
> Here is the CustomTest.java file that I used to run these two tests in order,
> the error can be reproduce by running this CustomTest.java.
> {code:java}
> package org.apache.hadoop.hdfs.nfs.nfs3;
> import org.junit.runner.RunWith;import org.junit.runners.Suite;
> @RunWith(Suite.class)
> @Suite.SuiteClasses({
> TestDFSClientCache.class,
> TestRpcProgramNfs3.class
> })
> public class CustomTest {} {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]