[ 
https://issues.apache.org/jira/browse/HDFS-17278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793958#comment-17793958
 ] 

ASF GitHub Bot commented on HDFS-17278:
---------------------------------------

yijut2 opened a new pull request, #6329:
URL: https://github.com/apache/hadoop/pull/6329

   <!--
     Thanks for sending a pull request!
       1. If this is your first time, please read our contributor guidelines: 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
       2. Make sure your PR title starts with JIRA issue id, e.g., 
'HADOOP-17799. Your PR title ...'.
   -->
   
   ### Description of PR
   The order dependent flakiness was detected if the test class 
`TestDFSClientCache.java` runs before `TestRpcProgramNfs3.java`.
   The error message looks like below:
   ```
   [ERROR] Failures: 
   [ERROR]   TestRpcProgramNfs3.testAccess:279 Incorrect return code 
expected:<0> but was:<13>
   [ERROR]   TestRpcProgramNfs3.testCommit:764 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testCreate:493 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   
TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393 Incorrect 
response:  expected:<null> but 
was:<org.apache.hadoop.nfs.nfs3.response.WRITE3Response@42752a9b>
   [ERROR]   TestRpcProgramNfs3.testFsinfo:714 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testFsstat:696 Incorrect return code: 
expected:<0> but was:<13>
   [ERROR]   TestRpcProgramNfs3.testGetattr:205 Incorrect return code 
expected:<0> but was:<13>
   [ERROR]   TestRpcProgramNfs3.testLookup:249 Incorrect return code 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testMkdir:517 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testPathconf:738 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testRead:341 Incorrect return code: 
expected:<0> but was:<13>
   [ERROR]   TestRpcProgramNfs3.testReaddir:642 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testReadlink:297 Incorrect return code: 
expected:<0> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testRemove:570 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testRename:618 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testRmdir:594 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testSetattr:225 Incorrect return code 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testSymlink:546 Incorrect return code: 
expected:<13> but was:<5>
   [ERROR]   TestRpcProgramNfs3.testWrite:468 Incorrect return code: 
expected:<13> but was:<5>
   [INFO] 
   [ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0
   [INFO] 
   [ERROR] There are test failures. 
   ```
   The polluter that led to this flakiness was the test method
   `testGetUserGroupInformationSecure()` in `TestDFSClientCache.java`. There 
was a line `UserGroupInformation.setLoginUser(currentUserUgi);`
   which modifies some shared state and resource, something like pre-setup the 
config. To fix this issue, I added the cleanup methods in 
`TestDFSClientCache.java` to reset the `UserGroupInformation` to ensure the 
isolation among each test class.
   ```
   @AfterClass
   public static void cleanup() {
       UserGroupInformation.reset();
   }
   ```
   Including setting
   ```
   authenticationMethod = null;
   conf = null; // set configuration to null
   setLoginUser(null); // reset login user to default null
   ```
   ..., and so on. The `reset()` methods can be referred to 
`hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java`.
   After the fix, the error was no longer exist and the succeed message was:
   ```
   [INFO] -------------------------------------------------------
   [INFO]  T E S T S
   [INFO] -------------------------------------------------------
   [INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
   [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] 
------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ----------------------------------------------------------------------

> Detect order dependent flakiness in TestViewfsWithNfs3.java under 
> hadoop-hdfs-nfs module
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-17278
>                 URL: https://issues.apache.org/jira/browse/HDFS-17278
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>         Environment: openjdk version "17.0.9"
> Apache Maven 3.9.5
>            Reporter: Ruby
>            Priority: Minor
>         Attachments: failed-1.png, failed-2.png, success.png
>
>
> The order dependent flakiness was detected if the test class 
> TestDFSClientCache.java runs before TestRpcProgramNfs3.java.
> The error message looks like below:
> {code:java}
> [ERROR] Failures: 
> [ERROR]   TestRpcProgramNfs3.testAccess:279 Incorrect return code 
> expected:<0> but was:<13>
> [ERROR]   TestRpcProgramNfs3.testCommit:764 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testCreate:493 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   
> TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393 
> Incorrect response:  expected:<null> but 
> was:<org.apache.hadoop.nfs.nfs3.response.WRITE3Response@42752a9b>
> [ERROR]   TestRpcProgramNfs3.testFsinfo:714 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testFsstat:696 Incorrect return code: 
> expected:<0> but was:<13>
> [ERROR]   TestRpcProgramNfs3.testGetattr:205 Incorrect return code 
> expected:<0> but was:<13>
> [ERROR]   TestRpcProgramNfs3.testLookup:249 Incorrect return code 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testMkdir:517 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testPathconf:738 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testRead:341 Incorrect return code: expected:<0> 
> but was:<13>
> [ERROR]   TestRpcProgramNfs3.testReaddir:642 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testReadlink:297 Incorrect return code: 
> expected:<0> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testRemove:570 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testRename:618 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testRmdir:594 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testSetattr:225 Incorrect return code 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testSymlink:546 Incorrect return code: 
> expected:<13> but was:<5>
> [ERROR]   TestRpcProgramNfs3.testWrite:468 Incorrect return code: 
> expected:<13> but was:<5>
> [INFO] 
> [ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0
> [INFO] 
> [ERROR] There are test failures. {code}
> The polluter that led to this flakiness was the test method
> testGetUserGroupInformationSecure() in TestDFSClientCache.java. There was a 
> line 
> {code:java}
> UserGroupInformation.setLoginUser(currentUserUgi);{code}
> which modifies some shared state and resource, something like pre-setup the 
> config. To fix this issue, I added the cleanup methods in 
> TestDFSClientCache.java to reset the UserGroupInformation to ensure the 
> isolation among each test class.
> {code:java}
> @AfterClass
> public static void cleanup() {
>     UserGroupInformation.reset();
> }{code}
> Including setting
> {code:java}
> authenticationMethod = null;
> conf = null; // set configuration to null
> setLoginUser(null); // reset login user to default null{code}
> ..., and so on. The reset() methods can be referred to 
> hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java.
> After the fix, the error was no longer exist and the succeed message was:
> {code:java}
> [INFO] -------------------------------------------------------
> [INFO]  T E S T S
> [INFO] -------------------------------------------------------
> [INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO] 
> [INFO] Results:
> [INFO] 
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
> [INFO] 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------ 
> {code}
> Here is the CustomTest.java file that I used to run these two tests in order, 
> the error can be reproduce by running this CustomTest.java. 
> {code:java}
> package org.apache.hadoop.hdfs.nfs.nfs3;
> import org.junit.runner.RunWith;import org.junit.runners.Suite;
> @RunWith(Suite.class)
> @Suite.SuiteClasses({
>     TestDFSClientCache.class,
>     TestRpcProgramNfs3.class
> })
> public class CustomTest {} {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to