[
https://issues.apache.org/jira/browse/HDFS-17278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793958#comment-17793958
]
ASF GitHub Bot commented on HDFS-17278:
---------------------------------------
yijut2 opened a new pull request, #6329:
URL: https://github.com/apache/hadoop/pull/6329
### Description of PR
An order-dependent flaky failure occurs when the test class
`TestDFSClientCache.java` runs before `TestRpcProgramNfs3.java`.
The failures look like this:
```
[ERROR] Failures:
[ERROR] TestRpcProgramNfs3.testAccess:279 Incorrect return code expected:<0> but was:<13>
[ERROR] TestRpcProgramNfs3.testCommit:764 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testCreate:493 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393 Incorrect response: expected:<null> but was:<org.apache.hadoop.nfs.nfs3.response.WRITE3Response@42752a9b>
[ERROR] TestRpcProgramNfs3.testFsinfo:714 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testFsstat:696 Incorrect return code: expected:<0> but was:<13>
[ERROR] TestRpcProgramNfs3.testGetattr:205 Incorrect return code expected:<0> but was:<13>
[ERROR] TestRpcProgramNfs3.testLookup:249 Incorrect return code expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testMkdir:517 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testPathconf:738 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testRead:341 Incorrect return code: expected:<0> but was:<13>
[ERROR] TestRpcProgramNfs3.testReaddir:642 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testReadlink:297 Incorrect return code: expected:<0> but was:<5>
[ERROR] TestRpcProgramNfs3.testRemove:570 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testRename:618 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testRmdir:594 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testSetattr:225 Incorrect return code expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testSymlink:546 Incorrect return code: expected:<13> but was:<5>
[ERROR] TestRpcProgramNfs3.testWrite:468 Incorrect return code: expected:<13> but was:<5>
[INFO]
[ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0
[INFO]
[ERROR] There are test failures.
```
The polluter is the test method `testGetUserGroupInformationSecure()` in
`TestDFSClientCache.java`. It calls
`UserGroupInformation.setLoginUser(currentUserUgi);`, which changes static
state in `UserGroupInformation` that is shared by every test class running in
the same JVM and is never restored. To fix this, I added a class-level cleanup
method to `TestDFSClientCache.java` that resets `UserGroupInformation`, so that
the state does not leak into later test classes.
```
@AfterClass
public static void cleanup() {
  // Clear the static login state so later test classes start clean.
  UserGroupInformation.reset();
}
```
`reset()` clears the static fields, including:
```
authenticationMethod = null;
conf = null; // set configuration to null
setLoginUser(null); // reset login user to default null
```
..., and so on. The `reset()` method is defined in
`hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java`.
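To illustrate the mechanism in isolation, here is a minimal plain-Java sketch that does not use Hadoop or JUnit. `FakeLoginState`, `polluter()`, and `victimPasses()` are hypothetical names invented for this illustration. They only mimic the pattern described above (a static login user that one test sets and another test implicitly relies on) and are not the actual Hadoop classes or tests.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderDependenceSketch {

    /** Hypothetical stand-in for a class that keeps a process-wide login user. */
    static final class FakeLoginState {
        private static String loginUser; // null means "not set, use the default"

        static void setLoginUser(String user) { loginUser = user; }

        static String currentUser() { return loginUser == null ? "default" : loginUser; }

        /** Restores the pristine state (assumed analogue of a reset hook). */
        static void reset() { loginUser = null; }
    }

    /** Polluter: overrides the shared login user and never restores it. */
    public static void polluter() {
        FakeLoginState.setLoginUser("secure-test-user");
    }

    /** Victim: passes only if the shared state is still pristine. */
    public static boolean victimPasses() {
        return "default".equals(FakeLoginState.currentUser());
    }

    /** Runs the victim alone, after the polluter, and after polluter plus cleanup. */
    public static List<Boolean> runScenarios() {
        List<Boolean> results = new ArrayList<>();

        FakeLoginState.reset();
        results.add(victimPasses());   // victim alone

        polluter();
        results.add(victimPasses());   // polluter first, no cleanup

        polluter();
        FakeLoginState.reset();        // the class-level cleanup step
        results.add(victimPasses());   // polluter first, with cleanup

        return results;
    }

    public static void main(String[] args) {
        System.out.println(runScenarios());
    }
}
```

Running `main` prints `[true, false, true]`: the victim passes on its own, fails once the polluter has run, and passes again when the polluter's state is reset before the victim starts.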
After the fix, the failures no longer occur and the run succeeds:
```
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
[INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
```
> Detect order dependent flakiness in TestViewfsWithNfs3.java under
> hadoop-hdfs-nfs module
> ----------------------------------------------------------------------------------------
>
> Key: HDFS-17278
> URL: https://issues.apache.org/jira/browse/HDFS-17278
> Project: Hadoop HDFS
> Issue Type: New Feature
> Environment: openjdk version "17.0.9"
> Apache Maven 3.9.5
> Reporter: Ruby
> Priority: Minor
> Attachments: failed-1.png, failed-2.png, success.png
>
>
> An order-dependent flaky failure occurs when the test class
> TestDFSClientCache.java runs before TestRpcProgramNfs3.java.
> The failures look like this:
> {code:java}
> [ERROR] Failures:
> [ERROR] TestRpcProgramNfs3.testAccess:279 Incorrect return code
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testCommit:764 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testCreate:493 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR]
> TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393
> Incorrect response: expected:<null> but
> was:<org.apache.hadoop.nfs.nfs3.response.WRITE3Response@42752a9b>
> [ERROR] TestRpcProgramNfs3.testFsinfo:714 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testFsstat:696 Incorrect return code:
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testGetattr:205 Incorrect return code
> expected:<0> but was:<13>
> [ERROR] TestRpcProgramNfs3.testLookup:249 Incorrect return code
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testMkdir:517 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testPathconf:738 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRead:341 Incorrect return code: expected:<0>
> but was:<13>
> [ERROR] TestRpcProgramNfs3.testReaddir:642 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testReadlink:297 Incorrect return code:
> expected:<0> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRemove:570 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRename:618 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testRmdir:594 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testSetattr:225 Incorrect return code
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testSymlink:546 Incorrect return code:
> expected:<13> but was:<5>
> [ERROR] TestRpcProgramNfs3.testWrite:468 Incorrect return code:
> expected:<13> but was:<5>
> [INFO]
> [ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0
> [INFO]
> [ERROR] There are test failures. {code}
> The polluter is the test method testGetUserGroupInformationSecure() in
> TestDFSClientCache.java. It contains the line
> {code:java}
> UserGroupInformation.setLoginUser(currentUserUgi);{code}
> which changes static state in UserGroupInformation that is shared by every
> test class running in the same JVM and is never restored. To fix this, I
> added a class-level cleanup method to TestDFSClientCache.java that resets
> UserGroupInformation, so that the state does not leak into later test
> classes.
> {code:java}
> @AfterClass
> public static void cleanup() {
> UserGroupInformation.reset();
> }{code}
> reset() clears the static fields, including:
> {code:java}
> authenticationMethod = null;
> conf = null; // set configuration to null
> setLoginUser(null); // reset login user to default null{code}
> ..., and so on. The reset() method is defined in
> hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java.
> After the fix, the failures no longer occur and the run succeeds:
> {code:java}
> [INFO] -------------------------------------------------------
> [INFO] T E S T S
> [INFO] -------------------------------------------------------
> [INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
> 18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest
> [INFO]
> [INFO] Results:
> [INFO]
> [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0
> [INFO]
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO]
> ------------------------------------------------------------------------
> {code}
> Here is the CustomTest.java file that I used to run these two test classes
> in order; running it reproduces the failures.
> {code:java}
> package org.apache.hadoop.hdfs.nfs.nfs3;
>
> import org.junit.runner.RunWith;
> import org.junit.runners.Suite;
>
> @RunWith(Suite.class)
> @Suite.SuiteClasses({
>   TestDFSClientCache.class,
>   TestRpcProgramNfs3.class
> })
> public class CustomTest {}
> {code}