[
https://issues.apache.org/jira/browse/HADOOP-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819215#comment-17819215
]
ASF GitHub Bot commented on HADOOP-19071:
-----------------------------------------
steveloughran commented on PR #6537:
URL: https://github.com/apache/hadoop/pull/6537#issuecomment-1956544980
This is not good.
But looking at the failures, I don't know whether to categorise them as "test
runner regression" or "brittle tests failing under the new test runner".
Here are some of the ones I've looked at:
`TestDirectoryScanner.testThrottling`
This test measures how long things take; it is far too brittle against
timing changes, both slower and faster.
```
java.lang.AssertionError: Throttle is too permissive
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:901)
```
I think the first step here is to move to AssertJ so the asserts fail with
meaningful messages, then see whether the failure can be understood. Ideally
you'd want a test which doesn't measure elapsed time, but instead uses counters
in the code (here: of throttle events) to assert what took place.
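The counter-based idea could look something like the sketch below. The scanner loop, the counter name, and the expected count are all hypothetical illustrations, not the real `TestDirectoryScanner` code; the point is that the assertion depends on a counted event, not on wall-clock time:

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThrottleCounterDemo {
    // Hypothetical counter, incremented whenever the scan loop throttles.
    static final AtomicLong throttleEvents = new AtomicLong();

    // Hypothetical scan loop: throttle once every throttleEvery blocks.
    static void scanWithThrottling(int blocks, int throttleEvery) {
        for (int i = 1; i <= blocks; i++) {
            if (i % throttleEvery == 0) {
                throttleEvents.incrementAndGet(); // count instead of timing
            }
        }
    }

    public static void main(String[] args) {
        scanWithThrottling(100, 10);
        long events = throttleEvents.get();
        // Timing-independent assertion with a meaningful failure message.
        if (events != 10) {
            throw new AssertionError(
                "expected 10 throttle events but got " + events);
        }
        System.out.println("throttle events: " + events);
    }
}
```

A test written this way passes or fails the same whether the runner makes the code slower or faster.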
Test `TestBlockListAsLongs.testFuzz`
I see this painfully often elsewhere; it means that the protobuf lib was
built with a more recent version of Java 8 than the early Oracle ones. It's
fixable in your own build (use the older one) or by casting the ByteBuffer to
Buffer; otherwise we need to make sure the tests run on a more recent build.
```
java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
at org.apache.hadoop.thirdparty.protobuf.IterableByteBufferInputStream.read(IterableByteBufferInputStream.java:143)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.read(CodedInputStream.java:2080)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2831)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.refillBuffer(CodedInputStream.java:2777)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawByte(CodedInputStream.java:2859)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64SlowPath(CodedInputStream.java:2648)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64(CodedInputStream.java:2641)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readSInt64(CodedInputStream.java:2497)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:419)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:397)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:375)
at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.checkReport(TestBlockListAsLongs.java:156)
at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.testFuzz(TestBlockListAsLongs.java:139)
```
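For reference, the Buffer cast workaround can be sketched as a standalone demo (not the actual protobuf code). Casting to the `Buffer` supertype pins the call to the `position(int)` overload that exists on Java 8, avoiding the covariant-return `ByteBuffer.position(int)` that newer JDKs compile against:

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class BufferCastDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        // If position(int) is compiled against a newer JDK, the call site
        // references ByteBuffer.position(I)Ljava/nio/ByteBuffer;, which is
        // missing on Java 8 and fails with NoSuchMethodError at runtime.
        // Casting to Buffer forces the Java 8-compatible signature.
        ((Buffer) buf).position(4);
        System.out.println(buf.position()); // prints 4
    }
}
```

The same cast applied at every such call site (or building protobuf with an older JDK) makes the bytecode runnable on Java 8.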
Test `TestDFSAdmin.testDecommissionDataNodesReconfig`
```
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:87)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.junit.Assert.assertTrue(Assert.java:53)
at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testDecommissionDataNodesReconfig(TestDFSAdmin.java:1356)
```
Not a very meaningful message; I suspect that a different ordering of the
threads is causing the assert to fail.
1. Move to AssertJ.
2. Analyse the error and see what the fix is.
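As an illustration of point 1: a failure that carries the actual values is far easier to triage than a bare `assertTrue`. This is a minimal hand-rolled sketch (the helper and its message are hypothetical); AssertJ's `assertThat(...).as(...)` gives the same effect with less code:

```java
public class MeaningfulAssertDemo {
    // Hypothetical helper: unlike a bare assertTrue(condition), the
    // failure message preserves the values under comparison.
    static void assertEquals(String what, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(
                what + ": expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        try {
            assertEquals("decommissioning node count", 1, 2);
        } catch (AssertionError e) {
            // The failure now says what went wrong, not just "AssertionError".
            System.out.println(e.getMessage());
        }
    }
}
```

With a message like this in the surefire report, the thread-ordering hypothesis could be confirmed or ruled out from the failure alone.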
Test `TestCacheDirectives`.
```
at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:403)
at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:362)
at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.waitForCachedBlocks(TestCacheDirectives.java:760)
at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.teardown(TestCacheDirectives.java:173)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
```
This is a timeout during teardown; after this, subsequent tests are likely to
fail too. No obvious cause, though again I'd suspect race conditions.
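For context, this is roughly what `GenericTestUtils.waitFor` does: poll a condition at an interval until it holds or a deadline passes, then fail with a timeout. A minimal stand-alone equivalent (names and timings are illustrative, not Hadoop's exact signature) looks like:

```java
import java.util.function.Supplier;

public class WaitForDemo {
    // Minimal stand-in for a waitFor-style helper: poll check until it
    // returns true, or throw once timeoutMs has elapsed.
    static void waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.get()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("Timed out waiting for condition");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50ms; well inside the 1s deadline.
        waitFor(() -> System.currentTimeMillis() - start > 50, 10, 1000);
        System.out.println("condition met");
    }
}
```

A teardown failing here means the polled condition never became true within the deadline, which is why a race elsewhere in the test is a plausible root cause.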
Rather than say "hey, let's revert", I'd propose treating this as "surefire
update triggers test failures" and seeing what can be done about addressing
them, because we can't stay frozen on surefire versions.
> Update maven-surefire-plugin from 3.0.0 to 3.2.5
> -------------------------------------------------
>
> Key: HADOOP-19071
> URL: https://issues.apache.org/jira/browse/HADOOP-19071
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: build, common
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Shilun Fan
> Assignee: Shilun Fan
> Priority: Major
> Labels: pull-request-available
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)