steveloughran commented on PR #6537:
URL: https://github.com/apache/hadoop/pull/6537#issuecomment-1956544980
This is not good.
But looking at the failures I don't know whether to categorise them as "test
runner regression" or "brittle tests failing under new test runner".
Here are some of the ones I've looked at:
`TestDirectoryScanner.testThrottling`
This test measures how long things took. It is way too brittle against
timing changes, both slower and faster.
```
java.lang.AssertionError: Throttle is too permissive
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:901)
```
I think the step here is to move to AssertJ so asserts fail with meaningful
messages, then see if the failure can be understood. Ideally you'd want a test
which doesn't measure elapsed time, but instead uses counters in the code
(here: of throttle events) to assert what took place.
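To make the counter idea concrete, here is a minimal sketch in plain Java. `CountingThrottle` and `ThrottleCounterSketch` are hypothetical names invented for illustration, not the `DirectoryScanner` code; the assumption is only that the throttle can expose a count of the times it made a caller wait.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical throttle that counts how often it would make a caller wait. */
final class CountingThrottle {
  private final int permitsPerWindow;
  private int usedInWindow = 0;
  private final AtomicLong throttleEvents = new AtomicLong();

  CountingThrottle(int permitsPerWindow) {
    this.permitsPerWindow = permitsPerWindow;
  }

  /** Takes one permit; an exhausted window records a throttle event and resets. */
  synchronized void acquire() {
    if (usedInWindow == permitsPerWindow) {
      // Real code would block here until the window rolls over;
      // the sketch only records that throttling took place.
      throttleEvents.incrementAndGet();
      usedInWindow = 0;
    }
    usedInWindow++;
  }

  long getThrottleEvents() {
    return throttleEvents.get();
  }
}

final class ThrottleCounterSketch {
  /** Runs the given number of acquisitions and returns the throttle event count. */
  static long run(int permitsPerWindow, int operations) {
    CountingThrottle throttle = new CountingThrottle(permitsPerWindow);
    for (int i = 0; i < operations; i++) {
      throttle.acquire();
    }
    return throttle.getThrottleEvents();
  }

  public static void main(String[] args) {
    long events = run(10, 100);
    // The check depends on what happened, not on how long it took.
    if (events != 9) {
      throw new AssertionError("Expected 9 throttle events for 100 operations"
          + " at 10 permits per window, but saw " + events);
    }
    System.out.println("throttle events: " + events);
  }
}
```

An assertion on the event count gives the same answer on a fast or a slow machine, which is the property the elapsed-time check lacks.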
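Test `TestBlockListAsLongs.testFuzz`
I see this painfully often elsewhere. It means that the protobuf lib was
built against a newer JDK than the one running the tests:
`ByteBuffer.position(int)` only returns `ByteBuffer` from Java 9 onwards; on
Java 8 the method is inherited from `Buffer` and returns `Buffer`. It's
fixable in your own build (build against the older JDK) or by casting
`ByteBuffer` to `Buffer` before the call. Otherwise we need to make sure the
tests are on a more recent build.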
```
java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
at org.apache.hadoop.thirdparty.protobuf.IterableByteBufferInputStream.read(IterableByteBufferInputStream.java:143)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.read(CodedInputStream.java:2080)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.tryRefillBuffer(CodedInputStream.java:2831)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.refillBuffer(CodedInputStream.java:2777)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawByte(CodedInputStream.java:2859)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64SlowPath(CodedInputStream.java:2648)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readRawVarint64(CodedInputStream.java:2641)
at org.apache.hadoop.thirdparty.protobuf.CodedInputStream$StreamDecoder.readSInt64(CodedInputStream.java:2497)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:419)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:397)
at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:375)
at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.checkReport(TestBlockListAsLongs.java:156)
at org.apache.hadoop.hdfs.protocol.TestBlockListAsLongs.testFuzz(TestBlockListAsLongs.java:139)
```
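For reference, the cast workaround for the `NoSuchMethodError` above looks roughly like this. It is a sketch of the general technique, not the protobuf source; `BufferCastSketch` and `positionCompat` are names made up for the example.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

final class BufferCastSketch {
  /** Sets the position through the Buffer-typed method so the call links on Java 8 and later. */
  static ByteBuffer positionCompat(ByteBuffer buf, int newPosition) {
    // The cast makes javac bind the call to Buffer.position(int), which
    // exists in every Java version, instead of the ByteBuffer-returning
    // override that only appeared in Java 9.
    ((Buffer) buf).position(newPosition);
    return buf;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    positionCompat(buf, 4);
    System.out.println(buf.position()); // prints 4
  }
}
```

The same pattern applies to the other `Buffer` methods that gained covariant overrides in Java 9, such as `flip()`, `limit(int)` and `rewind()`.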
Test `TestDFSAdmin.testDecommissionDataNodesReconfig`
```
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:87)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.junit.Assert.assertTrue(Assert.java:53)
at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testDecommissionDataNodesReconfig(TestDFSAdmin.java:1356)
```
Not a very meaningful message. I suspect that a different ordering of the
threads is causing the assert to fail.
1. Move to AssertJ.
2. Analyse the error, see what the fix is.
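To show the kind of failure message this is after, here is a dependency-free sketch. I haven't checked what the assert at `TestDFSAdmin.java:1356` actually tests, so the "find a line in the command output" check, the helper name `assertAnyLineContains` and the sample strings are all assumptions for illustration. AssertJ's `assertThat` assertions generate this sort of message for you.

```java
import java.util.Arrays;
import java.util.List;

final class DescriptiveAssertSketch {
  /** Fails with the full output in the message when no line contains the expected text. */
  static void assertAnyLineContains(List<String> lines, String expected) {
    for (String line : lines) {
      if (line.contains(expected)) {
        return;
      }
    }
    // A bare assertTrue(...) would only say "AssertionError";
    // this says what was looked for and what was actually there.
    throw new AssertionError("Expected a line containing \"" + expected
        + "\" but output was: " + lines);
  }

  public static void main(String[] args) {
    // Hypothetical command output, invented for the example.
    List<String> output = Arrays.asList("node-1: SUCCESS", "node-2: pending");
    assertAnyLineContains(output, "SUCCESS");
    try {
      assertAnyLineContains(output, "FAILED");
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With the observed output in the message, a thread-ordering problem shows up in the test report itself instead of needing a rerun under a debugger.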
Test `TestCacheDirectives`.
```
at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:403)
at org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:362)
at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.waitForCachedBlocks(TestCacheDirectives.java:760)
at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.teardown(TestCacheDirectives.java:173)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
```
This is a timeout during teardown; after this, subsequent tests are possibly
going to fail. No obvious cause, though again I'd suspect race conditions.
Rather than say "hey, let's revert", I'd propose filing a "surefire update
triggers test failures" issue and seeing what can be done about addressing
them, because we can't stay frozen on old surefire versions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.