Re: [VOTE] Release Apache Hadoop 3.1.4 (RC2)
+1, with the instruction "warn everyone about the guava update possibly
breaking things at run time".

The key issues being:

* code compiled with the new guava release will not link against the
  older releases, even without any changes in the source files.
* this includes hadoop-common.

Applications which exclude the guava dependency published by the hadoop
artifacts to use their own must set guava.version=27.0-jre or
guava.version=27.0 to be consistent with that of this release.

My tests were all with using the artifacts downstream via maven; I
trust others to look at the big tarball release.

*Project 1: cloudstore*

This is my extra diagnostics and cloud utils module:
https://github.com/steveloughran/cloudstore

All compiled fine, but the tests failed on guava linkage:

  testNoOverwriteDest(org.apache.hadoop.tools.cloudup.TestLocalCloudup)
  Time elapsed: 0.012 sec  <<< ERROR!
  java.lang.NoSuchMethodError: 'void
  com.google.common.base.Preconditions.checkArgument(boolean,
  java.lang.String, java.lang.Object, java.lang.Object)'
    at org.apache.hadoop.fs.tools.cloudup.Cloudup.run(Cloudup.java:177)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.tools.store.StoreTestUtils.exec(StoreTestUtils.java:4

Note: that app is designed to run against hadoop branch-2 and other
branches, so I ended up reimplementing the checkArgument and checkState
calls so that I can have a binary which links everywhere. My code,
nothing serious.
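Roughly, such a version-independent replacement looks like the sketch
below. This is illustrative only, not the actual cloudstore code; it
relies on the fact that guava's message templates use %s placeholders,
which String.format also accepts:

  // Sketch: guava-free checkArgument/checkState which link everywhere,
  // because only the (boolean, String, Object...) shape is ever
  // compiled into the caller's .class files.
  public final class Checks {
    private Checks() {
    }

    public static void checkArgument(boolean condition,
        String format, Object... args) {
      if (!condition) {
        throw new IllegalArgumentException(String.format(format, args));
      }
    }

    public static void checkState(boolean condition,
        String format, Object... args) {
      if (!condition) {
        throw new IllegalStateException(String.format(format, args));
      }
    }
  }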
*Project 2: Spark*

Apache spark main branch, built with maven (not tried the SBT build):

  mvn -T 1 -Phadoop-3.2 -Dhadoop.version=3.1.4 -Psnapshots-and-staging
    -Phadoop-cloud,yarn,kinesis-asl -DskipTests clean package

All good. Then I ran the committer unit test suite:

  mvn -T 1 -Phadoop-3.2 -Dhadoop.version=3.1.4
    -Phadoop-cloud,yarn,kinesis-asl -Psnapshots-and-staging
    -pl hadoop-cloud test

  CommitterBindingSuite:
  *** RUN ABORTED ***
  java.lang.NoSuchMethodError: 'void
  com.google.common.base.Preconditions.checkArgument(boolean,
  java.lang.String, java.lang.Object)'
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
  at org.apache.spark.internal.io.cloud.CommitterBindingSuite.newJob(CommitterBindingSuite.scala:89)
  at org.apache.spark.internal.io.cloud.CommitterBindingSuite.$anonfun$new$1(CommitterBindingSuite.scala:55)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
  ...

Fix: again, tell the build this is a later version of Guava:

  mvn -T 1 -Phadoop-3.2 -Dhadoop.version=3.1.4
    -Phadoop-cloud,yarn,kinesis-asl -Psnapshots-and-staging
    -pl hadoop-cloud -Dguava.version=27.0-jre test

The mismatch doesn't break spark internally (they shade their guava
anyway); the guava.version here is actually the one which hadoop is to
be linked with.

Outcome: tests work.

  [INFO] --- scalatest-maven-plugin:2.0.0:test (test) @ spark-hadoop-cloud_2.12 ---
  Discovery starting.
  Discovery completed in 438 milliseconds.
  Run starting. Expected test count is: 4
  CommitterBindingSuite:
  - BindingParquetOutputCommitter binds to the inner committer
  - committer protocol can be serialized and deserialized
  - local filesystem instantiation
  - reject dynamic partitioning
  Run completed in 1 second, 411 milliseconds.
  Total number of tests run: 4
  Suites: completed 2, aborted 0
  Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0

This is a real PITA, and it's invariably those checkArgument calls,
because the later guava versions added some overloaded methods. Compile
existing source with a later guava version and the .class no longer
binds to the older guava version, even though no new guava APIs have
been adopted; the sketch at the end of this mail shows why.

I am really tempted to go through src/**/*.java and replace all guava
checkArgument/checkState calls with our own implementation in
hadoop-common, at least for any which use the vararg variant. But it'd
be a big change and there may be related issues elsewhere. At least now
things fail fast.

*Project 3: spark cloud integration*

https://github.com/hortonworks-spark/cloud-integration

This is where the functional tests for the s3a committer through spark
live.

  -Dhadoop.version=3.1.2 -Dspark.version=3.1.0-SNAPSHOT
  -Psnapshots-and-staging

and a full test run:

  mvn test -Dcloud.test.configuration.file=../test-configs/s3a.xml
    -pl cloud-examples -Dhadoop.version=3.1.2
    -Dspark.version=3.1.0-SNAPSHOT -Psnapshots-and-staging

All good. A couple of test failures, but that was because one of my
test datasets is not on any bucket I have... will have to fix that.

To conclude: the artefacts are all there, existing code
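As promised, a minimal illustration of why recompiling unchanged source
breaks linkage (a sketch, not hadoop code): javac resolves a call to
the most specific overload visible at compile time and bakes that exact
signature into the .class file.

  import com.google.common.base.Preconditions;

  public class LinkageDemo {
    public static void main(String[] args) {
      // Compiled against a recent guava, javac selects the dedicated
      // overload
      //   checkArgument(boolean, String, Object, Object)
      // Old guava releases ship only the varargs form
      //   checkArgument(boolean, String, Object...)
      // so running this class against an old guava jar throws
      // NoSuchMethodError, even though the source never changed.
      Preconditions.checkArgument(args.length == 2,
          "expected 2 arguments, got %s (%s)",
          args.length, String.join(" ", args));
    }
  }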
Re: [VOTE] Release Apache Hadoop 3.1.4 (RC2)
Mukund, thank you for running these tests. Both of them are things
we've fixed, and in both cases the problems are in the tests, not the
production code.

On Wed, 1 Jul 2020 at 14:22, Mukund Madhav Thakur wrote:

> Compiled the distribution using mvn package -Pdist -DskipTests
> -Dmaven.javadoc.skip=true -DskipShade and ran some hadoop fs
> commands. All good there.
>
> Then I ran the hadoop-aws tests and saw the following failures:
>
> [ERROR] Failures:
>
> [ERROR]
> ITestS3AMiscOperations.testEmptyFileChecksums:147->Assert.assertEquals:118->Assert.failNotEquals:743->Assert.fail:88
> checksums expected: but was:
>
> [ERROR]
> ITestS3AMiscOperations.testNonEmptyFileChecksumsUnencrypted:199->Assert.assertEquals:118->Assert.failNotEquals:743->Assert.fail:88
> checksums expected: but was:

You've got a bucket encrypting things, so checksums come back
different. We've tweaked those tests so that on 3.3 we look at the
bucket and skip the test if there's any default encryption policy (a
sketch of the idea is at the end of this mail):
https://issues.apache.org/jira/browse/HADOOP-16319

> These were the same failures which I saw in RC0 as well. I think
> these are known failures.
>
> Apart from that, all of my AssumedRole tests are failing with an
> AccessDenied exception like:
>
> [ERROR]
> testPartialDeleteSingleDelete(org.apache.hadoop.fs.s3a.auth.ITestAssumeRole)
> Time elapsed: 3.359 s <<< ERROR!
>
> org.apache.hadoop.fs.s3a.AWSServiceIOException: initTable on
> mthakur-data:
> com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException:
> User: arn:aws:sts::152813717728:assumed-role/mthakur-assumed-role/valid
> is not authorized to perform: dynamodb:DescribeTable on resource:
> arn:aws:dynamodb:ap-south-1:152813717728:table/mthakur-data (Service:
> AmazonDynamoDBv2; Status Code: 400; Error Code: AccessDeniedException;
> Request ID: UJLKVGJ9I1S9TQF3AEPHVGENVJVV4KQNSO5AEMVJF66Q9ASUAAJG):
> User: arn:aws:sts::152813717728:assumed-role/mthakur-assumed-role/valid
> is not authorized to perform: dynamodb:DescribeTable on resource:
> arn:aws:dynamodb:ap-south-1:152813717728:table/mthakur-data (Service:
> AmazonDynamoDBv2; Status Code: 400; Error Code: AccessDeniedException;
> Request ID: UJLKVGJ9I1S9TQF3AEPHVGENVJVV4KQNSO5AEMVJF66Q9ASUAAJG)
>
> at
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole.executePartialDelete(ITestAssumeRole.java:759)
>
> at
> org.apache.hadoop.fs.s3a.auth.ITestAssumeRole.testPartialDeleteSingleDelete(ITestAssumeRole.java:735)
>
> I checked my policy and could verify that dynamodb:DescribeTable
> access is present there.
>
> So just to cross-check, I ran the AssumedRole tests with the same
> configs against apache/trunk and they succeeded. Not sure if this is
> a false alarm, but I think it would be better if someone else ran
> these AssumedRole tests as well to verify.

That's https://issues.apache.org/jira/browse/HADOOP-15583; nothing to
worry about.
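Going back to the checksum failures, the shape of the HADOOP-16319 fix
is roughly this (a sketch only, not the actual patch; the helper
getDefaultBucketEncryption() is illustrative, not a real API):

  import org.junit.Assume;

  public class EncryptedBucketGuard {
    /** Skip the calling test when the bucket encrypts objects by default. */
    public static void skipIfBucketEncrypted(String bucket) {
      // hypothetical lookup of the bucket's default encryption policy;
      // in practice this queries S3 for the bucket configuration
      String policy = getDefaultBucketEncryption(bucket);
      Assume.assumeTrue("bucket " + bucket + " has default encryption "
          + policy + "; checksums will differ", policy == null);
    }

    // placeholder for the illustrative helper
    private static String getDefaultBucketEncryption(String bucket) {
      return null;
    }
  }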
Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86
For more details, see
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/735/

[Jul 1, 2020 9:47:52 AM] (ayushsaxena) HADOOP-17090. Increase precommit
job timeout from 5 hours to 20 hours.

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint jshint pathlen unit xml

The following subsystems voted -1 but were configured to be
filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running
(runtime bigger than 1h 0m 0s):
    unit

Specific tests:

XML : Parsing Error(s):

    hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
    hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
    hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
    hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

findbugs : module:hadoop-common-project/hadoop-minikdc

    Possible null pointer dereference in
    org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value
    of called method. At MiniKdc.java:[line 515]

findbugs : module:hadoop-common-project/hadoop-auth

    org.apache.hadoop.security.authentication.server.MultiSchemeAuthenticationHandler.authenticate(HttpServletRequest,
    HttpServletResponse) makes inefficient use of keySet iterator
    instead of entrySet iterator. At
    MultiSchemeAuthenticationHandler.java:[line 192]

findbugs : module:hadoop-common-project/hadoop-common

    org.apache.hadoop.crypto.CipherSuite.setUnknownValue(int)
    unconditionally sets the field unknownValue. At
    CipherSuite.java:[line 44]

    org.apache.hadoop.crypto.CryptoProtocolVersion.setUnknownValue(int)
    unconditionally sets the field unknownValue. At
    CryptoProtocolVersion.java:[line 67]

    Possible null pointer dereference in
    org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to return
    value of called method. At FileUtil.java:[line 118]

    Possible null pointer dereference in
    org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path,
    File, Path, File) due to return value of called method. At
    RawLocalFileSystem.java:[line 383]

    Useless condition: lazyPersist == true at this point. At
    CommandWithDestination.java:[line 502]

    org.apache.hadoop.io.DoubleWritable.compareTo(DoubleWritable)
    incorrectly handles double value. At DoubleWritable.java:[line 78]

    org.apache.hadoop.io.DoubleWritable$Comparator.compare(byte[], int,
    int, byte[], int, int) incorrectly handles double value. At
    DoubleWritable.java:[line 97]

    org.apache.hadoop.io.FloatWritable.compareTo(FloatWritable)
    incorrectly handles float value. At FloatWritable.java:[line 71]

    org.apache.hadoop.io.FloatWritable$Comparator.compare(byte[], int,
    int, byte[], int, int) incorrectly handles float value. At
    FloatWritable.java:[line 89]
    Possible null pointer dereference in
    org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter)
    due to return value of called method. At IOUtils.java:[line 389]

    Possible bad parsing of shift operation in
    org.apache.hadoop.io.file.tfile.Utils$Version.hashCode(). At
    Utils.java:[line 398]

    org.apache.hadoop.metrics2.lib.DefaultMetricsFactory.setInstance(MutableMetricsFactory)
    unconditionally sets the field mmfImpl. At
    DefaultMetricsFactory.java:[line 49]

    org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.setMiniClusterMode(boolean)
    unconditionally sets the field miniClusterMode. At
    DefaultMetricsSystem.java:[line 92]

    Useless object stored in variable seqOs of method
    org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.addOrUpdateToken(AbstractDelegationTokenIdentifier,
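Several of the findbugs items above are the same pattern: java.io.File
listing methods return null rather than an empty array on failure, and
the return value is dereferenced unchecked. A minimal illustration of
the warning and its fix (a sketch, not the hadoop code):

  import java.io.File;

  public class ListFilesDemo {
    // What findbugs flags: File.listFiles() returns null (not an empty
    // array) when the path is not a directory or an I/O error occurs,
    // so this can throw NullPointerException.
    static int countEntriesUnsafe(File dir) {
      return dir.listFiles().length;
    }

    // The fix: check the return value of the called method before use.
    static int countEntriesSafe(File dir) {
      File[] entries = dir.listFiles();
      return entries == null ? 0 : entries.length;
    }
  }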