[jira] [Created] (HADOOP-15487) ConcurrentModificationException resulting in Kerberos authentication error.
Wei-Chiu Chuang created HADOOP-15487:

Summary: ConcurrentModificationException resulting in Kerberos authentication error.
Key: HADOOP-15487
URL: https://issues.apache.org/jira/browse/HADOOP-15487
Project: Hadoop Common
Issue Type: Bug
Environment: CDH 5.13.3, Kerberized, Hadoop-HA, jdk1.8.0_152
Reporter: Wei-Chiu Chuang

We found the following exception message in a NameNode log. It seems the ConcurrentModificationException caused a Kerberos authentication error. It appears to be a JDK bug, similar to HADOOP-13433 (Race in UGI.reloginFromKeytab), but this version of Hadoop (CDH 5.13.3) already carries the HADOOP-13433 patch. (The stack trace also differs.) This cluster runs on JDK 1.8.0_152.

{noformat}
2018-05-19 04:00:00,182 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/no...@example.com (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2018-05-19 04:00:00,183 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client 10.16.20.122 threw exception [java.util.ConcurrentModificationException]
java.util.ConcurrentModificationException
	at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
	at java.util.LinkedList$ListItr.next(LinkedList.java:888)
	at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1070)
	at javax.security.auth.Subject$ClassSet$1.run(Subject.java:1401)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1399)
	at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
	at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
	at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:127)
	at sun.security.jgss.krb5.SubjectComber.findMany(SubjectComber.java:69)
	at sun.security.jgss.krb5.ServiceCreds.getInstance(ServiceCreds.java:96)
	at sun.security.jgss.krb5.Krb5Util.getServiceCreds(Krb5Util.java:203)
	at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:74)
	at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:72)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:71)
	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
	at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
	at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
	at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
	at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
	at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
	at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
	at org.apache.hadoop.security.SaslRpcServer$FastSaslServerFactory.createSaslServer(SaslRpcServer.java:398)
	at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:164)
	at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:161)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.security.SaslRpcServer.create(SaslRpcServer.java:160)
	at org.apache.hadoop.ipc.Server$Connection.createSaslServer(Server.java:1742)
	at org.apache.hadoop.ipc.Server$Connection.processSaslMessage(Server.java:1522)
	at org.apache.hadoop.ipc.Server$Connection.saslProcess(Server.java:1433)
	at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1396)
	at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2080)
	at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1920)
	at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1682)
	at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:896)
	at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:752)
	at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:723)
{noformat}

We saw a few GSSExceptions in the NN log, but only one threw the ConcurrentModificationException. This NN had a failover, caused by the ZKFC hitting a GSSException as well. We suspect it is a related issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
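The failure mode in the trace above is the JDK's fail-fast iterator check: `Subject`'s credential sets are backed by a `LinkedList`, and when one thread iterates the private credentials (here, SASL/GSS credential lookup) while another mutates them (e.g. a relogin), `LinkedList$ListItr.checkForComodification` throws. A minimal single-threaded sketch of the same check follows; the class name and credential strings are invented for illustration, and the in-loop `add` stands in for the concurrent mutation:

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class CmeDemo {
    /**
     * Returns true if structurally modifying a LinkedList while iterating it
     * trips the same fail-fast check seen in the stack trace
     * (LinkedList$ListItr.checkForComodification).
     */
    static boolean triggersCme() {
        List<String> creds = new LinkedList<>();
        creds.add("krbtgt-ticket");
        creds.add("service-ticket");
        try {
            for (Iterator<String> it = creds.iterator(); it.hasNext(); ) {
                it.next();
                // A concurrent relogin would add/remove credentials here;
                // modifying the list outside the iterator reproduces the check.
                creds.add("renewed-ticket");
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME triggered: " + triggersCme()); // prints "CME triggered: true"
    }
}
```
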
Maven package for hadoop-2.9.1
Hello! It seems that MvnRepository.com doesn't have the new 2.9.1 hadoop-client package. I'm trying to build Spark against those binaries, and it's failing to fetch them. Is there an ETA for when it'll get uploaded? Thanks! --randy
Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64
For more details, see https://builds.apache.org/job/hadoop-trunk-win/474/

[May 20, 2018 2:42:42 PM] (sekikn) YETUS-634 maven plugin dropping '--batch-mode' maven argument
[May 21, 2018 10:12:34 AM] (stevel) HADOOP-15478. WASB: hflush() and hsync() regression. Contributed by

-1 overall

The following subsystems voted -1:
   compile mvninstall pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc javac

The following subsystems are considered long running:
(runtime bigger than 1h 00m 00s)
   unit

Specific tests:

   Failed junit tests :

      hadoop.crypto.key.kms.server.TestKMS
      hadoop.cli.TestAclCLI
      hadoop.cli.TestAclCLIWithPosixAclInheritance
      hadoop.cli.TestCacheAdminCLI
      hadoop.cli.TestCryptoAdminCLI
      hadoop.cli.TestDeleteCLI
      hadoop.cli.TestErasureCodingCLI
      hadoop.cli.TestHDFSCLI
      hadoop.cli.TestXAttrCLI
      hadoop.fs.contract.hdfs.TestHDFSContractAppend
      hadoop.fs.contract.hdfs.TestHDFSContractConcat
      hadoop.fs.contract.hdfs.TestHDFSContractCreate
      hadoop.fs.contract.hdfs.TestHDFSContractDelete
      hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus
      hadoop.fs.contract.hdfs.TestHDFSContractMkdir
      hadoop.fs.contract.hdfs.TestHDFSContractOpen
      hadoop.fs.contract.hdfs.TestHDFSContractPathHandle
      hadoop.fs.contract.hdfs.TestHDFSContractRename
      hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory
      hadoop.fs.contract.hdfs.TestHDFSContractSeek
      hadoop.fs.contract.hdfs.TestHDFSContractSetTimes
      hadoop.fs.loadGenerator.TestLoadGenerator
      hadoop.fs.permission.TestStickyBit
      hadoop.fs.shell.TestHdfsTextCommand
      hadoop.fs.TestEnhancedByteBufferAccess
      hadoop.fs.TestFcHdfsCreateMkdir
      hadoop.fs.TestFcHdfsPermission
      hadoop.fs.TestFcHdfsSetUMask
      hadoop.fs.TestGlobPaths
      hadoop.fs.TestHDFSFileContextMainOperations
      hadoop.fs.TestHdfsNativeCodeLoader
      hadoop.fs.TestResolveHdfsSymlink
      hadoop.fs.TestSWebHdfsFileContextMainOperations
      hadoop.fs.TestSymlinkHdfsDisable
      hadoop.fs.TestSymlinkHdfsFileContext
      hadoop.fs.TestSymlinkHdfsFileSystem
      hadoop.fs.TestUnbuffer
      hadoop.fs.TestUrlStreamHandler
      hadoop.fs.TestWebHdfsFileContextMainOperations
      hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot
      hadoop.fs.viewfs.TestViewFileSystemHdfs
      hadoop.fs.viewfs.TestViewFileSystemLinkFallback
      hadoop.fs.viewfs.TestViewFileSystemLinkMergeSlash
      hadoop.fs.viewfs.TestViewFileSystemWithAcls
      hadoop.fs.viewfs.TestViewFileSystemWithTruncate
      hadoop.fs.viewfs.TestViewFileSystemWithXAttrs
      hadoop.fs.viewfs.TestViewFsAtHdfsRoot
      hadoop.fs.viewfs.TestViewFsDefaultValue
      hadoop.fs.viewfs.TestViewFsFileStatusHdfs
      hadoop.fs.viewfs.TestViewFsHdfs
      hadoop.fs.viewfs.TestViewFsWithAcls
      hadoop.fs.viewfs.TestViewFsWithXAttrs
      hadoop.hdfs.client.impl.TestBlockReaderLocal
      hadoop.hdfs.client.impl.TestBlockReaderLocalLegacy
      hadoop.hdfs.client.impl.TestBlockReaderRemote
      hadoop.hdfs.client.impl.TestClientBlockVerification
      hadoop.hdfs.crypto.TestHdfsCryptoStreams
      hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer
      hadoop.hdfs.qjournal.client.TestEpochsAreUnique
      hadoop.hdfs.qjournal.client.TestQJMWithFaults
      hadoop.hdfs.qjournal.client.TestQuorumJournalManager
      hadoop.hdfs.qjournal.server.TestJournal
      hadoop.hdfs.qjournal.server.TestJournalNode
      hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
      hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
      hadoop.hdfs.qjournal.server.TestJournalNodeSync
      hadoop.hdfs.qjournal.TestMiniJournalCluster
      hadoop.hdfs.qjournal.TestNNWithQJM
      hadoop.hdfs.qjournal.TestSecureNNWithQJM
      hadoop.hdfs.security.TestDelegationToken
      hadoop.hdfs.security.TestDelegationTokenForProxyUser
      hadoop.hdfs.security.token.block.TestBlockToken
      hadoop.hdfs.server.balancer.TestBalancer
      hadoop.hdfs.server.balancer.TestBalancerRPCDelay
      hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer
      hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
      hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
      hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
      hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
      hadoop.hdfs.server.blockmanagement.TestAvailableSpaceBlockPlacementPolicy
      hadoop.hdfs.server.blockmanagement.TestBlockManager
      hadoop.hdfs.server.blockmanagement.TestBlockReportRateLimiting
      hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
      hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
[jira] [Created] (HADOOP-15486) Make NetworkTopology#netLock fair
Nanda kumar created HADOOP-15486:

Summary: Make NetworkTopology#netLock fair
Key: HADOOP-15486
URL: https://issues.apache.org/jira/browse/HADOOP-15486
Project: Hadoop Common
Issue Type: Improvement
Components: net
Reporter: Nanda kumar
Assignee: Nanda kumar

Whenever a datanode is restarted, the registration call received by the NameNode after the restart lands in {{NetworkTopology#add}} via {{DatanodeManager#registerDatanode}} and requires the write lock on {{NetworkTopology#netLock}}. This registration thread is starved by a flood of {{FSNamesystem.getAdditionalDatanode}} calls, which are triggered by the clients that were writing to the restarted datanode. The registration call waiting for the write lock on {{NetworkTopology#netLock}} holds the write lock on {{FSNamesystem#fsLock}}, so every other RPC call that needs {{FSNamesystem#fsLock}} has to wait. We can make {{NetworkTopology#netLock}} fair so that the registration thread does not starve.

--
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
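The fix proposed above can be sketched with `java.util.concurrent.locks.ReentrantReadWriteLock`: passing `true` to the constructor requests a fair ordering policy, under which the longest-waiting thread (reader or writer) acquires the lock next, so a single writer cannot be starved indefinitely by a continuous stream of readers. This is a minimal illustration, not the actual NetworkTopology code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairNetLockDemo {
    // true = fair policy: waiting threads acquire in roughly arrival order,
    // so a queued writer (e.g. a datanode registration) eventually runs even
    // under a flood of read-lock requests (e.g. getAdditionalDatanode calls).
    static final ReentrantReadWriteLock netLock = new ReentrantReadWriteLock(true);

    public static void main(String[] args) {
        System.out.println("fair=" + netLock.isFair()); // prints "fair=true"
        netLock.writeLock().lock();
        try {
            // a registration-style topology update would happen here
        } finally {
            netLock.writeLock().unlock();
        }
    }
}
```

The usual trade-off is that fair locks have lower overall throughput than the default non-fair policy, which is why fairness tends to be reserved for locks where writer starvation is observed in practice, as in this issue.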
[jira] [Created] (HADOOP-15485) reduce/tune read failure fault injection on inconsistent client
Steve Loughran created HADOOP-15485:

Summary: reduce/tune read failure fault injection on inconsistent client
Key: HADOOP-15485
URL: https://issues.apache.org/jira/browse/HADOOP-15485
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3, test
Affects Versions: 3.1.0
Reporter: Steve Loughran

If you crank up the S3Guard directory inconsistency rate to stress-test the directory listings, the read failure rate can go high enough that read I/O fails. Maybe that read injection should only happen for the first few seconds after a stream is created, to better model delayed consistency, or at least be limited in the number of times it can surface in a stream. (This would imply some kind of stream-specific binding.) Otherwise: provide a way to set it explicitly, including disabling it.
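The idea suggested above, injecting read failures only within a short window after stream creation so the fault model resembles delayed consistency rather than steady-state read errors, could be sketched like this. This is a hypothetical illustration, not the actual hadoop-aws inconsistent-client code; the class, field, and parameter names are invented:

```java
import java.util.concurrent.ThreadLocalRandom;

/**
 * Hypothetical sketch of a read-fault injector that only fires inside a
 * window after stream creation, and can be disabled with a zero rate.
 */
public class WindowedFaultInjector {
    private final long createdAtMillis;
    private final long windowMillis;   // inject only within this window
    private final double failureRate;  // per-read failure probability in window

    public WindowedFaultInjector(long windowMillis, double failureRate) {
        this.createdAtMillis = System.currentTimeMillis();
        this.windowMillis = windowMillis;
        this.failureRate = failureRate;
    }

    /** Returns true if this read should be failed artificially. */
    public boolean shouldFail(long nowMillis) {
        if (nowMillis - createdAtMillis > windowMillis) {
            return false; // past the consistency window: never inject
        }
        return ThreadLocalRandom.current().nextDouble() < failureRate;
    }
}
```

A rate of 0.0 disables injection entirely, and a per-stream instance gives the stream-specific binding the report mentions.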
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/787/

No changes

-1 overall

The following subsystems voted -1:
   asflicense findbugs pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
   unit

Specific tests:

   FindBugs :

      module:hadoop-hdds/common
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CloseContainerRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 18039]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CloseContainerResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 18601]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CopyContainerRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 35184]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CopyContainerResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 36053]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CreateContainerResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 13089]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$DatanodeBlockID$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 1126]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$DeleteChunkResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 30491]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$DeleteContainerRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 15748]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$DeleteContainerResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 16224]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$DeleteKeyResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 23421]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$KeyValue$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 1767]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ListContainerRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 16726]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ListKeyRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 23958]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$PutKeyResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 21216]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$PutSmallFileResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 33434]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ReadContainerRequestProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 13529]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$UpdateContainerResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 15261]
      Useless control flow in org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$WriteChunkResponseProto$Builder.maybeForceBuilderInitialization() At ContainerProtos.java:[line 27550]
      Found reliance on default encoding in org.apache.hadoop.utils.MetadataKeyFilters$KeyPrefixFilter.filterKey(byte[], byte[], byte[]): String.getBytes() At MetadataKeyFilters.java:[line 97]

   Failed junit tests :

      hadoop.hdfs.web.TestWebHdfsTimeouts
      hadoop.yarn.client.api.impl.TestAMRMProxy