[jira] [Created] (HADOOP-14266) S3Guard: S3AFileSystem::listFiles() to employ MetadataStore
Mingliang Liu created HADOOP-14266:
-----------------------------------

             Summary: S3Guard: S3AFileSystem::listFiles() to employ MetadataStore
                 Key: HADOOP-14266
                 URL: https://issues.apache.org/jira/browse/HADOOP-14266
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: HADOOP-13345
            Reporter: Mingliang Liu

Similar to [HADOOP-13926], this is to track the effort of employing MetadataStore in {{S3AFileSystem::listFiles()}}.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-14198) Should have a way to let PingInputStream to abort
     [ https://issues.apache.org/jira/browse/HADOOP-14198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang resolved HADOOP-14198.
------------------------------------
    Resolution: Duplicate

> Should have a way to let PingInputStream to abort
> -------------------------------------------------
>
>                 Key: HADOOP-14198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14198
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>
> We observed a case where an RPC call got stuck, since PingInputStream does the following:
> {code}
> /** This class sends a ping to the remote side when timeout on
>  * reading. If no failure is detected, it retries until at least
>  * a byte is read.
>  */
> private class PingInputStream extends FilterInputStream {
> {code}
> It seems that in this case no data is ever received, and it keeps pinging. Should we ping forever here? Maybe we should introduce a config to stop the ping after a certain number of attempts, report back a timeout, and let the caller retry the RPC.
> Wonder if there is a chance the RPC got dropped somehow by the server, so no response is ever received.
> See
> {code}
> Thread 16127: (state = BLOCKED)
>  - sun.nio.ch.SocketChannelImpl.readerCleanup() @bci=6, line=279 (Compiled frame)
>  - sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer) @bci=205, line=390 (Compiled frame)
>  - org.apache.hadoop.net.SocketInputStream$Reader.performIO(java.nio.ByteBuffer) @bci=5, line=57 (Compiled frame)
>  - org.apache.hadoop.net.SocketIOWithTimeout.doIO(java.nio.ByteBuffer, int) @bci=35, line=142 (Compiled frame)
>  - org.apache.hadoop.net.SocketInputStream.read(java.nio.ByteBuffer) @bci=6, line=161 (Compiled frame)
>  - org.apache.hadoop.net.SocketInputStream.read(byte[], int, int) @bci=7, line=131 (Compiled frame)
>  - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
>  - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
>  - org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int) @bci=4, line=521 (Compiled frame)
>  - java.io.BufferedInputStream.fill() @bci=214, line=246 (Compiled frame)
>  - java.io.BufferedInputStream.read() @bci=12, line=265 (Compiled frame)
>  - java.io.DataInputStream.readInt() @bci=4, line=387 (Compiled frame)
>  - org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse() @bci=19, line=1081 (Compiled frame)
>  - org.apache.hadoop.ipc.Client$Connection.run() @bci=62, line=976 (Compiled frame)
> {code}
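The bounded-ping idea floated in the report could look like the following sketch. This is not Hadoop's actual `Client$Connection$PingInputStream`; the class name `BoundedPingInputStream`, the `maxPingRetries` parameter, and the no-op `sendPing()` hook are all illustrative.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

/**
 * Sketch of a ping-on-timeout input stream that gives up after a
 * configurable number of ping attempts instead of retrying forever,
 * surfacing the SocketTimeoutException to the caller so it can retry
 * the RPC.
 */
class BoundedPingInputStream extends FilterInputStream {
  private final int maxPingRetries;

  BoundedPingInputStream(InputStream in, int maxPingRetries) {
    super(in);
    this.maxPingRetries = maxPingRetries;
  }

  /** Called on each read timeout; a real client would send an RPC ping here. */
  protected void sendPing() throws IOException {
    // no-op in this sketch
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int attempts = 0;
    while (true) {
      try {
        return in.read(buf, off, len);
      } catch (SocketTimeoutException e) {
        if (++attempts > maxPingRetries) {
          throw e;  // report the timeout instead of pinging forever
        }
        sendPing();
      }
    }
  }
}
```

The retry budget would presumably come from a new config key, with the current infinite-ping behaviour as the default to preserve compatibility.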
[jira] [Created] (HADOOP-14265) AuthenticatedURL swallows the exception received from server.
Rushabh S Shah created HADOOP-14265:
------------------------------------

             Summary: AuthenticatedURL swallows the exception received from server.
                 Key: HADOOP-14265
                 URL: https://issues.apache.org/jira/browse/HADOOP-14265
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Rushabh S Shah
            Assignee: Rushabh S Shah

While debugging an issue with the KMS server, we found that AuthenticatedURL swallows the original exception from the server and constructs an {{AuthenticationException}} with only the response code and response message. Below is the stack trace, which didn't help in figuring out why the getDelegationTokens call failed. The lack of info logs on the KMS server side made the debugging even harder.

{noformat}
2017-03-23 16:32:10,364 ERROR [HiveServer2-Background-Pool: Thread-17795] [] exec.Task (TezTask.java:execute(197)) - Failed to execute tez graph.
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1042)
	at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:110)
	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2444)
	at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:107)
	at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(TokenCache.java:86)
	at org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(TokenCache.java:76)
	at org.apache.tez.client.TezClientUtils.addLocalResources(TezClientUtils.java:301)
	at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:180)
	at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:911)
	...
Caused by: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1954)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1023)
	... 31 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, status: 403, message: Forbidden
	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:275)
	at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:132)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:214)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:132)
	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:298)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:170)
	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:377)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$5.run(KMSClientProvider.java:1028)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$5.run(KMSClientProvider.java:1023)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1936)
	... 32 more
{noformat}

Following is the relevant chunk of code from branch-2.8; the code in trunk hasn't changed much.

{code:title=AuthenticatedURL.java|borderStyle=solid}
  public static void extractToken(HttpURLConnection conn, Token token)
      throws IOException, AuthenticationException {
    int respCode = conn.getResponseCode();
    if (/* respCode is an expected success code; branch elided in report */) {
      // ... extract the token ...
    } else {
      token.set(null);
      throw new AuthenticationException("Authentication failed, status: "
          + conn.getResponseCode()
          + ", message: " + conn.getResponseMessage());
    }
  }
{code}
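One way to stop swallowing the server's explanation is to read the connection's error body and append it to the exception message. The helper below is a sketch of that idea only; `buildAuthError` and its parameters are illustrative, not the actual AuthenticatedURL API, and in real code the `InputStream` would come from `HttpURLConnection#getErrorStream()` (which may be null).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class AuthErrorMessages {
  private AuthErrorMessages() {}

  /**
   * Build an authentication-failure message that preserves whatever the
   * server wrote to the error stream, instead of only the status line.
   */
  static String buildAuthError(int status, String message, InputStream errorStream)
      throws IOException {
    StringBuilder sb = new StringBuilder("Authentication failed, status: ")
        .append(status).append(", message: ").append(message);
    if (errorStream != null) {
      BufferedReader reader = new BufferedReader(
          new InputStreamReader(errorStream, StandardCharsets.UTF_8));
      StringBuilder body = new StringBuilder();
      String line;
      while ((line = reader.readLine()) != null) {
        body.append(line).append('\n');
      }
      if (body.length() > 0) {
        // The server-side detail that the current code drops on the floor.
        sb.append(", server response: ").append(body.toString().trim());
      }
    }
    return sb.toString();
  }
}
```

With something like this, the "status: 403, message: Forbidden" trace above would also carry the server's own description of why the request was rejected.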
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/

[Mar 30, 2017 9:11:50 AM] (aajisaka) HADOOP-14256. [S3A DOC] Correct the format for "Seoul" example.
[Mar 30, 2017 3:57:19 PM] (jlowe) MAPREDUCE-6850. Shuffle Handler keep-alive connections are closed from
[Mar 30, 2017 5:17:11 PM] (cdouglas) HADOOP-14250. Correct spelling of 'separate' and variants. Contributed
[Mar 30, 2017 5:55:32 PM] (jlowe) MAPREDUCE-6836. exception thrown when accessing the job configuration
[Mar 30, 2017 6:16:05 PM] (weichiu) HDFS-10974. Document replication factor for EC files. Contributed by
[Mar 30, 2017 7:14:43 PM] (jeagles) HADOOP-14216. Addendum to Improve Configuration XML Parsing Performance
[Mar 30, 2017 8:32:57 PM] (varunsaxena) YARN-6342. Make TimelineV2Client's drain timeout after stop configurable
[Mar 30, 2017 8:47:20 PM] (varunsaxena) YARN-6376. Exceptions caused by synchronous putEntities requests can be
[Mar 30, 2017 10:44:21 PM] (liuml07) HDFS-11592. Closing a file has a wasteful preconditions in NameNode.
[Mar 31, 2017 12:01:15 AM] (yzhang) HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by
[Mar 31, 2017 12:38:18 AM] (yzhang) Revert "HADOOP-11794. Enable distcp to copy blocks in parallel.
[Mar 31, 2017 12:38:56 AM] (yzhang) HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by
[Mar 31, 2017 5:41:26 AM] (arp) HDFS-11551. Handle SlowDiskReport from DataNode at the NameNode.
-1 overall

The following subsystems voted -1:
    asflicense unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

Failed junit tests:
    hadoop.fs.sftp.TestSFTPFileSystem
    hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130
    hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
    hadoop.hdfs.server.balancer.TestBalancer
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
    hadoop.hdfs.TestReadStripedFileWithMissingBlocks
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180
    hadoop.hdfs.server.datanode.TestDataNodeUUID
    hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
    hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
    hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
    hadoop.yarn.server.TestContainerManagerSecurity
    hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
    hadoop.yarn.client.api.impl.TestAMRMClient
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
    hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
    hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
    hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
    hadoop.mapred.TestMRTimelineEventHandling
    hadoop.mapreduce.TestMRJobClient
    hadoop.tools.TestDistCpSystem

Timed out junit tests:
    org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStorePerf

cc:         https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-compile-cc-root.txt  [4.0K]
javac:      https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-compile-javac-root.txt  [184K]
checkstyle: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-checkstyle-root.txt  [17M]
pylint:     https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-patch-pylint.txt  [20K]
shellcheck: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-patch-shellcheck.txt  [24K]
shelldocs:  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-patch-shelldocs.txt  [12K]
whitespace: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/whitespace-eol.txt  [12M]
            https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/whitespace-tabs.txt  [1.2M]
javadoc:    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/diff-javadoc-javadoc-root.txt  [2.2M]
unit:       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/362/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [144K]
Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/

[Mar 30, 2017 3:57:19 PM] (jlowe) MAPREDUCE-6850. Shuffle Handler keep-alive connections are closed from
[Mar 30, 2017 5:17:11 PM] (cdouglas) HADOOP-14250. Correct spelling of 'separate' and variants. Contributed
[Mar 30, 2017 5:55:32 PM] (jlowe) MAPREDUCE-6836. exception thrown when accessing the job configuration
[Mar 30, 2017 6:16:05 PM] (weichiu) HDFS-10974. Document replication factor for EC files. Contributed by
[Mar 30, 2017 7:14:43 PM] (jeagles) HADOOP-14216. Addendum to Improve Configuration XML Parsing Performance
[Mar 30, 2017 8:32:57 PM] (varunsaxena) YARN-6342. Make TimelineV2Client's drain timeout after stop configurable
[Mar 30, 2017 8:47:20 PM] (varunsaxena) YARN-6376. Exceptions caused by synchronous putEntities requests can be
[Mar 30, 2017 10:44:21 PM] (liuml07) HDFS-11592. Closing a file has a wasteful preconditions in NameNode.
[Mar 31, 2017 12:01:15 AM] (yzhang) HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by
[Mar 31, 2017 12:38:18 AM] (yzhang) Revert "HADOOP-11794. Enable distcp to copy blocks in parallel.
[Mar 31, 2017 12:38:56 AM] (yzhang) HADOOP-11794. Enable distcp to copy blocks in parallel. Contributed by
[Mar 31, 2017 5:41:26 AM] (arp) HDFS-11551. Handle SlowDiskReport from DataNode at the NameNode.
-1 overall

The following subsystems voted -1:
    compile unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc javac

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

Failed junit tests:
    hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
    hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
    hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
    hadoop.hdfs.server.mover.TestStorageMover
    hadoop.hdfs.web.TestWebHdfsTimeouts
    hadoop.hdfs.server.blockmanagement.TestSlowDiskTracker
    hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
    hadoop.yarn.server.timeline.TestRollingLevelDB
    hadoop.yarn.server.timeline.TestTimelineDataManager
    hadoop.yarn.server.timeline.TestLeveldbTimelineStore
    hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore
    hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore
    hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
    hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
    hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore
    hadoop.yarn.server.resourcemanager.TestRMRestart
    hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
    hadoop.yarn.server.TestContainerManagerSecurity
    hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore
    hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient
    hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore
    hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
    hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
    hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
    hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
    hadoop.yarn.applications.distributedshell.TestDistributedShell
    hadoop.mapred.TestShuffleHandler
    hadoop.mapreduce.v2.app.TestRuntimeEstimators
    hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService

Timed out junit tests:
    org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache

compile: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-compile-root.txt  [140K]
cc:      https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-compile-root.txt  [140K]
javac:   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-compile-root.txt  [140K]
unit:    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [488K]
         https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt  [16K]
         https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/274/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt  [52K]
[jira] [Created] (HADOOP-14264) Add contract-test-options.xml to .gitignore
Akira Ajisaka created HADOOP-14264:
-----------------------------------

             Summary: Add contract-test-options.xml to .gitignore
                 Key: HADOOP-14264
                 URL: https://issues.apache.org/jira/browse/HADOOP-14264
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Akira Ajisaka
            Priority: Minor

contract-test-options.xml is used for FileSystem contract tests and is created by developers. The file should be ignored, as auth-keys.xml and azure-auth-keys.xml already are.
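The proposed change amounts to one extra line alongside the entries the report mentions; a sketch of the relevant `.gitignore` fragment (exact placement within Hadoop's `.gitignore` may differ):

```
# Per-developer test configuration, never committed
auth-keys.xml
azure-auth-keys.xml
contract-test-options.xml
```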
[jira] [Created] (HADOOP-14263) TestS3GuardTool hangs/fails when offline: it's an IT test
Steve Loughran created HADOOP-14263:
------------------------------------

             Summary: TestS3GuardTool hangs/fails when offline: it's an IT test
                 Key: HADOOP-14263
                 URL: https://issues.apache.org/jira/browse/HADOOP-14263
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3, test
    Affects Versions: HADOOP-13345
            Reporter: Steve Loughran

If you try to run the AWS tests while offline, {{TestS3GuardTool}} hangs for some minutes before eventually failing; a superclass is trying to create an FS instance. Even if the DB is local, a remote object store is expected to exist. If none is defined, the test is skipped; but if one is declared, then it must be reachable.

Proposed: rename {{TestS3GuardTool}} to {{ITestS3GuardTool}}.
[jira] [Created] (HADOOP-14262) rpcTimeOut is not set up correctly in Client thus client doesn't time out
Yongjun Zhang created HADOOP-14262:
-----------------------------------

             Summary: rpcTimeOut is not set up correctly in Client thus client doesn't time out
                 Key: HADOOP-14262
                 URL: https://issues.apache.org/jira/browse/HADOOP-14262
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Yongjun Zhang
            Assignee: Yongjun Zhang

NameNodeProxies.createNNProxyWithClientProtocol does

{code}
    ClientNamenodeProtocolPB proxy = RPC.getProtocolProxy(
        ClientNamenodeProtocolPB.class, version, address, ugi, conf,
        NetUtils.getDefaultSocketFactory(conf),
        org.apache.hadoop.ipc.Client.getTimeout(conf), defaultPolicy,
        fallbackToSimpleAuth).getProxy();
{code}

which calls Client.getTimeout(conf) to get the timeout value. Client.getTimeout(conf) doesn't consider IPC_CLIENT_RPC_TIMEOUT_KEY right now, so rpcTimeOut doesn't take effect for the relevant RPC calls, and they hang. For example, receiveRpcResponse blocked forever at:

{code}
Thread 16127: (state = BLOCKED)
 - sun.nio.ch.SocketChannelImpl.readerCleanup() @bci=6, line=279 (Compiled frame)
 - sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer) @bci=205, line=390 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream$Reader.performIO(java.nio.ByteBuffer) @bci=5, line=57 (Compiled frame)
 - org.apache.hadoop.net.SocketIOWithTimeout.doIO(java.nio.ByteBuffer, int) @bci=35, line=142 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream.read(java.nio.ByteBuffer) @bci=6, line=161 (Compiled frame)
 - org.apache.hadoop.net.SocketInputStream.read(byte[], int, int) @bci=7, line=131 (Compiled frame)
 - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
 - java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=133 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int) @bci=4, line=521 (Compiled frame)
 - java.io.BufferedInputStream.fill() @bci=214, line=246 (Compiled frame)
 - java.io.BufferedInputStream.read() @bci=12, line=265 (Compiled frame)
 - java.io.DataInputStream.readInt() @bci=4, line=387 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse() @bci=19, line=1081 (Compiled frame)
 - org.apache.hadoop.ipc.Client$Connection.run() @bci=62, line=976 (Compiled frame)
{code}

Filing this jira to fix it.
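The intended precedence (an explicit RPC timeout wins; otherwise the legacy ping-based behaviour applies) can be sketched as a pure helper. This is illustrative only: the `Map` stands in for Hadoop's `Configuration`, `RpcTimeouts.getTimeout` is a hypothetical name, and the key strings follow the documented config names (`ipc.client.rpc-timeout.ms`, `ipc.client.ping`, `ipc.ping.interval`).

```java
import java.util.Map;

final class RpcTimeouts {
  private RpcTimeouts() {}

  /**
   * Sketch of the precedence Client.getTimeout(conf) should apply:
   * a positive ipc.client.rpc-timeout.ms wins; otherwise fall back to
   * the legacy behaviour, where -1 means "no timeout, keep pinging".
   */
  static int getTimeout(Map<String, Object> conf) {
    int rpcTimeout = (int) conf.getOrDefault("ipc.client.rpc-timeout.ms", 0);
    if (rpcTimeout > 0) {
      return rpcTimeout;  // IPC_CLIENT_RPC_TIMEOUT_KEY now takes effect
    }
    boolean pingEnabled = (boolean) conf.getOrDefault("ipc.client.ping", true);
    if (!pingEnabled) {
      // With pings disabled, the ping interval doubles as the read timeout.
      return (int) conf.getOrDefault("ipc.ping.interval", 60000);
    }
    return -1;  // legacy: block indefinitely, relying on pings
  }
}
```

Consulting the RPC timeout key first is exactly what the description says is missing: today the ping branch always wins, so `receiveRpcResponse` can block forever as shown in the trace above.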