[jira] [Commented] (HADOOP-17042) Hadoop distcp throws "ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found"
[ https://issues.apache.org/jira/browse/HADOOP-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109887#comment-17109887 ]

Akira Ajisaka commented on HADOOP-17042:
----------------------------------------

I'm not sure we can remove the hadoop_add_to_classpath_tools function, because it might still be used by someone. Anyway, I'm +1 for your patch. Committing this.

> Hadoop distcp throws "ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found"
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17042
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17042
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.2.1, 3.1.3
>            Reporter: Aki Tanaka
>            Assignee: Aki Tanaka
>            Priority: Minor
>         Attachments: HADOOP-17042.patch
>
> On Hadoop 3.x, the following "ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found." message appears on the first line of the command output when running Hadoop DistCp.
> {code:java}
> $ hadoop distcp /path/to/src /user/hadoop/
> ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found.
> 2020-05-14 17:11:53,173 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true
> ..
> {code}
> This message was added by HADOOP-12857 and is expected behavior: DistCp calls 'hadoop_add_to_classpath_tools hadoop-distcp' when [it starts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh], and the error is reported because hadoop-distcp.sh does not exist in the tools directory.
> However, the error message is confusing. Since this is not a user-end configuration issue, it would be better to change the log level to debug (hadoop_debug).
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
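The fix discussed above amounts to demoting the "Tools helper … was not found" message from an error to a debug-level log. Below is a minimal, self-contained sketch of that idea; the function bodies are simplified stand-ins (the real helpers live in hadoop-functions.sh and do considerably more), but the HADOOP_SHELL_SCRIPT_DEBUG gating mirrors how hadoop_debug behaves:

```shell
#!/usr/bin/env bash
# Simplified stand-ins for the libexec helpers; only the logging behavior
# relevant to this issue is modeled here.
hadoop_debug() {
  # Emits only when shell-script debugging is switched on.
  if [ -n "${HADOOP_SHELL_SCRIPT_DEBUG:-}" ]; then
    echo "DEBUG: $*" 1>&2
  fi
}

hadoop_add_to_classpath_tools() {
  local module=$1
  local toolsfile="${HADOOP_LIBEXEC_DIR}/tools/${module}.sh"
  if [ -f "${toolsfile}" ]; then
    # shellcheck disable=SC1090
    . "${toolsfile}"
  else
    # The proposed change demotes this call from hadoop_error to
    # hadoop_debug, so a missing tools helper no longer pollutes normal
    # command output.
    hadoop_debug "Tools helper ${toolsfile} was not found."
  fi
}

HADOOP_LIBEXEC_DIR="/nonexistent/libexec"
hadoop_add_to_classpath_tools hadoop-distcp   # silent unless debugging is on
```

With HADOOP_SHELL_SCRIPT_DEBUG set, the message still appears on stderr, so the diagnostic introduced by HADOOP-12857 is not lost for people who actually need it.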
[jira] [Commented] (HADOOP-17042) Hadoop distcp throws "ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found"
[ https://issues.apache.org/jira/browse/HADOOP-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109780#comment-17109780 ]

Akira Ajisaka commented on HADOOP-17042:
----------------------------------------

bq. I think we cannot move libexec/shellprofile.d/hadoop-distcp.sh to libexec/tools/hadoop-distcp.sh.

You are right. If we do this, we cannot execute the "hadoop distcp" command.
[jira] [Updated] (HADOOP-17045) `Configuration` javadoc describes supporting environment variables, but the feature is not available
[ https://issues.apache.org/jira/browse/HADOOP-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17045:
--------------------------------------
    Fix Version/s: 2.9.3

> `Configuration` javadoc describes supporting environment variables, but the feature is not available
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17045
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17045
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.10.0
>            Reporter: Nick Dimiduk
>            Assignee: Masatake Iwasaki
>            Priority: Minor
>             Fix For: 2.9.3, 2.10.1
>
>      Attachments: HADOOP-17045-branch-2.10.001.patch
>
> In Hadoop 2.10.0, the javadoc on the `Configuration` class describes the ability to read values from environment variables. However, this feature wasn't implemented until HADOOP-9642, which shipped in 3.0.0.
[jira] [Commented] (HADOOP-17045) `Configuration` javadoc describes supporting environment variables, but the feature is not available
[ https://issues.apache.org/jira/browse/HADOOP-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109778#comment-17109778 ]

Masatake Iwasaki commented on HADOOP-17045:
-------------------------------------------

Sure. I cherry-picked this to branch-2.9.
[jira] [Commented] (HADOOP-17045) `Configuration` javadoc describes supporting environment variables, but the feature is not available
[ https://issues.apache.org/jira/browse/HADOOP-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109765#comment-17109765 ]

Akira Ajisaka commented on HADOOP-17045:
----------------------------------------

Hi [~iwasakims], would you backport this to branch-2.9?
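For context on the feature the 2.10 javadoc described: since HADOOP-9642 (3.0.0), a configuration value can reference environment variables with the ${env.VARIABLE} syntax. The sketch below is illustrative only, not Hadoop's Configuration code; expand_env_refs is a made-up helper that mimics the basic substitution so the behavior the javadoc promises is concrete:

```shell
#!/usr/bin/env bash
# Illustrative only: mimics the basic ${env.VAR} expansion that
# Configuration performs in Hadoop 3.x. expand_env_refs is a hypothetical
# helper, not a Hadoop API.
expand_env_refs() {
  local value=$1
  # Repeatedly substitute the first ${env.NAME} reference with the value
  # of the environment variable NAME (empty string if unset).
  while [[ "$value" =~ \$\{env\.([A-Za-z_][A-Za-z0-9_]*)\} ]]; do
    local var="${BASH_REMATCH[1]}"
    value=${value//\$\{env.${var}\}/${!var:-}}
  done
  printf '%s\n' "$value"
}

BASE_DIR=/srv/base
expand_env_refs '${env.BASE_DIR}/other'   # prints /srv/base/other
```

On branch-2 this expansion simply never happens, which is why the 2.10 javadoc was misleading until this patch removed the claim.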
[GitHub] [hadoop] liusheng commented on pull request #1983: YARN-9898. Workaround of Netty-all dependency aarch64 support
liusheng commented on pull request #1983:
URL: https://github.com/apache/hadoop/pull/1983#issuecomment-629909447

@johnou yes, we have done the 4.1.50 upgrade in Hadoop, thank you :)

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
  For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17045) `Configuration` javadoc describes supporting environment variables, but the feature is not available
[ https://issues.apache.org/jira/browse/HADOOP-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HADOOP-17045:
--------------------------------------
    Fix Version/s: 2.10.1
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Committed to branch-2.10. Thanks, [~aajisaka].
[GitHub] [hadoop] iwasakims closed pull request #2024: HADOOP-17045. `Configuration` javadoc describes supporting environmen…
iwasakims closed pull request #2024:
URL: https://github.com/apache/hadoop/pull/2024
[GitHub] [hadoop] hadoop-yetus commented on pull request #2025: HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same.
hadoop-yetus commented on pull request #2025:
URL: https://github.com/apache/hadoop/pull/2025#issuecomment-629905421

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 1m 14s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 0m 21s | Maven dependency ordering for branch |
| -1 :x: | mvninstall | 21m 57s | root in trunk failed. |
| +1 :green_heart: | compile | 18m 1s | trunk passed |
| +1 :green_heart: | checkstyle | 2m 51s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 44s | trunk passed |
| -1 :x: | shadedclient | 22m 22s | branch has errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 43s | trunk passed |
| +0 :ok: | spotbugs | 3m 8s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 10s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 59s | the patch passed |
| +1 :green_heart: | compile | 17m 21s | the patch passed |
| +1 :green_heart: | javac | 17m 21s | the patch passed |
| -0 :warning: | checkstyle | 2m 54s | root: The patch generated 1 new + 126 unchanged - 0 fixed = 127 total (was 126) |
| +1 :green_heart: | mvnsite | 2m 44s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 :x: | shadedclient | 16m 16s | patch has errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 43s | the patch passed |
| +1 :green_heart: | findbugs | 5m 32s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 27s | hadoop-common in the patch passed. |
| -1 :x: | unit | 116m 53s | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | asflicense | 0m 52s | The patch does not generate ASF License warnings. |
| | | 250m 25s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2025 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7936941687ab 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / a3809d20230 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/artifact/out/branch-mvninstall-root.txt |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/artifact/out/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/testReport/ |
| Max. process+thread count | 2884 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/2/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-17045) `Configuration` javadoc describes supporting environment variables, but the feature is not available
[ https://issues.apache.org/jira/browse/HADOOP-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109726#comment-17109726 ]

Akira Ajisaka commented on HADOOP-17045:
----------------------------------------

+1, thanks [~ndimiduk] and [~iwasakims].
[GitHub] [hadoop] leosunli commented on a change in pull request #1885: HDFS-13639. SlotReleaser is not fast enough
leosunli commented on a change in pull request #1885:
URL: https://github.com/apache/hadoop/pull/1885#discussion_r426278495

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
##########

@@ -909,4 +910,90 @@ public void testRequestFileDescriptorsWhenULimit() throws Exception {
       }
     }
   }
+
+  @Test
+  public void testDomainSocketClosedByDN() throws Exception {
+    BlockReaderTestUtil.enableShortCircuitShmTracing();
+    TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
+    Configuration conf =
+        createShortCircuitConf("testDomainSocketClosedByDN", sockDir);
+    MiniDFSCluster cluster =
+        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    cluster.waitActive();
+    DistributedFileSystem fs = cluster.getFileSystem();
+    final ShortCircuitCache cache =
+        fs.getClient().getClientContext().getShortCircuitCache();
+    DomainPeer peer = getDomainPeerToDn(conf);
+    MutableBoolean usedPeer = new MutableBoolean(false);
+    ExtendedBlockId blockId = new ExtendedBlockId(123, "xyz");
+    final DatanodeInfo datanode = new DatanodeInfo.DatanodeInfoBuilder()
+        .setNodeID(cluster.getDataNodes().get(0).getDatanodeId()).build();
+    // Allocating the first shm slot requires using up a peer.
+    Slot slot1 = cache.allocShmSlot(datanode, peer, usedPeer, blockId,
+        "testReleaseSlotReuseDomainSocket_client");
+
+    cluster.getDataNodes().get(0).getShortCircuitRegistry()
+        .registerSlot(blockId, slot1.getSlotId(), false);
+
+    Slot slot2 = cache.allocShmSlot(datanode, peer, usedPeer, blockId,
+        "testReleaseSlotReuseDomainSocket_client");
+
+    cluster.getDataNodes().get(0).getShortCircuitRegistry()
+        .registerSlot(blockId, slot2.getSlotId(), false);
+
+    cache.scheduleSlotReleaser(slot1);
+
+    // make the DataXceiver timedout
+    Thread.sleep(5000);
+    cache.scheduleSlotReleaser(slot2);
+    Thread.sleep(1);
+    Assert.assertTrue(cluster.getDataNodes().get(0).getShortCircuitRegistry()
+        .getShmNum() == 0);
+    Assert.assertTrue(cache.getDfsClientShmManager().getShmNum() == 0);
+    cluster.shutdown();
+  }
+
+  @Test
+  public void testDNRestart() throws Exception {
+    BlockReaderTestUtil.enableShortCircuitShmTracing();
+    TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
+    Configuration conf = createShortCircuitConf("testDNRestart", sockDir);
+    MiniDFSCluster cluster =

Review comment:
    The mini cluster is required. I use the mini cluster in a lot of UTs. Would you have a good way to replace it?

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
##########

@@ -909,4 +910,90 @@ [duplicate of the hunk quoted above, ending at the added "Thread.sleep(1);" line]

Review comment:
    done

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
##########

@@ -909,4 +910,90 @@ [duplicate of the hunk quoted above, ending at the added "@Test" line]

Review comment:
    done

##########
File path:
[GitHub] [hadoop] leosunli commented on a change in pull request #1885: HDFS-13639. SlotReleaser is not fast enough
leosunli commented on a change in pull request #1885:
URL: https://github.com/apache/hadoop/pull/1885#discussion_r426278294

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
##########

@@ -181,25 +182,49 @@ public long getRateInMs() {
     @Override
     public void run() {
+      if (slot == null) {
+        return;
+      }
       LOG.trace("{}: about to release {}", ShortCircuitCache.this, slot);
       final DfsClientShm shm = (DfsClientShm)slot.getShm();
       final DomainSocket shmSock = shm.getPeer().getDomainSocket();
       final String path = shmSock.getPath();
+      DataOutputStream out = null;
       boolean success = false;
-      try (DomainSocket sock = DomainSocket.connect(path);
-          DataOutputStream out = new DataOutputStream(
-              new BufferedOutputStream(sock.getOutputStream()))) {
-        new Sender(out).releaseShortCircuitFds(slot.getSlotId());
-        DataInputStream in = new DataInputStream(sock.getInputStream());
-        ReleaseShortCircuitAccessResponseProto resp =
-            ReleaseShortCircuitAccessResponseProto.parseFrom(
-                PBHelperClient.vintPrefixed(in));
-        if (resp.getStatus() != Status.SUCCESS) {
-          String error = resp.hasError() ? resp.getError() : "(unknown)";
-          throw new IOException(resp.getStatus().toString() + ": " + error);
+      int retries = 2;
+      try {
+        while (retries > 0) {
+          try {
+            if (domainSocket == null || !domainSocket.isOpen()) {
+              // we are running in single thread mode, no protection needed for

Review comment:
    Yeah.
    ShortCircuitCache#unref -> ShortCircuitReplica#close -> cache.scheduleSlotReleaser(slot) -> releaserExecutor.execute(new SlotReleaser(slot)) -> SlotReleaser#run {
        ...
        if (domainSocket == null || !domainSocket.isOpen()) {
          // we are running in single thread mode, no protection needed for
          // domainSocket
          domainSocket = DomainSocket.connect(path);
        }
        ...
    }
    Since ShortCircuitCache#unref runs under a lock, this code runs serially in a thread of a thread pool.
[GitHub] [hadoop] leosunli commented on a change in pull request #1885: HDFS-13639. SlotReleaser is not fast enough
leosunli commented on a change in pull request #1885:
URL: https://github.com/apache/hadoop/pull/1885#discussion_r426278294

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
##########

@@ -181,25 +182,49 @@ [duplicate of the SlotReleaser#run hunk quoted above]

Review comment:
    ShortCircuitCache#unref -> ShortCircuitReplica#close -> cache.scheduleSlotReleaser(slot) -> releaserExecutor.execute(new SlotReleaser(slot)) -> SlotReleaser#run {
        ...
        if (domainSocket == null || !domainSocket.isOpen()) {
          // we are running in single thread mode, no protection needed for
          // domainSocket
          domainSocket = DomainSocket.connect(path);
        }
        ...
    }
    Since ShortCircuitCache#unref runs under a lock, this code runs serially in a thread of a thread pool.

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
##########

@@ -181,25 +182,49 @@ [duplicate of the same hunk, quoted up to the added "while (retries > 0) {" line]

Review comment:
    done
[GitHub] [hadoop] leosunli commented on a change in pull request #1885: HDFS-13639. SlotReleaser is not fast enough
leosunli commented on a change in pull request #1885:
URL: https://github.com/apache/hadoop/pull/1885#discussion_r426278349

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
##########

@@ -909,4 +910,90 @@ [testDomainSocketClosedByDN as quoted above, followed by the rest of testDNRestart:]
+  @Test
+  public void testDNRestart() throws Exception {
+    BlockReaderTestUtil.enableShortCircuitShmTracing();
+    TemporarySocketDirectory sockDir = new TemporarySocketDirectory();
+    Configuration conf = createShortCircuitConf("testDNRestart", sockDir);
+    MiniDFSCluster cluster =
+        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+    cluster.waitActive();
+    DistributedFileSystem fs = cluster.getFileSystem();
+    final ShortCircuitCache cache =
+        fs.getClient().getClientContext().getShortCircuitCache();
+    DomainPeer peer = getDomainPeerToDn(conf);
+    MutableBoolean usedPeer = new MutableBoolean(false);
+    ExtendedBlockId blockId = new ExtendedBlockId(123, "xyz");
+    final DatanodeInfo datanode = new DatanodeInfo.DatanodeInfoBuilder()
+        .setNodeID(cluster.getDataNodes().get(0).getDatanodeId()).build();
+    // Allocating the first shm slot requires using up a peer.
+    Slot slot1 = cache.allocShmSlot(datanode, peer, usedPeer, blockId,
+        "testReleaseSlotReuseDomainSocket_client");
+
+    cluster.getDataNodes().get(0).getShortCircuitRegistry()
+        .registerSlot(blockId, slot1.getSlotId(), false);
+
+    // restart the datanode to invalidate the cache
+    cluster.restartDataNode(0);
+    Thread.sleep(1000);
+    // after the restart, new allocation and release should not be affect
+    cache.scheduleSlotReleaser(slot1);
+
+    Slot slot2 = null;
+    try {
+      slot2 = cache.allocShmSlot(datanode, peer, usedPeer, blockId,
+          "testReleaseSlotReuseDomainSocket_client");
+    } catch (ClosedChannelException ce) {
+
+    }
+    cache.scheduleSlotReleaser(slot2);
+    Thread.sleep(2000);
+    Assert.assertTrue(cluster.getDataNodes().get(0).getShortCircuitRegistry()
+        .getShmNum() == 0);
+    Assert.assertTrue(cache.getDfsClientShmManager().getShmNum() == 0);

Review comment:
    done

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java
##########

@@ -909,4 +910,90 @@ [duplicate quote of testDomainSocketClosedByDN, truncated in the archive at "DomainPeer peer ="]
[GitHub] [hadoop] leosunli commented on a change in pull request #1885: HDFS-13639. SlotReleaser is not fast enough
leosunli commented on a change in pull request #1885:
URL: https://github.com/apache/hadoop/pull/1885#discussion_r426277924

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
##########

@@ -181,25 +182,49 @@ [duplicate of the SlotReleaser#run hunk quoted above, ending at the "single thread mode" comment]

Review comment:
    Yeah, the SlotReleaser runs in a single thread, as the following code shows.
    ShortCircuitCache#unref -> ShortCircuitReplica#close -> cache.scheduleSlotReleaser(slot) -> releaserExecutor.execute(new SlotReleaser(slot)) -> SlotReleaser#run {
        ...
        try {
          if (domainSocket == null || !domainSocket.isOpen()) {
            // we are running in single thread mode, no protection needed for
            // domainSocket
            domainSocket = DomainSocket.connect(path);
          }
        ...
    }
    Since ShortCircuitCache#unref runs under a lock, this code runs serially in a core thread of a thread pool.
[GitHub] [hadoop] hadoop-yetus commented on pull request #2025: HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same.
hadoop-yetus commented on pull request #2025:
URL: https://github.com/apache/hadoop/pull/2025#issuecomment-629779612

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 1m 39s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 1m 17s | Maven dependency ordering for branch |
| -1 :x: | mvninstall | 23m 11s | root in trunk failed. |
| +1 :green_heart: | compile | 22m 4s | trunk passed |
| +1 :green_heart: | checkstyle | 3m 0s | trunk passed |
| +1 :green_heart: | mvnsite | 3m 11s | trunk passed |
| -1 :x: | shadedclient | 23m 9s | branch has errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 49s | trunk passed |
| +0 :ok: | spotbugs | 3m 16s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 5m 36s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 11s | the patch passed |
| +1 :green_heart: | compile | 18m 15s | the patch passed |
| +1 :green_heart: | javac | 18m 15s | the patch passed |
| -0 :warning: | checkstyle | 2m 53s | root: The patch generated 3 new + 126 unchanged - 0 fixed = 129 total (was 126) |
| +1 :green_heart: | mvnsite | 2m 50s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 :x: | shadedclient | 16m 14s | patch has errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 46s | the patch passed |
| +1 :green_heart: | findbugs | 5m 36s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 9m 31s | hadoop-common in the patch passed. |
| -1 :x: | unit | 113m 1s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 54s | The patch does not generate ASF License warnings. |
| | | 256m 8s | |

| Reason | Tests |
|------:|:------|
| Failed junit tests | hadoop.fs.viewfs.TestViewFsTrash |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2025 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 63798a6f64d4 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 6e416a83d1e |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/artifact/out/branch-mvninstall-root.txt |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/artifact/out/diff-checkstyle-root.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/testReport/ |
| Max. process+thread count | 3064 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-2025/1/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] umamaheswararao opened a new pull request #2025: HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same.
umamaheswararao opened a new pull request #2025:
URL: https://github.com/apache/hadoop/pull/2025

[HDFS-15322. Make NflyFS to work when ViewFsOverloadScheme's scheme and target uris schemes are same.](https://issues.apache.org/jira/browse/HDFS-15322)