[jira] [Work logged] (HDFS-16361) Fix log format for QueryCommand
[ https://issues.apache.org/jira/browse/HDFS-16361?focusedWorklogId=688474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688474 ] ASF GitHub Bot logged work on HDFS-16361: - Author: ASF GitHub Bot Created on: 01/Dec/21 07:54 Start Date: 01/Dec/21 07:54 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3732: URL: https://github.com/apache/hadoop/pull/3732#issuecomment-983379258 Thanks @ayushtkn for your review and merge. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688474) Time Spent: 2h 10m (was: 2h) > Fix log format for QueryCommand > --- > > Key: HDFS-16361 > URL: https://issues.apache.org/jira/browse/HDFS-16361 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: tomscut > Assignee: tomscut > Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Fix log format for QueryCommand of disk balancer. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
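The Jira entry does not show the exact change, but "fix log format" issues of this kind typically mean a format string whose placeholders do not match the supplied arguments. A minimal illustration of that failure mode and its fix (hypothetical field names and values, not the actual QueryCommand output):

```java
public class LogFormatSketch {
    public static void main(String[] args) {
        String planFile = "/system/diskbalancer/plan.json";  // hypothetical values
        String planId = "6babc0a2";

        // Broken: two %s placeholders but only one argument, so String.format
        // throws MissingFormatArgumentException at runtime.
        try {
            System.out.println(String.format("Plan File: %s Plan ID: %s", planFile));
        } catch (java.util.MissingFormatArgumentException e) {
            System.out.println("format mismatch: " + e.getMessage());
        }

        // Fixed: one argument per placeholder.
        System.out.println(String.format("Plan File: %s Plan ID: %s", planFile, planId));
    }
}
```

The same mismatch is silent in the other direction (extra arguments are ignored), which is why these bugs tend to survive until someone reads the emitted log lines.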
[jira] [Resolved] (HDFS-16361) Fix log format for QueryCommand
[ https://issues.apache.org/jira/browse/HDFS-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16361. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Committed to trunk. Thanks [~tomscut] for the contribution!
[jira] [Work logged] (HDFS-16361) Fix log format for QueryCommand
[ https://issues.apache.org/jira/browse/HDFS-16361?focusedWorklogId=688464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688464 ] ASF GitHub Bot logged work on HDFS-16361: - Author: ASF GitHub Bot Created on: 01/Dec/21 07:19 Start Date: 01/Dec/21 07:19 Worklog Time Spent: 10m Work Description: ayushtkn commented on pull request #3732: URL: https://github.com/apache/hadoop/pull/3732#issuecomment-983359386 Merged. Thanx Issue Time Tracking --- Worklog Id: (was: 688464) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (HDFS-16359) RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying
[ https://issues.apache.org/jira/browse/HDFS-16359?focusedWorklogId=688461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688461 ] ASF GitHub Bot logged work on HDFS-16359: - Author: ASF GitHub Bot Created on: 01/Dec/21 07:18 Start Date: 01/Dec/21 07:18 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3731: URL: https://github.com/apache/hadoop/pull/3731#issuecomment-983358171 Thank you @goiri very much for your review again. Issue Time Tracking --- Worklog Id: (was: 688461) Time Spent: 3h (was: 2h 50m) > RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying > --- > > Key: HDFS-16359 > URL: https://issues.apache.org/jira/browse/HDFS-16359 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: tomscut > Assignee: tomscut > Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > RouterRpcServer#invokeAtAvailableNs does not take effect when retrying. See > HDFS-15543. > The original code of RouterRpcServer#getNameSpaceInfo looks like this: > {code:java} > private Set<FederationNamespaceInfo> getNameSpaceInfo(String nsId) { > Set<FederationNamespaceInfo> namespaceInfos = new HashSet<>(); > for (FederationNamespaceInfo ns : namespaceInfos) { > if (!nsId.equals(ns.getNameserviceId())) { > namespaceInfos.add(ns); > } > } > return namespaceInfos; > } {code} > And _namespaceInfos_ is always empty here, because the loop iterates over the set it has just created rather than over the full namespace list.
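The quoted method can never return anything: it filters the empty set it just allocated. A minimal sketch of the presumably intended logic (stand-in FederationNamespaceInfo type; for self-containment the full namespace list is passed in as a parameter, whereas the real RouterRpcServer would obtain it from the router's subcluster resolver):

```java
import java.util.HashSet;
import java.util.Set;

// Minimal stand-in for o.a.h.hdfs.server.federation.resolver.FederationNamespaceInfo.
class FederationNamespaceInfo {
    private final String nameserviceId;

    FederationNamespaceInfo(String nameserviceId) {
        this.nameserviceId = nameserviceId;
    }

    String getNameserviceId() {
        return nameserviceId;
    }
}

public class GetNameSpaceInfoSketch {
    // Intended behavior: return every known namespace EXCEPT the one that just
    // failed (nsId), so a retry can target a different nameservice.
    static Set<FederationNamespaceInfo> getNameSpaceInfo(
            Set<FederationNamespaceInfo> knownNamespaces, String nsId) {
        Set<FederationNamespaceInfo> namespaceInfos = new HashSet<>();
        // Iterate over ALL known namespaces, not over the (empty) result set.
        for (FederationNamespaceInfo ns : knownNamespaces) {
            if (!nsId.equals(ns.getNameserviceId())) {
                namespaceInfos.add(ns);
            }
        }
        return namespaceInfos;
    }

    public static void main(String[] args) {
        Set<FederationNamespaceInfo> known = new HashSet<>();
        known.add(new FederationNamespaceInfo("ns0"));
        known.add(new FederationNamespaceInfo("ns1"));
        // ns0 failed; only ns1 should remain as a retry target.
        System.out.println(getNameSpaceInfo(known, "ns0").size());  // prints 1
    }
}
```

With the original code, the equivalent call would print 0 for any input, which is exactly why the retry at another nameservice never takes effect.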
[jira] [Work logged] (HDFS-16361) Fix log format for QueryCommand
[ https://issues.apache.org/jira/browse/HDFS-16361?focusedWorklogId=688462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688462 ] ASF GitHub Bot logged work on HDFS-16361: - Author: ASF GitHub Bot Created on: 01/Dec/21 07:18 Start Date: 01/Dec/21 07:18 Worklog Time Spent: 10m Work Description: ayushtkn merged pull request #3732: URL: https://github.com/apache/hadoop/pull/3732 Issue Time Tracking --- Worklog Id: (was: 688462) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Comment Edited] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558 ] Zhen Wang edited comment on HDFS-16363 at 12/1/21, 7:15 AM: DEBUG org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): !image-2021-12-01-15-09-54-965.png! org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath: !image-2021-12-01-15-07-42-432.png! org.apache.hadoop.io.SequenceFile.Sorter.MergeQueue#merge: !image-2021-12-01-15-14-25-549.png! was (Author: wforget): DEBUG org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): !image-2021-12-01-15-09-54-965.png! org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath: !image-2021-12-01-15-07-42-432.png! > An exception occurs in the distcp task of a large number of files, when > yarn.app.mapreduce.am.staging-dir is set to the hdfs path. > -- > > Key: HDFS-16363 > URL: https://issues.apache.org/jira/browse/HDFS-16363 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp > Affects Versions: 3.2.2 > Reporter: Zhen Wang > Priority: Major > Attachments: image-2021-12-01-15-07-42-432.png, > image-2021-12-01-15-09-54-965.png, image-2021-12-01-15-14-25-549.png > > > An exception occurs in the distcp task of a large number of files, when > yarn.app.mapreduce.am.staging-dir is set to the hdfs path. > > task log: > {code:java} > 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = > 24631997; dirCnt = 1750444 > 21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed. > 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. > Instead, use mapreduce.task.io.sort.mb > 21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is > deprecated. 
Instead, use mapreduce.task.io.sort.factor > 21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error > Exception: > org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create > directory: /system/mapred/XX/.staging/_distcp-260350640 > at > org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77) > at > org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32) > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367) > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146) > at > org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549) > at > org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343) > at > org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319) > at > org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882) > at > org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921) > at > org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476) > at > org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450) > at > org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155) > at > org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93) > at > org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89) > at > org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86) > at > org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368) > at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96) > at 
org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205) > at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182) > at org.apache.hadoop.tools.DistCp.run(DistCp.java:153) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.tools.DistCp.main(DistCp.java:441) > 21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any > valid local directory for hdfs://rbf-XX/system/mapred/XX/ > .staging/_distcp-260350640/intermediate.1 > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463) > at >
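A reading of the trace above (an interpretation, not a confirmed diagnosis): DistCp sorts its copy listing with SequenceFile.Sorter, which places intermediate merge files through LocalDirAllocator, and LocalDirAllocator validates candidate directories with DiskChecker against the local filesystem. A staging directory that exists only in HDFS is therefore checked as a local path and fails with "Cannot create directory". A self-contained illustration of that check (isUsableLocalDir is a simplified stand-in, not the real DiskChecker API):

```java
import java.io.File;
import java.io.IOException;

public class LocalDirCheckSketch {
    // Mimics the spirit of DiskChecker.checkDirInternal: a directory is usable
    // only if it already exists, or can be created, on the LOCAL filesystem.
    static boolean isUsableLocalDir(String dir) {
        File f = new File(dir);
        return f.mkdirs() || f.isDirectory();
    }

    public static void main(String[] args) throws IOException {
        // A writable local directory passes the check.
        System.out.println(isUsableLocalDir(System.getProperty("java.io.tmpdir")));  // true

        // A path that cannot exist locally fails the same check, which is what
        // happens to an HDFS-only staging path once it is treated as local.
        // (Here the "impossible" path nests a directory under a plain file.)
        File plainFile = File.createTempFile("distcp-demo", ".tmp");
        plainFile.deleteOnExit();
        String impossibleDir = plainFile.getPath() + File.separator + ".staging";
        System.out.println(isUsableLocalDir(impossibleDir));  // false
    }
}
```

On this reading, the sort's intermediate spill location is derived from the job staging directory, so pointing yarn.app.mapreduce.am.staging-dir at an HDFS path only surfaces once the listing is large enough to force an on-disk merge.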
[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhen Wang updated HDFS-16363: - Description: An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path. (task log identical to the trace quoted above)
[jira] [Comment Edited] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558 ] Zhen Wang edited comment on HDFS-16363 at 12/1/21, 7:09 AM: DEBUG org.apache.hadoop.util.DiskChecker#checkDirInternal(java.io.File): !image-2021-12-01-15-09-54-965.png! org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath: !image-2021-12-01-15-07-42-432.png! was (Author: wforget): DEBUG org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath: !image-2021-12-01-15-07-42-432.png!
[jira] [Commented] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451558#comment-17451558 ] Zhen Wang commented on HDFS-16363: -- DEBUG org.apache.hadoop.fs.LocalDirAllocator.AllocatorPerContext#createPath: !image-2021-12-01-15-07-42-432.png!
[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhen Wang updated HDFS-16363: - Attachment: image-2021-12-01-15-07-42-432.png
[jira] [Updated] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
[ https://issues.apache.org/jira/browse/HDFS-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhen Wang updated HDFS-16363: - Description: An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.

task log:
{code:java}
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 24631997; dirCnt = 1750444
21/12/01 13:56:04 INFO tools.SimpleCopyListing: Build file listing completed.
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
21/12/01 13:56:04 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
21/12/01 13:57:57 WARN fs.LocalDirAllocator$AllocatorPerContext: Disk Error Exception: org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /system/mapred/aa/.staging/_distcp-260350640
    at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:98)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:77)
    at org.apache.hadoop.util.BasicDiskValidator.checkStatus(BasicDiskValidator.java:32)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:367)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:447)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
    at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
    at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
    at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
    at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
    at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
    at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
    at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
    at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
21/12/01 13:57:57 ERROR tools.DistCp: Exception encountered
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for hdfs://rbf-XX/system/mapred/aa/.staging/_distcp-260350640/intermediate.1
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:463)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:3549)
    at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:3343)
    at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:3319)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2882)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2921)
    at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:476)
    at org.apache.hadoop.tools.util.DistCpUtils.sortListing(DistCpUtils.java:450)
    at org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:155)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:93)
    at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:89)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
    at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
    at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182) at
[jira] [Created] (HDFS-16363) An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path.
Zhen Wang created HDFS-16363: Summary: An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path. Key: HDFS-16363 URL: https://issues.apache.org/jira/browse/HDFS-16363 Project: Hadoop HDFS Issue Type: Bug Components: distcp Affects Versions: 3.2.2 Reporter: Zhen Wang An exception occurs in the distcp task of a large number of files, when yarn.app.mapreduce.am.staging-dir is set to the hdfs path. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
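The trace above shows DistCp's listing sort (SequenceFile$Sorter via LocalDirAllocator) failing because the configured staging dir is an hdfs:// URI while LocalDirAllocator only works with local-disk directories, so DiskChecker tries to create the path on the local filesystem and fails. As a minimal sketch of that mismatch (not Hadoop's actual code; the class and method here are hypothetical), a guard can detect a non-local staging dir before it is handed to a local allocator:

```java
import java.net.URI;
import java.net.URISyntaxException;

class StagingDirCheck {

    // Returns true only for directories a local-disk allocator can actually
    // use: scheme-less paths or explicit file:// URIs. An hdfs:// (or s3a://
    // etc.) URI would be silently mangled into a local path and then fail
    // the DiskChecker "Cannot create directory" validation.
    static boolean isUsableAsLocalDir(String dir) {
        try {
            String scheme = new URI(dir).getScheme();
            return scheme == null || scheme.equals("file");
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

Under this sketch, `/system/mapred/aa/.staging` would pass (and resolve on the local disk, which is the surprise users hit), while `hdfs://rbf-XX/system/mapred/aa/.staging` would be flagged up front instead of failing deep inside the sort.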
[jira] [Work logged] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?focusedWorklogId=688422=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688422 ] ASF GitHub Bot logged work on HDFS-16354: - Author: ASF GitHub Bot Created on: 01/Dec/21 04:49 Start Date: 01/Dec/21 04:49 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3740: URL: https://github.com/apache/hadoop/pull/3740#issuecomment-983285933 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 51s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 46m 18s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 29s | | trunk passed | | +1 :green_heart: | shadedclient | 70m 6s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 18s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | mvnsite | 1m 18s | | the patch passed | | +1 :green_heart: | shadedclient | 23m 35s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | asflicense | 0m 29s | | The patch does not generate ASF License warnings. 
| | | | 97m 42s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3740/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3740 | | Optional Tests | dupname asflicense mvnsite codespell markdownlint | | uname | Linux eb10a5da2d2e 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 5526f16df575bf936decf2fe85a8c3d157120caf | | Max. process+thread count | 522 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3740/1/console | | versions | git=2.25.1 maven=3.6.3 | | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688422) Time Spent: 20m (was: 10m) > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. 
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16293) Client sleeps and holds 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Attachment: HDFS-16293.03.patch > Client sleeps and holds 'dataQueue' when DataNodes are congested > > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2, 3.3.1, 3.2.3 >Reporter: Yuanxin Zhu >Priority: Major > Attachments: HDFS-16293.01-branch-3.2.2.patch, HDFS-16293.01.patch, > HDFS-16293.02.patch, HDFS-16293.03.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort(500G data,8 DataNodes,76 vcores/DN) for > testing, DataNodes are congested(HDFS-8008). The client enters the sleep > state after receiving the ACK for many times, but does not release the > 'dataQueue'. The ResponseProcessor thread needs the 'dataQueue' to execute > 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to > release the 'dataQueue', which is equivalent to that the ResponseProcessor > thread also enters sleep, resulting in ACK delay.MapReduce tasks can be > delayed by tens of minutes or even hours. > The DataStreamer thread can first execute 'one = dataQueue. getFirst()', > release 'dataQueue', and then judge whether to execute 'backOffIfNecessary()' > according to 'one.isHeartbeatPacket()' > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
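The reordering proposed in the description above can be sketched as follows. This is a simplified model, not the real DataStreamer (packets are modeled as strings, and `Thread.sleep(1)` stands in for `backOffIfNecessary()`): the head packet is taken inside the monitor, the monitor is released, and only then does the congestion back-off run, so a ResponseProcessor-like thread waiting on the same monitor is never blocked by the sleep.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BackoffOutsideLock {
    private final Object queueLock = new Object();
    private final Deque<String> dataQueue = new ArrayDeque<>();
    volatile boolean congested = true;

    void enqueue(String packet) {
        synchronized (queueLock) {
            dataQueue.addLast(packet);
            queueLock.notifyAll();
        }
    }

    // Takes the head packet under the lock, then backs off (if needed)
    // AFTER releasing the lock -- the fix proposed for HDFS-16293.
    String takeAndMaybeBackOff() {
        String one;
        synchronized (queueLock) {
            while (dataQueue.isEmpty()) {
                try {
                    queueLock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            one = dataQueue.removeFirst(); // grab head, then leave the monitor
        }
        // Back-off happens outside the monitor; heartbeat packets skip it,
        // mirroring the one.isHeartbeatPacket() check in the description.
        if (congested && !one.startsWith("HEARTBEAT")) {
            try {
                Thread.sleep(1); // stand-in for backOffIfNecessary()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return one;
    }
}
```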
[jira] [Updated] (HDFS-16293) Client sleeps and holds 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Attachment: (was: HDFS-16293.03.patch) > Client sleeps and holds 'dataQueue' when DataNodes are congested > > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2, 3.3.1, 3.2.3 >Reporter: Yuanxin Zhu >Priority: Major > Attachments: HDFS-16293.01-branch-3.2.2.patch, HDFS-16293.01.patch, > HDFS-16293.02.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort(500G data,8 DataNodes,76 vcores/DN) for > testing, DataNodes are congested(HDFS-8008). The client enters the sleep > state after receiving the ACK for many times, but does not release the > 'dataQueue'. The ResponseProcessor thread needs the 'dataQueue' to execute > 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to > release the 'dataQueue', which is equivalent to that the ResponseProcessor > thread also enters sleep, resulting in ACK delay.MapReduce tasks can be > delayed by tens of minutes or even hours. > The DataStreamer thread can first execute 'one = dataQueue. getFirst()', > release 'dataQueue', and then judge whether to execute 'backOffIfNecessary()' > according to 'one.isHeartbeatPacket()' > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?focusedWorklogId=688403=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688403 ] ASF GitHub Bot logged work on HDFS-16354: - Author: ASF GitHub Bot Created on: 01/Dec/21 03:10 Start Date: 01/Dec/21 03:10 Worklog Time Spent: 10m Work Description: iwasakims opened a new pull request #3740: URL: https://github.com/apache/hadoop/pull/3740 https://issues.apache.org/jira/browse/HDFS-16354 [HDFS-16091](https://issues.apache.org/jira/browse/HDFS-16091) (#3374) added GETSNAPSHOTDIFFLISTING op leveraging ClientProtocol#getSnapshotDiffReportListing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688403) Remaining Estimate: 0h Time Spent: 10m > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16354: -- Labels: pull-request-available (was: ) > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16354: Status: Patch Available (was: Open) > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16359) RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying
[ https://issues.apache.org/jira/browse/HDFS-16359?focusedWorklogId=688404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688404 ] ASF GitHub Bot logged work on HDFS-16359: - Author: ASF GitHub Bot Created on: 01/Dec/21 03:10 Start Date: 01/Dec/21 03:10 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3731: URL: https://github.com/apache/hadoop/pull/3731#issuecomment-983244140 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 4s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 37m 47s | | trunk passed | | +1 :green_heart: | compile | 0m 50s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 0m 42s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 28s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 47s | | trunk passed | | +1 :green_heart: | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 0s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 36s | | trunk passed | | +1 :green_heart: | shadedclient | 27m 5s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 47s | | the patch passed | | +1 :green_heart: | compile | 0m 48s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 0m 48s | | the patch passed | | +1 :green_heart: | compile | 0m 40s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 0m 40s | | the patch passed | | +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 21s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 51s | | the patch passed | | +1 :green_heart: | javadoc | 0m 42s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 3s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 52s | | the patch passed | | +1 :green_heart: | shadedclient | 29m 40s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 41m 27s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3731/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 2m 32s | | The patch does not generate ASF License warnings. 
| | | | 154m 41s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.rbfbalance.TestRouterDistCpProcedure | | | hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3731/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3731 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux cea59fc9bd9b 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 1e2216a7c6c7078181e389a11564a24d54744f8b | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3731/6/testReport/ | | Max.
[jira] [Work logged] (HDFS-16361) Fix log format for QueryCommand
[ https://issues.apache.org/jira/browse/HDFS-16361?focusedWorklogId=688382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688382 ] ASF GitHub Bot logged work on HDFS-16361: - Author: ASF GitHub Bot Created on: 01/Dec/21 00:51 Start Date: 01/Dec/21 00:51 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3732: URL: https://github.com/apache/hadoop/pull/3732#issuecomment-983174151 Hi @ayushtkn , PTAL. Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688382) Time Spent: 1h 40m (was: 1.5h) > Fix log format for QueryCommand > --- > > Key: HDFS-16361 > URL: https://issues.apache.org/jira/browse/HDFS-16361 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: tomscut >Assignee: tomscut >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Fix log format for QueryCommand of disk balancer. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16359) RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying
[ https://issues.apache.org/jira/browse/HDFS-16359?focusedWorklogId=688375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688375 ] ASF GitHub Bot logged work on HDFS-16359: - Author: ASF GitHub Bot Created on: 01/Dec/21 00:34 Start Date: 01/Dec/21 00:34 Worklog Time Spent: 10m Work Description: tomscut commented on a change in pull request #3731: URL: https://github.com/apache/hadoop/pull/3731#discussion_r759766054 ## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java ## @@ -726,9 +726,10 @@ static String getMethodName() { * @return List of name spaces in the federation on * removing the already invoked namespaceinfo. */ - private Set getNameSpaceInfo(String nsId) { + private Set getNameSpaceInfo( Review comment: Thank you very much for your detailed reply. I have made some modifications, please have a look again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688375) Time Spent: 2h 40m (was: 2.5h) > RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying > --- > > Key: HDFS-16359 > URL: https://issues.apache.org/jira/browse/HDFS-16359 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > RouterRpcServer#invokeAtAvailableNs does not take effect when retrying. See > HDFS-15543. 
> The original code of RouterRpcServer#getNameSpaceInfo looks like this: > {code:java} > private Set getNameSpaceInfo(String nsId) { > Set namespaceInfos = new HashSet<>(); > for (FederationNamespaceInfo ns : namespaceInfos) { > if (!nsId.equals(ns.getNameserviceId())) { > namespaceInfos.add(ns); > } > } > return namespaceInfos; > } {code} > And _namespaceInfos_ is always empty here. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
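The bug quoted above is that the loop iterates over the result set it just created, so nothing is ever added. A minimal model of the buggy and fixed shapes (nameservices reduced to plain strings; the real method works on FederationNamespaceInfo objects):

```java
import java.util.HashSet;
import java.util.Set;

class NamespaceFilter {

    // Buggy shape: iterates over its own freshly created, empty result set,
    // so the loop body never runs and the method always returns empty.
    static Set<String> getNameSpaceInfoBuggy(Set<String> allNamespaces, String nsId) {
        Set<String> namespaceInfos = new HashSet<>();
        for (String ns : namespaceInfos) { // empty -- never iterates!
            if (!nsId.equals(ns)) {
                namespaceInfos.add(ns);
            }
        }
        return namespaceInfos;
    }

    // Fixed shape: iterates over the federation's namespaces and filters out
    // the one already invoked, so retries actually try a different ns.
    static Set<String> getNameSpaceInfoFixed(Set<String> allNamespaces, String nsId) {
        Set<String> namespaceInfos = new HashSet<>();
        for (String ns : allNamespaces) {
            if (!nsId.equals(ns)) {
                namespaceInfos.add(ns);
            }
        }
        return namespaceInfos;
    }
}
```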
[jira] [Work logged] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning
[ https://issues.apache.org/jira/browse/HDFS-16303?focusedWorklogId=688331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688331 ] ASF GitHub Bot logged work on HDFS-16303: - Author: ASF GitHub Bot Created on: 30/Nov/21 22:42 Start Date: 30/Nov/21 22:42 Worklog Time Spent: 10m Work Description: KevinWikant edited a comment on pull request #3675: URL: https://github.com/apache/hadoop/pull/3675#issuecomment-983034760 > DECOMMISSION_IN_PROGRESS + DEAD is an error state that means decommission has effectively failed. There is a case where it can complete, but what does that really mean - if the node is dead, it has not been gracefully stopped. The case I have described, where dead node decommissioning completes, can occur when: - a decommissioning node goes dead, but all of its blocks still have block replicas on other live nodes - the namenode is eventually able to satisfy the minimum replication of all blocks (by replicating the under-replicated blocks from the live nodes) - the dead decommissioning node is transitioned to decommissioned In this case, the node did go dead while decommissioning, but there were no LowRedundancy blocks thanks to redundant block replicas. From the user perspective, the loss of the decommissioning node did not impact the outcome of the decommissioning process. Had the node not gone dead while decommissioning, the eventual outcome would be the same: the node is decommissioned & there are no LowRedundancy blocks. If there are LowRedundancy blocks, then a dead datanode will remain decommissioning, because if the dead node were to come alive again it may be able to recover the LowRedundancy blocks. But if there are no LowRedundancy blocks, then when the node comes alive again it will immediately transition to decommissioned anyway, so why not mark it decommissioned while it's still dead? 
Also, I don't think the priority queue is adding much complexity, it's just putting healthy nodes (with more recent heartbeat times) ahead of unhealthy nodes (with older heartbeat times); such that healthy nodes are decommissioned first I also want to call out another caveat with the approach of removing the node from the DatanodeAdminManager which I uncovered while unit testing If we leave the node in DECOMMISSION_IN_PROGRESS & remove the node from DatanodeAdminManager, then the following callstack should re-add the datanode to the DatanodeAdminManager when it comes alive again: - [DatanodeManager.registerDatanode](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1223) - [DatanodeManager.startAdminOperationIfNecessary](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1109) - [DatanodeAdminManager.startDecommission](https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L187) - [DatanodeAdminMonitorBase.startTrackingNode](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L136) The problem is this condition "!node.isDecommissionInProgress()": https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L177 Because the dead datanode is left in DECOMMISSION_INPROGRESS, "startTrackingNode" is not invoked because of the "!node.isDecommissionInProgress()" condition 
Simply removing the condition "!node.isDecommissionInProgress()" will not function well because "startTrackingNode" is not idempotent: - [startDecommission is invoked periodically when refreshDatanodes is called](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1339) - [pendingNodes is an ArrayDeque which does not deduplicate the DatanodeDescriptor](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L43) - therefore, removing the "!node.isDecommissionInProgress()" check will cause a large number of duplicate DatanodeDescriptors to be added to DatanodeAdminManager I
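The non-idempotency concern above can be reproduced with plain JDK collections: an ArrayDeque (the type backing pendingNodes) accepts the same element any number of times, while a set-backed queue would deduplicate a re-added descriptor. A minimal illustration, with strings standing in for DatanodeDescriptor:

```java
import java.util.ArrayDeque;
import java.util.LinkedHashSet;
import java.util.Queue;
import java.util.Set;

public class PendingNodesDedup {
    public static void main(String[] args) {
        // ArrayDeque, like pendingNodes, keeps every add() call,
        // so a re-registered node shows up twice in the queue.
        Queue<String> pendingNodes = new ArrayDeque<>();
        pendingNodes.add("dn-1:9866");
        pendingNodes.add("dn-1:9866"); // duplicate is retained
        System.out.println(pendingNodes.size()); // 2

        // A LinkedHashSet keeps insertion order but drops the duplicate.
        Set<String> dedupedPending = new LinkedHashSet<>();
        dedupedPending.add("dn-1:9866");
        dedupedPending.add("dn-1:9866"); // duplicate is ignored
        System.out.println(dedupedPending.size()); // 1
    }
}
```

A set-backed pendingNodes, or an explicit membership check before the node is re-added, would make the re-registration path idempotent.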
[jira] [Work logged] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning
[ https://issues.apache.org/jira/browse/HDFS-16303?focusedWorklogId=688306=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688306 ] ASF GitHub Bot logged work on HDFS-16303: - Author: ASF GitHub Bot Created on: 30/Nov/21 21:30 Start Date: 30/Nov/21 21:30 Worklog Time Spent: 10m Work Description: KevinWikant edited a comment on pull request #3675: URL: https://github.com/apache/hadoop/pull/3675#issuecomment-983034760 > DECOMMISSION_IN_PROGRESS + DEAD is an error state that means decommission has effectively failed. There is a case where it can complete, but what does that really mean - if the node is dead, it has not been gracefully stopped. The case I have described, where dead node decommissioning completes, can occur when: - a decommissioning node goes dead, but all of its blocks still have block replicas on other live nodes - the namenode is eventually able to satisfy the minimum replication of all blocks (by replicating the under-replicated blocks from the live nodes) - the dead decommissioning node is transitioned to decommissioned In this case, the node did go dead while decommissioning, but there were no LowRedundancy blocks thanks to redundant block replicas. From the user perspective, the loss of the decommissioning node did not impact the outcome of the decommissioning process. Had the node not gone dead while decommissioning, the eventual outcome would be the same: the node is decommissioned & there are no LowRedundancy blocks. If there are LowRedundancy blocks, then a dead datanode will remain decommissioning, because if the dead node were to come alive again it may be able to recover the LowRedundancy blocks. But if there are no LowRedundancy blocks, then when the node comes alive again it will immediately transition to decommissioned anyway, so why not mark it decommissioned while it's still dead? 
Also, I don't think the priority queue is adding much complexity, it's just putting healthy nodes (with more recent heartbeat times) ahead of unhealthy nodes (with older heartbeat times); such that healthy nodes are decommissioned first I also want to call out another caveat with the approach of removing the node from the DatanodeAdminManager which I uncovered while unit testing If we leave the node in DECOMMISSION_IN_PROGRESS & remove the node from DatanodeAdminManager, then the following callstack should re-add the datanode to the DatanodeAdminManager when it comes alive again: - [DatanodeManager.registerDatanode](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1223) - [DatanodeManager.startAdminOperationIfNecessary](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1109) - [DatanodeAdminManager.startDecommission](https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L187) - [DatanodeAdminMonitorBase.startTrackingNode](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L136) The problem is this condition "!node.isDecommissionInProgress()": https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L177 Because the dead datanode is left in DECOMMISSION_INPROGRESS, "startTrackingNode" is not invoked because of the "!node.isDecommissionInProgress()" condition 
Simply removing the condition "!node.isDecommissionInProgress()" will not function well because "startTrackingNode" is not idempotent: - [startDecommission is invoked periodically when refreshDatanodes is called](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1339) - [pendingNodes is an ArrayDeque which does not deduplicate the DatanodeDescriptor](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L43) - therefore, removing the "!node.isDecommissionInProgress()" check will cause a large number of duplicate DatanodeDescriptors to be added to DatanodeAdminManager I
[jira] [Work logged] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning
[ https://issues.apache.org/jira/browse/HDFS-16303?focusedWorklogId=688303=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688303 ] ASF GitHub Bot logged work on HDFS-16303: - Author: ASF GitHub Bot Created on: 30/Nov/21 21:26 Start Date: 30/Nov/21 21:26 Worklog Time Spent: 10m Work Description: KevinWikant commented on pull request #3675: URL: https://github.com/apache/hadoop/pull/3675#issuecomment-983034760 > DECOMMISSION_IN_PROGRESS + DEAD is an error state that means decommission has effectively failed. There is a case where it can complete, but what does that really mean - if the node is dead, it has not been gracefully stopped. The case I have described, where dead node decommissioning completes, can occur when: - a decommissioning node goes dead, but all of its blocks still have block replicas on other live nodes - the namenode is eventually able to satisfy the minimum replication of all blocks (by replicating the under-replicated blocks from the live nodes) - the dead decommissioning node is transitioned to decommissioned In this case, the node did go dead while decommissioning, but there was no data loss thanks to redundant block replicas. From the user perspective, the loss of the decommissioning node did not impact the outcome of the decommissioning process. Had the node not gone dead while decommissioning, the eventual outcome would be the same: the node is decommissioned, there is no data loss, & all blocks have sufficient replicas. If there is data loss, then a dead datanode will remain decommissioning, because if the dead node were to come alive again it may be able to recover the lost data. But if there is no data loss, then when the node comes alive again it will immediately transition to decommissioned anyway, so why not mark it decommissioned while it's still dead (and there is no data loss)? 
Also, I don't think the priority queue is adding much complexity, it's just putting healthy nodes (with more recent heartbeat times) ahead of unhealthy nodes (with older heartbeat times) such that healthy nodes are decommissioned first I also want to call out another caveat with the approach of removing the node from the DatanodeAdminManager which I uncovered while unit testing If we leave the node in DECOMMISSION_IN_PROGRESS & remove the node from DatanodeAdminManager, then the following callstack should re-add the datanode to the DatanodeAdminManager when it comes alive again: - [DatanodeManager.registerDatanode](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1223) - [DatanodeManager.startAdminOperationIfNecessary](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1109) - [DatanodeAdminManager.startDecommission](https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L187) - [DatanodeAdminMonitorBase.startTrackingNode](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L136) The problem is this condition "!node.isDecommissionInProgress()": https://github.com/apache/hadoop/blob/62c86eaa0e539a4307ca794e0fcd502a77ebceb8/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java#L177 Because the dead datanode is left in DECOMMISSION_INPROGRESS, "startTrackingNode" is not invoked because of the "!node.isDecommissionInProgress()" condition 
Simply removing the condition "!node.isDecommissionInProgress()" will not function well because "startTrackingNode" is not idempotent: - [startDecommission is invoked periodically when refreshDatanodes is called](https://github.com/apache/hadoop/blob/db89a9411ebee11372314e82d7ea0606c348d014/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L1339) - [pendingNodes is an ArrayDeque which does not deduplicate the DatanodeDescriptor](https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminMonitorBase.java#L43) - therefore, removing the "!node.isDecommissionInProgress()" check will cause a large number of duplicate DatanodeDescriptors to be added to DatanodeAdminManager I
[jira] [Commented] (HDFS-16293) Client sleeps and holds 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451329#comment-17451329 ] Hadoop QA commented on HDFS-16293: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 38s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 30s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 18s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient 
{color} | {color:green} 27m 2s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 12s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 37m 33s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 6m 32s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 26s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 47s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 47s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/742/artifact/out/diff-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color} | {color:red} hadoop-hdfs-project-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 5 new + 646 unchanged - 0 fixed = 651 total (was 646) {color} | | {color:green}+1{color} | {color:green} compile {color} | 
{color:green} 5m 33s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 33s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/742/artifact/out/diff-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color} | {color:red} hadoop-hdfs-project-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 5 new + 624 unchanged - 0 fixed = 629 total (was 624) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s{color} |
[jira] [Work logged] (HDFS-16359) RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying
[ https://issues.apache.org/jira/browse/HDFS-16359?focusedWorklogId=688204=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688204 ] ASF GitHub Bot logged work on HDFS-16359: - Author: ASF GitHub Bot Created on: 30/Nov/21 18:00 Start Date: 30/Nov/21 18:00 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #3731: URL: https://github.com/apache/hadoop/pull/3731#discussion_r759531237 ## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java ## @@ -726,9 +726,10 @@ static String getMethodName() { * @return List of name spaces in the federation on * removing the already invoked namespaceinfo. */ - private Set getNameSpaceInfo(String nsId) { + private Set getNameSpaceInfo( Review comment: BTW, if we are changing the signature of the method, I would also add final to the arguments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688204) Time Spent: 2.5h (was: 2h 20m) > RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying > --- > > Key: HDFS-16359 > URL: https://issues.apache.org/jira/browse/HDFS-16359 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > RouterRpcServer#invokeAtAvailableNs does not take effect when retrying. See > HDFS-15543. 
> The original code of RouterRpcServer#getNameSpaceInfo looks like this: > {code:java} > private Set<FederationNamespaceInfo> getNameSpaceInfo(String nsId) { > Set<FederationNamespaceInfo> namespaceInfos = new HashSet<>(); > for (FederationNamespaceInfo ns : namespaceInfos) { > if (!nsId.equals(ns.getNameserviceId())) { > namespaceInfos.add(ns); > } > } > return namespaceInfos; > } {code} > And _namespaceInfos_ is always empty here. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16359) RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying
[ https://issues.apache.org/jira/browse/HDFS-16359?focusedWorklogId=688203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688203 ] ASF GitHub Bot logged work on HDFS-16359: - Author: ASF GitHub Bot Created on: 30/Nov/21 17:59 Start Date: 30/Nov/21 17:59 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #3731: URL: https://github.com/apache/hadoop/pull/3731#discussion_r759530486 ## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java ## @@ -726,9 +726,10 @@ static String getMethodName() { * @return List of name spaces in the federation on * removing the already invoked namespaceinfo. */ - private Set getNameSpaceInfo(String nsId) { + private Set getNameSpaceInfo( Review comment: I don't think this is too important to be honest. There might be some minor optimization by being static but I do not think it is that important. The main reason to make it static would be to make clear that this method is not doing anything with the attributes of the object. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688203) Time Spent: 2h 20m (was: 2h 10m) > RBF: RouterRpcServer#invokeAtAvailableNs does not take effect when retrying > --- > > Key: HDFS-16359 > URL: https://issues.apache.org/jira/browse/HDFS-16359 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > RouterRpcServer#invokeAtAvailableNs does not take effect when retrying. See > HDFS-15543. 
> The original code of RouterRpcServer#getNameSpaceInfo looks like this: > {code:java} > private Set<FederationNamespaceInfo> getNameSpaceInfo(String nsId) { > Set<FederationNamespaceInfo> namespaceInfos = new HashSet<>(); > for (FederationNamespaceInfo ns : namespaceInfos) { > if (!nsId.equals(ns.getNameserviceId())) { > namespaceInfos.add(ns); > } > } > return namespaceInfos; > } {code} > And _namespaceInfos_ is always empty here. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16331) Make dfs.blockreport.intervalMsec reconfigurable
[ https://issues.apache.org/jira/browse/HDFS-16331?focusedWorklogId=688039=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-688039 ] ASF GitHub Bot logged work on HDFS-16331: - Author: ASF GitHub Bot Created on: 30/Nov/21 14:09 Start Date: 30/Nov/21 14:09 Worklog Time Spent: 10m Work Description: tomscut commented on pull request #3676: URL: https://github.com/apache/hadoop/pull/3676#issuecomment-982670912 > @tomscut Thanks for updating it. LGTM. > > I will merge this PR later this week if there are no other concerns from other reviewers. Thanks @tasanuma for your careful review and many suggestions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 688039) Time Spent: 4h 40m (was: 4.5h) > Make dfs.blockreport.intervalMsec reconfigurable > > > Key: HDFS-16331 > URL: https://issues.apache.org/jira/browse/HDFS-16331 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: tomscut >Assignee: tomscut >Priority: Major > Labels: pull-request-available > Attachments: image-2021-11-18-09-33-24-236.png, > image-2021-11-18-09-35-35-400.png > > Time Spent: 4h 40m > Remaining Estimate: 0h > > We have a cold data cluster, which stores data with an EC policy. There are 24 fast > disks on each node and each disk is 7 TB. > Recently, many nodes have come to hold more than 10 million blocks, and the interval of > FBR is 6h by default. Frequent FBRs put great pressure on the NN. > !image-2021-11-18-09-35-35-400.png|width=334,height=229! > !image-2021-11-18-09-33-24-236.png|width=566,height=159! > We want to increase the interval of FBR, but would have to rolling-restart the DNs, > and this operation is very heavy. 
In this scenario, it is necessary to make > _dfs.blockreport.intervalMsec_ reconfigurable. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
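The change builds on the DataNode's runtime reconfiguration framework. The sketch below shows the general shape of such a hot-swappable setting; the class, field, and validation here are simplified, illustrative stand-ins, not the actual Hadoop implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

public class ReconfigurableInterval {
    static final String BLOCKREPORT_INTERVAL_KEY = "dfs.blockreport.intervalMsec";
    static final long DEFAULT_INTERVAL_MS = 6L * 60 * 60 * 1000; // 6h default

    private final AtomicLong blockReportIntervalMs =
        new AtomicLong(DEFAULT_INTERVAL_MS);

    // Reconfiguration hook: validates and swaps the value in place,
    // so the next block-report scheduling cycle picks it up without
    // restarting the DataNode.
    void reconfigureProperty(String property, String newVal) {
        if (BLOCKREPORT_INTERVAL_KEY.equals(property)) {
            long v = Long.parseLong(newVal);
            if (v <= 0) {
                throw new IllegalArgumentException(property + " must be positive");
            }
            blockReportIntervalMs.set(v);
        }
    }

    long getBlockReportIntervalMs() {
        return blockReportIntervalMs.get();
    }

    public static void main(String[] args) {
        ReconfigurableInterval dn = new ReconfigurableInterval();
        System.out.println(dn.getBlockReportIntervalMs()); // 21600000
        dn.reconfigureProperty(BLOCKREPORT_INTERVAL_KEY, "43200000"); // 12h
        System.out.println(dn.getBlockReportIntervalMs()); // 43200000
    }
}
```

Keeping the interval in an atomic field means readers on the block-report path never need a lock, and a reconfiguration takes effect on the next cycle rather than requiring the heavy rolling restart described above.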
[jira] [Updated] (HDFS-16293) Client sleeps and holds 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Attachment: HDFS-16293.03.patch > Client sleeps and holds 'dataQueue' when DataNodes are congested > > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2, 3.3.1, 3.2.3 >Reporter: Yuanxin Zhu >Priority: Major > Attachments: HDFS-16293.01-branch-3.2.2.patch, HDFS-16293.01.patch, > HDFS-16293.02.patch, HDFS-16293.03.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > When I enable ECN and use Terasort (500G data, 8 DataNodes, 76 vcores/DN) for > testing, DataNodes become congested (HDFS-8008). The client repeatedly enters a sleep > state after receiving congested ACKs, but does not release the > 'dataQueue' lock. The ResponseProcessor thread needs 'dataQueue' to execute > 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to > release 'dataQueue'; in effect the ResponseProcessor > thread also sleeps, resulting in ACK delay. MapReduce tasks can be > delayed by tens of minutes or even hours. > The DataStreamer thread can first execute 'one = dataQueue.getFirst()', > release 'dataQueue', and then decide whether to execute 'backOffIfNecessary()' > according to 'one.isHeartbeatPacket()' > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
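The proposed reordering can be sketched with a toy model (the class and method names below are illustrative, not the real DataStreamer): hold the dataQueue lock only long enough to read the head packet, then perform the congestion back-off outside the synchronized block so an ACK-processing thread can still acquire the lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BackoffOutsideLock {
    private final Deque<String> dataQueue = new ArrayDeque<>();

    void enqueue(String packet) {
        synchronized (dataQueue) {
            dataQueue.addLast(packet);
        }
    }

    int size() {
        synchronized (dataQueue) {
            return dataQueue.size();
        }
    }

    // Proposed ordering: read the head under the lock, back off outside it.
    String takeThenBackOff() {
        String one;
        synchronized (dataQueue) {          // lock held only briefly
            one = dataQueue.peekFirst();
        }
        if (one != null && !one.startsWith("heartbeat")) {
            try {
                Thread.sleep(10);           // congestion back-off without the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return one;
    }

    public static void main(String[] args) throws InterruptedException {
        BackoffOutsideLock streamer = new BackoffOutsideLock();
        streamer.enqueue("packet-1");

        // Another thread (playing the ResponseProcessor) can still take
        // the dataQueue lock even while takeThenBackOff() is sleeping.
        Thread responder = new Thread(() -> streamer.enqueue("packet-2"));
        responder.start();
        String head = streamer.takeThenBackOff();
        responder.join();
        System.out.println(head + " " + streamer.size()); // packet-1 2
    }
}
```

Because the sleep happens with no lock held, the ACK path is never blocked by a congested client, which is exactly the hang the issue describes.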
[jira] [Created] (HDFS-16362) [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils
Rakesh Radhakrishnan created HDFS-16362: --- Summary: [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils Key: HDFS-16362 URL: https://issues.apache.org/jira/browse/HDFS-16362 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh Radhakrishnan This task is to refactor the OM request instantiation based on the #isFileSystemOptimized() check in the OzoneManagerUtils class. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16331) Make dfs.blockreport.intervalMsec reconfigurable
[ https://issues.apache.org/jira/browse/HDFS-16331?focusedWorklogId=687989=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687989 ] ASF GitHub Bot logged work on HDFS-16331: - Author: ASF GitHub Bot Created on: 30/Nov/21 12:59 Start Date: 30/Nov/21 12:59 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3676: URL: https://github.com/apache/hadoop/pull/3676#issuecomment-982612629 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 43s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 1s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 40s | | trunk passed | | +1 :green_heart: | compile | 1m 25s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 1s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 32s | | trunk passed | | +1 :green_heart: | javadoc | 1m 0s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 27s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 12s | | trunk passed | | +1 :green_heart: | shadedclient | 22m 24s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 16s | | the patch passed | | +1 :green_heart: | compile | 1m 17s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 17s | | the patch passed | | +1 :green_heart: | compile | 1m 12s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 12s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3676/11/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 134 unchanged - 55 fixed = 135 total (was 189) | | +1 :green_heart: | mvnsite | 1m 19s | | the patch passed | | +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 24s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 13s | | the patch passed | | +1 :green_heart: | shadedclient | 21m 56s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 224m 16s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. 
| | | | 322m 48s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3676/11/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3676 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 0f6ffc171b75 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 52f2a4840682a5dc34221dec636f4a50450f5988 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3676/11/testReport/ | | Max. process+thread count | 3266 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Work logged] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning
[ https://issues.apache.org/jira/browse/HDFS-16303?focusedWorklogId=687979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687979 ] ASF GitHub Bot logged work on HDFS-16303: - Author: ASF GitHub Bot Created on: 30/Nov/21 12:33 Start Date: 30/Nov/21 12:33 Worklog Time Spent: 10m Work Description: virajjasani commented on pull request #3675: URL: https://github.com/apache/hadoop/pull/3675#issuecomment-982594432 > the priority queue idea adds some more complexity to an already hard to follow process / code area and I wonder if it is better to just remove the node from the monitor and let it be dealt with manually, which may be required a lot of the time anyway? Having faced a similar situation, where a `DECOMMISSION_IN_PROGRESS + DEAD` state required manual intervention, I agree with the approach of removing the node; the process is already complex enough to follow, and introducing a new priority queue would complicate it even further. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 687979) Time Spent: 4.5h (was: 4h 20m) > Losing over 100 datanodes in state decommissioning results in full blockage > of all datanode decommissioning > --- > > Key: HDFS-16303 > URL: https://issues.apache.org/jira/browse/HDFS-16303 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.10.1, 3.3.1 >Reporter: Kevin Wikant >Priority: Major > Labels: pull-request-available > Time Spent: 4.5h > Remaining Estimate: 0h > > h2. Impact > HDFS datanode decommissioning does not make any forward progress. 
For > example, the user adds X datanodes to the "dfs.hosts.exclude" file and all X > of those datanodes remain in state decommissioning forever without making any > forward progress towards being decommissioned. > h2. Root Cause > The HDFS Namenode class "DatanodeAdminManager" is responsible for > decommissioning datanodes. > As per this "hdfs-site" configuration: > {quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes > Default Value = 100 > The maximum number of decommission-in-progress datanodes that will be > tracked at one time by the namenode. Tracking a decommission-in-progress > datanode consumes additional NN memory proportional to the number of blocks > on the datanode. Having a conservative limit reduces the potential impact of > decommissioning a large number of nodes at once. A value of 0 means no limit > will be enforced. > {quote} > The Namenode will only actively track up to 100 datanodes for decommissioning > at any given time, so as to avoid Namenode memory pressure. > Looking into the "DatanodeAdminManager" code: > * a datanode is only removed from the "tracked.nodes" set when it > finishes decommissioning > * a datanode is only added to the "tracked.nodes" set if there are fewer > than 100 datanodes being tracked > So in the event that there are more than 100 datanodes being decommissioned > at a given time, some of those datanodes will not be in the "tracked.nodes" > set until 1 or more datanodes in "tracked.nodes" finish > decommissioning. This is generally not a problem because the datanodes in > "tracked.nodes" will eventually finish decommissioning, but there is an edge > case where this logic prevents the namenode from making any forward progress > towards decommissioning. 
> If all 100 datanodes in the "tracked.nodes" are unable to finish > decommissioning, then other datanodes (which may be able to be > decommissioned) will never get added to "tracked.nodes" and therefore will > never get the opportunity to be decommissioned. > This can occur due to the following issue: > {quote}2021-10-21 12:39:24,048 WARN > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager > (DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In > Progress. Cannot be safely decommissioned or be in maintenance since there is > risk of reduced data durability or data loss. Either restart the failed node > or force decommissioning or maintenance by removing, calling refreshNodes, > then re-adding to the excludes or host config files. > {quote} > If a Datanode is lost while decommissioning (for example if the underlying > hardware fails or is lost), then it will remain in state decommissioning > forever. > If 100 or more Datanodes are lost while decommissioning, all datanode decommissioning is fully blocked.
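The starvation described in this issue can be illustrated with a short sketch. This is a simplified, hypothetical model in Python, not the actual DatanodeAdminManager code: a slot in the tracked set is only freed when a node finishes decommissioning, so nodes that can never finish (dead while decommissioning) permanently occupy their slots and starve the pending queue.

```python
from collections import deque

def tick(tracked, pending, max_tracked, can_finish):
    """One monitor pass (simplified model): remove tracked nodes that can
    finish decommissioning, then admit pending nodes while capacity remains."""
    tracked -= {n for n in tracked if can_finish(n)}
    while pending and len(tracked) < max_tracked:
        tracked.add(pending.popleft())
    return tracked, pending

# Cap of 3 tracked slots; dead nodes can never finish decommissioning.
dead = {"d1", "d2", "d3"}
tracked = set(dead)            # every slot is held by a dead node
pending = deque(["h1", "h2"])  # healthy nodes wait for a slot
tracked, pending = tick(tracked, pending, 3, lambda n: n not in dead)
# tracked is unchanged and pending is never admitted: no forward progress.
```

Under this model, removing a dead node from the monitor (the approach favored in the comment above) frees its slot and lets pending nodes make progress.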
[jira] [Commented] (HDFS-16317) Backport HDFS-14729 for branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451099#comment-17451099 ] Brahma Reddy Battula commented on HDFS-16317: - [~ananysin] thanks for the contribution. Committed to branch-3.2. Can you upload a patch for branch-3.2.3 as well? > Backport HDFS-14729 for branch-3.2 > -- > > Key: HDFS-16317 > URL: https://issues.apache.org/jira/browse/HDFS-16317 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 3.2.2 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Our security tool raised the following security flaws on Hadoop 3.2.2: > CVE-2015-9251: https://nvd.nist.gov/vuln/detail/CVE-2015-9251 > CVE-2019-11358: https://nvd.nist.gov/vuln/detail/CVE-2019-11358 > CVE-2020-11022: https://nvd.nist.gov/vuln/detail/CVE-2020-11022 > CVE-2020-11023: https://nvd.nist.gov/vuln/detail/CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, 
e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16317) Backport HDFS-14729 for branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-16317?focusedWorklogId=687978&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687978 ] ASF GitHub Bot logged work on HDFS-16317: - Author: ASF GitHub Bot Created on: 30/Nov/21 12:32 Start Date: 30/Nov/21 12:32 Worklog Time Spent: 10m Work Description: brahmareddybattula merged pull request #3692: URL: https://github.com/apache/hadoop/pull/3692 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 687978) Time Spent: 1h 50m (was: 1h 40m) > Backport HDFS-14729 for branch-3.2 > -- > > Key: HDFS-16317 > URL: https://issues.apache.org/jira/browse/HDFS-16317 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 3.2.2 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Our security tool raised the following security flaws on Hadoop 3.2.2: > CVE-2015-9251: https://nvd.nist.gov/vuln/detail/CVE-2015-9251 > CVE-2019-11358: https://nvd.nist.gov/vuln/detail/CVE-2019-11358 > CVE-2020-11022: https://nvd.nist.gov/vuln/detail/CVE-2020-11022 > CVE-2020-11023: https://nvd.nist.gov/vuln/detail/CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16317) Backport HDFS-14729 for branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ananya Singh reassigned HDFS-16317: --- Assignee: Ananya Singh > Backport HDFS-14729 for branch-3.2 > -- > > Key: HDFS-16317 > URL: https://issues.apache.org/jira/browse/HDFS-16317 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 3.2.2 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Our security tool raised the following security flaws on Hadoop 3.2.2: > CVE-2015-9251: https://nvd.nist.gov/vuln/detail/CVE-2015-9251 > CVE-2019-11358: https://nvd.nist.gov/vuln/detail/CVE-2019-11358 > CVE-2020-11022: https://nvd.nist.gov/vuln/detail/CVE-2020-11022 > CVE-2020-11023: https://nvd.nist.gov/vuln/detail/CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16317) Backport HDFS-14729 for branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-16317?focusedWorklogId=687975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687975 ] ASF GitHub Bot logged work on HDFS-16317: - Author: ASF GitHub Bot Created on: 30/Nov/21 12:26 Start Date: 30/Nov/21 12:26 Worklog Time Spent: 10m Work Description: brahmareddybattula commented on pull request #3692: URL: https://github.com/apache/hadoop/pull/3692#issuecomment-982588459 +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 687975) Time Spent: 1h 40m (was: 1.5h) > Backport HDFS-14729 for branch-3.2 > -- > > Key: HDFS-16317 > URL: https://issues.apache.org/jira/browse/HDFS-16317 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 3.2.2 >Reporter: Ananya Singh >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Our security tool raised the following security flaws on Hadoop 3.2.2: > CVE-2015-9251: https://nvd.nist.gov/vuln/detail/CVE-2015-9251 > CVE-2019-11358: https://nvd.nist.gov/vuln/detail/CVE-2019-11358 > CVE-2020-11022: https://nvd.nist.gov/vuln/detail/CVE-2020-11022 > CVE-2020-11023: https://nvd.nist.gov/vuln/detail/CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16333) fix balancer bug when transfer an EC block
[ https://issues.apache.org/jira/browse/HDFS-16333?focusedWorklogId=687958&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687958 ] ASF GitHub Bot logged work on HDFS-16333: - Author: ASF GitHub Bot Created on: 30/Nov/21 11:49 Start Date: 30/Nov/21 11:49 Worklog Time Spent: 10m Work Description: liubingxing commented on pull request #3679: URL: https://github.com/apache/hadoop/pull/3679#issuecomment-982562249 @hemanthboyina Please take a look at this and give some advice. Thanks a lot -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 687958) Time Spent: 1h 20m (was: 1h 10m) > fix balancer bug when transfer an EC block > -- > > Key: HDFS-16333 > URL: https://issues.apache.org/jira/browse/HDFS-16333 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Reporter: qinyuren >Assignee: qinyuren >Priority: Major > Labels: pull-request-available > Attachments: image-2021-11-18-17-25-13-089.png, > image-2021-11-18-17-25-50-556.png, image-2021-11-18-17-28-03-155.png > > Time Spent: 1h 20m > Remaining Estimate: 0h > > We set the EC policy to (6+3) and we also had nodes that were > decommissioning when we executed the balancer. > With the balancer running, we found many error logs as follows: > !image-2021-11-18-17-25-13-089.png|width=858,height=135! > Node A wants to transfer an EC block to node B, but we found that the block > is not on node A. The FSCK command shows the block status as follows: > !image-2021-11-18-17-25-50-556.png|width=607,height=189! > In the Dispatcher.getBlockList function: > !image-2021-11-18-17-28-03-155.png! 
> > Assume that the locations of an EC block in storageGroupMap look like this: > indices: [0, 1, 2, 3, 4, 5, 6, 7, 8] > nodes: [a, b, c, d, e, f, g, h, i] > After a decommission operation, the internal block at indices[1] is decommissioned to another node: > indices: [0, 1, 2, 3, 4, 5, 6, 7, 8] > nodes: [a, j, c, d, e, f, g, h, i] > That is, the location for indices[1] changes from node b to node j. > When the balancer gets the block locations, it checks them against the locations in storageGroupMap. If a node is not found in storageGroupMap, it is not added to the block's locations. In this case, node j is not added to the block locations, but the indices are not updated either. The block may end up looking like this: > indices: [0, 1, 2, 3, 4, 5, 6, 7, 8] > block.location: [a, c, d, e, f, g, h, i] > so the node locations no longer match their indices. > Solution: update the indices so that they match the nodes: > indices: [0, 2, 3, 4, 5, 6, 7, 8] > block.location: [a, c, d, e, f, g, h, i] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
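The proposed fix can be sketched in a few lines. This is a simplified, hypothetical Python model, not the actual Dispatcher code: given a block's indices and the reported node per index, drop the (index, node) pairs whose node the balancer does not recognize, so the indices and locations stay aligned.

```python
def align_indices(indices, nodes, known_nodes):
    """Keep only (index, node) pairs whose node is known to the balancer,
    so indices and locations remain matched (simplified model of the fix)."""
    kept = [(i, n) for i, n in zip(indices, nodes) if n in known_nodes]
    new_indices = [i for i, _ in kept]
    new_locations = [n for _, n in kept]
    return new_indices, new_locations

# Node at indices[1] was decommissioned to "j", which is absent
# from storageGroupMap, so its index must be dropped as well.
indices = [0, 1, 2, 3, 4, 5, 6, 7, 8]
nodes = ["a", "j", "c", "d", "e", "f", "g", "h", "i"]
known = {"a", "c", "d", "e", "f", "g", "h", "i"}
new_indices, new_locations = align_indices(indices, nodes, known)
# new_indices   == [0, 2, 3, 4, 5, 6, 7, 8]
# new_locations == ["a", "c", "d", "e", "f", "g", "h", "i"]
```

Filtering the two lists in step avoids the mismatch where an index refers to a different internal block than the location stored at the same position.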
[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451069#comment-17451069 ] Rakesh Radhakrishnan commented on HDFS-16014: - [~PhiloHe] can you please re-run the build by attaching a new patch and get the latest QA report. Thanks! +1, the patch looks good to me > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking the native pmdk lib. The > expected behavior is to display a hint to the user regarding the pmdk lib's loaded state. > Recently, it was found that the pmdk lib was not actually loaded successfully, but > the `hadoop checknative` command still told the user that it was. This issue can > be reproduced by moving libpmem.so* from the specified install path to another > place, or by directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16324) fix error log in BlockManagerSafeMode
[ https://issues.apache.org/jira/browse/HDFS-16324?focusedWorklogId=687868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-687868 ] ASF GitHub Bot logged work on HDFS-16324: - Author: ASF GitHub Bot Created on: 30/Nov/21 09:17 Start Date: 30/Nov/21 09:17 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3661: URL: https://github.com/apache/hadoop/pull/3661#issuecomment-982437038 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 37s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 19m 49s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 25m 29s | | trunk passed | | +1 :green_heart: | compile | 5m 24s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 5m 16s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 13s | | trunk passed | | +1 :green_heart: | mvnsite | 2m 30s | | trunk passed | | +1 :green_heart: | javadoc | 1m 47s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 2m 10s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 6m 37s | | trunk passed | | +1 :green_heart: | shadedclient | 22m 14s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 15s | | the patch passed | | +1 :green_heart: | compile | 5m 46s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 5m 46s | | the patch passed | | +1 :green_heart: | compile | 4m 58s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 4m 58s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 1m 3s | [/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3661/7/artifact/out/results-checkstyle-hadoop-hdfs-project.txt) | hadoop-hdfs-project: The patch generated 133 new + 101 unchanged - 0 fixed = 234 total (was 101) | | +1 :green_heart: | mvnsite | 2m 3s | | the patch passed | | +1 :green_heart: | javadoc | 1m 25s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 59s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 5m 56s | | the patch passed | | +1 :green_heart: | shadedclient | 22m 25s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 2m 26s | | hadoop-hdfs-client in the patch passed. | | +1 :green_heart: | unit | 228m 10s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 47s | | The patch does not generate ASF License warnings. 
| | | | 370m 29s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3661/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3661 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 0df3d7815963 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / ad6bebf3f2624846cac32dd4c04d479e73e75761 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |