[jira] [Work logged] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?focusedWorklogId=628885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628885 ] ASF GitHub Bot logged work on HDFS-14529: - Author: ASF GitHub Bot Created on: 28/Jul/21 05:56 Start Date: 28/Jul/21 05:56 Worklog Time Spent: 10m Work Description: virajjasani commented on a change in pull request #3243: URL: https://github.com/apache/hadoop/pull/3243#discussion_r677992894 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirAttrOp.java ## @@ -497,10 +492,14 @@ private static void setDirStoragePolicy( static boolean unprotectedSetTimes( FSDirectory fsd, INodesInPath iip, long mtime, long atime, boolean force) - throws QuotaExceededException { + throws QuotaExceededException, FileNotFoundException { Review comment: nit: looks like throwing `QuotaExceededException` is redundant here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628885) Time Spent: 20m (was: 10m) > NPE while Loading the Editlogs > -- > > Key: HDFS-14529 > URL: https://issues.apache.org/jira/browse/HDFS-14529 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Wei-Chiu Chuang >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception > on operation TimesOp [length=0, > path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, > atime=1559294343288, opCode=OP_TIMES, txid=18927893] > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat} -- This 
message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
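The NullPointerException in the stack trace above comes from unprotectedSetTimes dereferencing an inode that no longer exists when the TimesOp is replayed, and the review on PR #3243 changes the method to declare FileNotFoundException. A minimal, self-contained sketch of the guard implied by that change (class name, helper, and message text are illustrative stand-ins, not the actual FSDirAttrOp internals):

```java
import java.io.FileNotFoundException;

// Illustrative sketch of the guard discussed on PR #3243: if edit-log replay
// resolves a TimesOp path to a missing inode, fail with a descriptive
// FileNotFoundException instead of a NullPointerException. All names here are
// hypothetical stand-ins for the real FSDirAttrOp code.
public class SetTimesGuard {

    // Stand-in for resolving the last inode of the path; null simulates a
    // file that was removed between the image and this edit-log entry.
    static Object resolveLastInode(String path) {
        return null;
    }

    static boolean unprotectedSetTimes(String path, long mtime, long atime)
            throws FileNotFoundException {
        Object inode = resolveLastInode(path);
        if (inode == null) {
            // Previously this fell through and dereferenced null during replay.
            throw new FileNotFoundException(
                "File/Directory " + path + " does not exist.");
        }
        // ... update mtime/atime on the inode here ...
        return true;
    }

    public static void main(String[] args) {
        try {
            unprotectedSetTimes("/testLoadSpace/dir0/_file", -1L, 1559294343288L);
        } catch (FileNotFoundException e) {
            System.out.println("Replay fails cleanly: " + e.getMessage());
        }
    }
}
```

With the guard in place, a stale TimesOp surfaces as a descriptive error the loader can report, rather than an NPE that aborts edit-log loading.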
[jira] [Work started] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-16138 started by Renukaprasad C. - > BlockReportProcessingThread exit doesn't print the actual stack > -- > > Key: HDFS-16138 > URL: https://issues.apache.org/jira/browse/HDFS-16138 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Renukaprasad C > Assignee: Renukaprasad C > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The BlockReportProcessingThread may exit for multiple reasons, but the current logging prints only the exception message, with a different stack trace, which makes the issue difficult to debug. > > Existing logging: > 2021-07-20 10:20:23,104 [Block report processor] INFO util.ExitUtil > (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report > processor encountered fatal exception: java.lang.AssertionError > 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil > (ExitUtil.java:terminate(213)) - Terminate called > 1: Block report processor encountered fatal exception: > java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > Exception in thread "Block report processor" 1: Block report processor > encountered fatal exception: java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > > Actual issue found at: > 2021-07-20 10:20:23,101 [Block report processor] ERROR > blockmanagement.BlockManager (BlockManager.java:run(5314)) - > java.lang.AssertionError > java.lang.AssertionError > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312) > > This issue was found while working on the FGL branch, but the same issue can happen on trunk in any error scenario. > > [~hemanthboyina] [~hexiaoqiao]
[jira] [Commented] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388439#comment-17388439 ] Renukaprasad C commented on HDFS-16138: --- org.apache.hadoop.util.ExitUtil#terminate(int, java.lang.String) creates a new exception that includes the exception message but misses the actual stack trace. Now adding the full stack; logging continues as before, based on the other parameters. [~hexiaoqiao] [~hemanthboyina] can you please take a look whenever you get time? > BlockReportProcessingThread exit doesn't print the actual stack > -- > > Key: HDFS-16138 > URL: https://issues.apache.org/jira/browse/HDFS-16138 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Renukaprasad C > Assignee: Renukaprasad C > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h
[jira] [Resolved] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDFS-16145. -- Fix Version/s: 3.3.2 Resolution: Fixed > CopyListing fails with FNF exception with snapshot diff > --- > > Key: HDFS-16145 > URL: https://issues.apache.org/jira/browse/HDFS-16145 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp > Reporter: Shashikant Banerjee > Assignee: Shashikant Banerjee > Priority: Major > Labels: pull-request-available > Fix For: 3.3.2 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > DistCp with snapshot diff and filters marks a rename as a delete operation on the target if the rename target is a directory excluded by the filter. But files and subdirectories created or modified after the old snapshot and prior to the rename will still be present as modified/created entries in the final copy list. Since the parent directory is marked for deletion, these subsequent create/modify entries should be ignored while building the final copy list. > Otherwise, when the final copy list is built, DistCp tries to look up each created/modified file in the newer snapshot, which fails because the parent directory has already been moved to a new location in the later snapshot. 
> > {code:java} > sudo -u kms hadoop key create testkey > hadoop fs -mkdir -p /data/gcgdlknnasg/ > hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/ > hadoop fs -mkdir -p /dest/gcgdlknnasg > hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg > hdfs dfs -mkdir /data/gcgdlknnasg/dir1 > hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ > hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/ > drwxrwxrwt - hdfs supergroup 0 2021-07-16 14:05 > /data/gcgdlknnasg/.Trash > drwxr-xr-x - hdfs supergroup 0 2021-07-16 13:07 > /data/gcgdlknnasg/dir1 > [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# > hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/ > hdfs dfs -rm -r /data/gcgdlknnasg/dir1/ > hdfs dfs -mkdir /data/gcgdlknnasg/dir1/ > ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the > replication schedule. You get into below error and failure of the BDR job. > 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - > java.io.FileNotFoundException: File does not exist: > /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487) > …….. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
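The fix described above amounts to dropping create/modify diff entries whose ancestor directory has already been marked for deletion on the target. A hypothetical sketch of that pruning step (not the actual DistCp CopyListing code; all names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the pruning described in HDFS-16145: when a rename
// was downgraded to a delete on the target (because the filter excludes the
// rename destination), any created/modified entry under that deleted parent
// must be skipped while building the final copy list, since looking it up in
// the newer snapshot would throw FileNotFoundException.
public class CopyListPruner {

    static boolean underDeletedParent(String path, Set<String> deletedDirs) {
        for (String dir : deletedDirs) {
            String prefix = dir.endsWith("/") ? dir : dir + "/";
            if (path.equals(dir) || path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    static List<String> buildFinalCopyList(List<String> createModifyEntries,
                                           Set<String> deletedDirs) {
        List<String> copyList = new ArrayList<>();
        for (String entry : createModifyEntries) {
            // Skip entries whose parent is already scheduled for deletion.
            if (!underDeletedParent(entry, deletedDirs)) {
                copyList.add(entry);
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        System.out.println(buildFinalCopyList(
            Arrays.asList("/data/gcgdlknnasg/dir1/hosts", "/data/gcgdlknnasg/other"),
            Collections.singleton("/data/gcgdlknnasg/dir1")));
    }
}
```

In the repro above, /data/gcgdlknnasg/dir1 is the deleted parent, so an entry like dir1/hosts would be pruned and the snapshot lookup that threw FileNotFoundException would never be attempted.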
[jira] [Work logged] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?focusedWorklogId=628870&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628870 ] ASF GitHub Bot logged work on HDFS-16138: - Author: ASF GitHub Bot Created on: 28/Jul/21 05:01 Start Date: 28/Jul/21 05:01 Worklog Time Spent: 10m Work Description: jojochuang commented on a change in pull request #3244: URL: https://github.com/apache/hadoop/pull/3244#discussion_r677973915 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java ## @@ -5190,7 +5191,7 @@ public void run() { processQueue(); } catch (Throwable t) { ExitUtil.terminate(1, -getName() + " encountered fatal exception: " + t); +getName() + " encountered fatal exception: " + ExceptionUtils.getStackTrace(t)); Review comment: looking at other usage in the Hadoop codebase, it seems to me a better way to print the stack trace is to use the logger before exiting through ExitUtil.terminate(). https://github.com/apache/hadoop/blob/d0dcfc405c624f73ed1af9527bbf456a10337a6d/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceMaster.java#L353 
Issue Time Tracking --- Worklog Id: (was: 628870) Time Spent: 0.5h (was: 20m)
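The patch discussed above appends Commons Lang's ExceptionUtils.getStackTrace(t) to the terminate message, so the exit log shows where the fatal throwable originated rather than only t.toString(); the reviewer's alternative is to log the throwable before calling ExitUtil.terminate. A stand-alone sketch of what embedding the stack trace buys, using only the JDK (no Hadoop or Commons Lang dependency; the helper mirrors what ExceptionUtils.getStackTrace returns):

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch of the behavior the patch adds: render the full stack trace of the
// fatal throwable into the terminate message. getStackTrace() reproduces what
// Commons Lang's ExceptionUtils.getStackTrace does with plain JDK classes.
public class FatalExitMessage {

    static String getStackTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        // printStackTrace writes the throwable's class, message, and frames.
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    static String buildTerminateMessage(String threadName, Throwable t) {
        return threadName + " encountered fatal exception: " + getStackTrace(t);
    }

    public static void main(String[] args) {
        Throwable t = new AssertionError("simulated block-report failure");
        String msg = buildTerminateMessage("Block report processor", t);
        // The message now carries the originating frames, e.g. this main(),
        // instead of just "java.lang.AssertionError: ...".
        System.out.println(msg.contains("FatalExitMessage.main"));
    }
}
```

Either variant, message embedding or logging the throwable first, preserves the frames; the logger approach keeps ExitUtil's message short while the stack lands in the regular log.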
[jira] [Work logged] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?focusedWorklogId=628869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628869 ] ASF GitHub Bot logged work on HDFS-16138: - Author: ASF GitHub Bot Created on: 28/Jul/21 05:00 Start Date: 28/Jul/21 05:00 Worklog Time Spent: 10m Work Description: jojochuang commented on a change in pull request #3244: URL: https://github.com/apache/hadoop/pull/3244#discussion_r677973915 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java ## @@ -5190,7 +5191,7 @@ public void run() { processQueue(); } catch (Throwable t) { ExitUtil.terminate(1, -getName() + " encountered fatal exception: " + t); +getName() + " encountered fatal exception: " + ExceptionUtils.getStackTrace(t)); Review comment: looking at other usage in the Hadoop codebase, it seems to me a better way to print the stack trace is to use the logger. https://github.com/apache/hadoop/blob/d0dcfc405c624f73ed1af9527bbf456a10337a6d/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceMaster.java#L353 
Issue Time Tracking --- Worklog Id: (was: 628869) Time Spent: 20m (was: 10m)
[jira] [Assigned] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reassigned HDFS-14529: --- Assignee: Wei-Chiu Chuang (was: Ayush Saxena)
[jira] [Work logged] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?focusedWorklogId=628867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628867 ] ASF GitHub Bot logged work on HDFS-16111: - Author: ASF GitHub Bot Created on: 28/Jul/21 04:59 Start Date: 28/Jul/21 04:59 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3175: URL: https://github.com/apache/hadoop/pull/3175#issuecomment-888010597 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 25s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 38m 33s | | trunk passed | | +1 :green_heart: | compile | 1m 53s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 47s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 20s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 48s | | trunk passed | | +1 :green_heart: | javadoc | 1m 20s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 52s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 58s | | trunk passed | | +1 :green_heart: | shadedclient | 22m 18s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 38s | | the patch passed | | +1 :green_heart: | compile | 1m 45s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 45s | | the patch passed | | +1 :green_heart: | compile | 1m 34s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 34s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 1m 4s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 40s | | the patch passed | | +1 :green_heart: | xml | 0m 2s | | The patch has no ill-formed XML file. | | +1 :green_heart: | javadoc | 1m 5s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 45s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 4m 11s | | the patch passed | | +1 :green_heart: | shadedclient | 22m 58s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 418m 8s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3175/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 42s | | The patch does not generate ASF License warnings. 
| | | | 529m 45s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3175/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3175 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell xml | | uname | Linux 6fa4feae10d8 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 24a19b0d09670448c2e1326ee4d743cb8d84cba1 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions |
[jira] [Work logged] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?focusedWorklogId=628868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628868 ] ASF GitHub Bot logged work on HDFS-16145: - Author: ASF GitHub Bot Created on: 28/Jul/21 04:59 Start Date: 28/Jul/21 04:59 Worklog Time Spent: 10m Work Description: mukul1987 merged pull request #3234: URL: https://github.com/apache/hadoop/pull/3234 Issue Time Tracking --- Worklog Id: (was: 628868) Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?focusedWorklogId=628866=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628866 ] ASF GitHub Bot logged work on HDFS-16138: - Author: ASF GitHub Bot Created on: 28/Jul/21 04:52 Start Date: 28/Jul/21 04:52 Worklog Time Spent: 10m Work Description: prasad-acit opened a new pull request #3244: URL: https://github.com/apache/hadoop/pull/3244 HDFS-16138. BlockReportProcessingThread exit doesn't print the actual stack -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628866) Remaining Estimate: 0h Time Spent: 10m > BlockReportProcessingThread exit doesn't print the actual stack > -- > > Key: HDFS-16138 > URL: https://issues.apache.org/jira/browse/HDFS-16138 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The BlockReportProcessingThread may exit for multiple reasons, but > the current logging prints only the exception message with a different stack, > which makes the issue difficult to debug. 
> > Existing logging: > 2021-07-20 10:20:23,104 [Block report processor] INFO util.ExitUtil > (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report > processor encountered fatal exception: java.lang.AssertionError > 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil > (ExitUtil.java:terminate(213)) - Terminate called > 1: Block report processor encountered fatal exception: > java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > Exception in thread "Block report processor" 1: Block report processor > encountered fatal exception: java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > > Actual issue found at: > 2021-07-20 10:20:23,101 [Block report processor] ERROR > blockmanagement.BlockManager (BlockManager.java:run(5314)) - > java.lang.AssertionError > java.lang.AssertionError > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657) > at > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312) > > This issue was found while working on the FGL branch, but the same issue can > happen on trunk in any error scenario. > > [~hemanthboyina] [~hexiaoqiao] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
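The logging improvement that HDFS-16138 asks for can be sketched as follows. This is a simplified stand-in (the class `StackPreservingExit` and its methods are invented for illustration), not the actual `BlockManager`/`ExitUtil` code: the point is that the caught `Throwable` itself must reach the terminate/log path, so the frame where the error actually occurred (e.g. `addStoredBlock`) is printed, not just the frame of the terminate call.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch: preserve the original failure's stack trace when exiting a worker
// thread, instead of logging only the exception message.
public class StackPreservingExit {

    // Render the full stack of the original Throwable as a string.
    static String describe(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    static void terminate(int status, Throwable t) {
        // Passing the Throwable through (rather than t.toString()) keeps the
        // frames where the error actually occurred, which is what makes the
        // resulting log debuggable.
        System.err.println("Exiting with status " + status + ": " + describe(t));
        // System.exit(status); // disabled in this sketch so it stays runnable
    }

    public static void main(String[] args) {
        try {
            throw new AssertionError("simulated block-report failure");
        } catch (AssertionError e) {
            terminate(1, e);
        }
    }
}
```

The same idea applies to SLF4J-style loggers: `LOG.error("Block report processor failed", t)` prints the original stack, while `LOG.error("... " + t)` prints only the message.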
[jira] [Updated] (HDFS-16138) BlockReportProcessingThread exit doesn't print the actual stack
[ https://issues.apache.org/jira/browse/HDFS-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16138: -- Labels: pull-request-available (was: ) > BlockReportProcessingThread exit doesnt print the acutal stack > -- > > Key: HDFS-16138 > URL: https://issues.apache.org/jira/browse/HDFS-16138 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > BlockReportProcessingThread thread may gets exited with multiple reasons, but > the current logging prints only the exception message with different stack > which is difficult to debug the issue. > > Existing logging: > 2021-07-20 10:20:23,104 [Block report processor] INFO util.ExitUtil > (ExitUtil.java:terminate(210)) - Exiting with status 1: Block report > processor encountered fatal exception: java.lang.AssertionError > 2021-07-20 10:20:23,104 [Block report processor] ERROR util.ExitUtil > (ExitUtil.java:terminate(213)) - Terminate called > 1: Block report processor encountered fatal exception: > java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > Exception in thread "Block report processor" 1: Block report processor > encountered fatal exception: java.lang.AssertionError > at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:304) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5315) > > Actual issue found at: > 2021-07-20 10:20:23,101 [Block report processor] ERROR > blockmanagement.BlockManager (BlockManager.java:run(5314)) - > java.lang.AssertionError > java.lang.AssertionError > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3480) > at > 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:4280) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4202) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4338) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:4305) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:4853) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$2.run(NameNodeRpcServer.java:1657) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.processQueue(BlockManager.java:5334) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5312) > > This issue found while working on FGL branch. But, same issue can happen in > Trunk also in any error scenario. > > [~hemanthboyina] [~hexiaoqiao] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihai Xu updated HDFS-16111: - Description: When we upgraded our hadoop cluster from hadoop 2.6.0 to hadoop 3.2.2, we got failed volumes on a lot of datanodes, which caused some missing blocks at that time. Although later on we recovered all the missing blocks by symlinking the path (dfs/dn/current) on the failed volume to a new directory and copying all the data to the new directory, we missed our SLA and it delayed our upgrade process on our production cluster for several hours. When this issue happened, we saw a lot of these exceptions before the volume failed on the datanode: [DataXceiver for client at /[XX.XX.XX.XX:XXX|http://10.104.103.159:33986/] [Receiving block BP-XX-XX.XX.XX.XX-XX:blk_X_XXX]] datanode.DataNode (BlockReceiver.java:(289)) - IOException in BlockReceiver constructor :Possible disk error: Failed to create /XXX/dfs/dn/current/BP-XX-XX.XX.XX.XX-X/tmp/blk_XX. 
Cause is java.io.IOException: No space left on device at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:1012) at org.apache.hadoop.hdfs.server.datanode.FileIoProvider.createFile(FileIoProvider.java:302) at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createFileWithExistsCheck(DatanodeUtil.java:69) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createTmpFile(BlockPoolSlice.java:292) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTmpFile(FsVolumeImpl.java:532) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTemporary(FsVolumeImpl.java:1254) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1598) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:212) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1314) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:768) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:291) at java.lang.Thread.run(Thread.java:748) We found this issue happened due to the following two reasons: First the upgrade process added some extra disk storage on the each disk volume of the data node: BlockPoolSliceStorage.doUpgrade ([https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java#L445]) is the main upgrade function in the datanode, it will add some extra storage. The extra storage added is all new directories created in /current//current, although all block data file and block meta data file are hard-linked with /current//previous after upgrade. 
Since there will be a lot of new directories created, this will use some disk space on each disk volume. Second, there is a potential bug when picking a disk volume to write a new block file (replica). By default, Hadoop uses RoundRobinVolumeChoosingPolicy. The code to select a disk checks whether the available space on the selected disk is more than the size in bytes of the block file to store ([https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/RoundRobinVolumeChoosingPolicy.java#L86]) But when creating a new block, there are two files created: one is the block file blk_, the other is the block metadata file blk__.meta. This is the code when finalizing a block, where both the block file size and the metadata file size are updated: [https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java#L391] The current code only considers the size of the block file and doesn't consider the size of the block metadata file when choosing a disk in RoundRobinVolumeChoosingPolicy. There can be a lot of on-going blocks received at the same time; the default maximum number of DataXceiver threads is 4096. This underestimates the total size needed to write a block, which can potentially cause the above disk-full error (No space left on device). Since the size of the block metadata file is not fixed, I suggest adding a configuration (dfs.datanode.round-robin-volume-choosing-policy.additional-available-space) to safeguard the disk space when choosing a volume to write new block data in RoundRobinVolumeChoosingPolicy.
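The safeguard proposed in HDFS-16111 above can be sketched as follows. This is a minimal, self-contained model, not the actual Hadoop `RoundRobinVolumeChoosingPolicy`: the `Volume` class and `RoundRobinSketch` names are invented for the sketch, and the headroom field stands in for the proposed `dfs.datanode.round-robin-volume-choosing-policy.additional-available-space` configuration.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of round-robin volume choice with an additional-available-space
// safeguard: a volume qualifies only if it can hold the block bytes PLUS a
// configured headroom (covering the .meta file and concurrent writers).
public class RoundRobinSketch {

    static class Volume {
        final String name;
        final long available; // free bytes on this volume
        Volume(String name, long available) {
            this.name = name;
            this.available = available;
        }
    }

    private int curVolume = 0;
    // Stand-in for the proposed configuration value, in bytes.
    private final long additionalAvailableSpace;

    RoundRobinSketch(long additionalAvailableSpace) {
        this.additionalAvailableSpace = additionalAvailableSpace;
    }

    Volume chooseVolume(List<Volume> volumes, long blockSize) {
        int start = curVolume;
        while (true) {
            Volume v = volumes.get(curVolume);
            curVolume = (curVolume + 1) % volumes.size();
            // Require room for the block plus headroom, not just the block
            // bytes themselves: this is the proposed fix.
            if (v.available >= blockSize + additionalAvailableSpace) {
                return v;
            }
            if (curVolume == start) { // looped over every volume
                throw new IllegalStateException("Out of space: no volume has "
                    + (blockSize + additionalAvailableSpace) + " bytes free");
            }
        }
    }

    public static void main(String[] args) {
        RoundRobinSketch policy = new RoundRobinSketch(4L * 1024 * 1024);
        Volume nearlyFull = new Volume("disk1", 1024);
        Volume roomy = new Volume("disk2", 64L * 1024 * 1024);
        Volume chosen = policy.chooseVolume(Arrays.asList(nearlyFull, roomy), 1024);
        System.out.println("chose " + chosen.name); // disk2
    }
}
```

With headroom set to 0 this degenerates to the current behavior the report describes, where a volume can be chosen even though the block plus its metadata file will not fit.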
[jira] [Commented] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388406#comment-17388406 ] Wei-Chiu Chuang commented on HDFS-14529: I posted PR #3243; let me know what you think. The PR does not resolve the issue; it merely makes the exception more graceful (FileNotFoundException instead of NPE). By the time the bad edit log has been written and loaded, it is already too late. The NameNode can work around the exception by supplying the startup option -recover. > NPE while Loading the Editlogs > -- > > Key: HDFS-14529 > URL: https://issues.apache.org/jira/browse/HDFS-14529 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {noformat} > 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception > on operation TimesOp [length=0, > path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, > atime=1559294343288, opCode=OP_TIMES, txid=18927893] > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
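The hardening described in the HDFS-14529 comment above can be sketched as follows. This is a simplified stand-in for `FSDirAttrOp.unprotectedSetTimes` (the `INode` class and the method shape here are invented for illustration, not the real Hadoop signatures): when the inode for a TimesOp path cannot be resolved during edit-log replay, a descriptive FileNotFoundException is raised instead of dereferencing null.

```java
import java.io.FileNotFoundException;

// Sketch: replace a bare NullPointerException during edit-log replay with a
// FileNotFoundException that names the offending path.
public class SetTimesSketch {

    // Minimal stand-in for an HDFS inode's time attributes.
    static class INode {
        long mtime = -1;
        long atime = -1;
    }

    static boolean unprotectedSetTimes(INode inode, String path,
                                       long mtime, long atime, boolean force)
            throws FileNotFoundException {
        if (inode == null) {
            // Previously the code fell through to a field access on a null
            // inode and threw an NPE with no path information.
            throw new FileNotFoundException(
                "File/Directory " + path + " does not exist.");
        }
        boolean status = false;
        if (mtime != -1) {
            inode.mtime = mtime;
            status = true;
        }
        // Only move atime forward unless the caller forces the update.
        if (atime != -1 && (force || atime > inode.atime)) {
            inode.atime = atime;
            status = true;
        }
        return status;
    }

    public static void main(String[] args) throws FileNotFoundException {
        INode inode = new INode();
        // Updating an existing inode succeeds and reports a change.
        System.out.println(unprotectedSetTimes(inode, "/some/file", 1000L, -1L, false));
    }
}
```

The benefit during replay is purely diagnostic: the loader's "Encountered exception on operation TimesOp" error now carries the missing path rather than an opaque NullPointerException, and the operator can proceed with `-recover` as noted above.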
[jira] [Work logged] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?focusedWorklogId=628835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628835 ] ASF GitHub Bot logged work on HDFS-14529: - Author: ASF GitHub Bot Created on: 28/Jul/21 02:37 Start Date: 28/Jul/21 02:37 Worklog Time Spent: 10m Work Description: jojochuang opened a new pull request #3243: URL: https://github.com/apache/hadoop/pull/3243 Throw FileNotFoundException instead of NPE if inode is not found. As explained in the corresponding jira, there are two possibilities for this to happen. This PR does not resolve the culprit. I just merely make the exception more graceful. If this problem is ever hit during edit log loading, NameNode can work around by applying the -recover option. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628835) Remaining Estimate: 0h Time Spent: 10m > NPE while Loading the Editlogs > -- > > Key: HDFS-14529 > URL: https://issues.apache.org/jira/browse/HDFS-14529 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > {noformat} > 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception > on operation TimesOp [length=0, > path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, > atime=1559294343288, opCode=OP_TIMES, txid=18927893] > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711) > at > 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14529) NPE while Loading the Editlogs
[ https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-14529: -- Labels: pull-request-available (was: ) > NPE while Loading the Editlogs > -- > > Key: HDFS-14529 > URL: https://issues.apache.org/jira/browse/HDFS-14529 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {noformat} > 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception > on operation TimesOp [length=0, > path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, > atime=1559294343288, opCode=OP_TIMES, txid=18927893] > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388404#comment-17388404 ] Zhihai Xu commented on HDFS-16111: -- Thanks [~weichiu] for the review and committing the patch! Thanks [~ywskycn] for the review! > Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes > at datanodes. > --- > > Key: HDFS-16111 > URL: https://issues.apache.org/jira/browse/HDFS-16111 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Zhihai Xu >Assignee: Zhihai Xu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When we upgraded our hadoop cluster from hadoop 2.6.0 to hadoop 3.2.2, we got > failed volume on a lot of datanodes, which cause some missing blocks at that > time. Although later on we recovered all the missing blocks by symlinking the > path (dfs/dn/current) on the failed volume to a new directory and copying all > the data to the new directory, we missed our SLA and it delayed our upgrading > process on our production cluster for several hours. > When this issue happened, we saw a lot of this exceptions happened before the > volumed failed on the datanode: > [DataXceiver for client at /[XX.XX.XX.XX:XXX|http://10.104.103.159:33986/] > [Receiving block BP-XX-XX.XX.XX.XX-XX:blk_X_XXX]] > datanode.DataNode (BlockReceiver.java:(289)) - IOException in > BlockReceiver constructor :Possible disk error: Failed to create > /XXX/dfs/dn/current/BP-XX-XX.XX.XX.XX-X/tmp/blk_XX. 
Cause > is > java.io.IOException: No space left on device > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.FileIoProvider.createFile(FileIoProvider.java:302) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createFileWithExistsCheck(DatanodeUtil.java:69) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createTmpFile(BlockPoolSlice.java:292) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTmpFile(FsVolumeImpl.java:532) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTemporary(FsVolumeImpl.java:1254) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1598) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:212) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1314) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:768) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:291) > at java.lang.Thread.run(Thread.java:748) > > We found this issue happened due to the following two reasons: > First the upgrade process added some extra disk storage on the each disk > volume of the data node: > BlockPoolSliceStorage.doUpgrade > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java#L445) > is the main upgrade function in the datanode, it will add some extra > storage. 
The extra storage added is all new directories created in > /current//current, although all block data file and block meta data > file are hard-linked with /current//previous after upgrade. Since there > will be a lot of new directories created, this will use some disk space on > each disk volume. > > Second there is a potential bug when picking a disk volume to write a new > block file(replica). By default, Hadoop uses RoundRobinVolumeChoosingPolicy, > The code to select a disk will check whether the available space on the > selected disk is more than the size bytes of block file to store > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/RoundRobinVolumeChoosingPolicy.java#L86) > But when creating a new block, there will be two files created: one is the > block file blk_, the other is block metadata file blk__.meta, > this is the code when finalizing a block, both block file size and meta data > file size will be updated: >
[jira] [Resolved] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-16111. Fix Version/s: 3.4.0 Resolution: Fixed Thanks! > Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes > at datanodes. > --- > > Key: HDFS-16111 > URL: https://issues.apache.org/jira/browse/HDFS-16111 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Zhihai Xu >Assignee: Zhihai Xu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When we upgraded our hadoop cluster from hadoop 2.6.0 to hadoop 3.2.2, we got > failed volume on a lot of datanodes, which cause some missing blocks at that > time. Although later on we recovered all the missing blocks by symlinking the > path (dfs/dn/current) on the failed volume to a new directory and copying all > the data to the new directory, we missed our SLA and it delayed our upgrading > process on our production cluster for several hours. > When this issue happened, we saw a lot of this exceptions happened before the > volumed failed on the datanode: > [DataXceiver for client at /[XX.XX.XX.XX:XXX|http://10.104.103.159:33986/] > [Receiving block BP-XX-XX.XX.XX.XX-XX:blk_X_XXX]] > datanode.DataNode (BlockReceiver.java:(289)) - IOException in > BlockReceiver constructor :Possible disk error: Failed to create > /XXX/dfs/dn/current/BP-XX-XX.XX.XX.XX-X/tmp/blk_XX. 
Cause > is > java.io.IOException: No space left on device > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.FileIoProvider.createFile(FileIoProvider.java:302) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createFileWithExistsCheck(DatanodeUtil.java:69) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createTmpFile(BlockPoolSlice.java:292) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTmpFile(FsVolumeImpl.java:532) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTemporary(FsVolumeImpl.java:1254) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1598) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:212) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1314) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:768) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:291) > at java.lang.Thread.run(Thread.java:748) > > We found this issue happened due to the following two reasons: > First the upgrade process added some extra disk storage on the each disk > volume of the data node: > BlockPoolSliceStorage.doUpgrade > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java#L445) > is the main upgrade function in the datanode, it will add some extra > storage. 
The extra storage added is all new directories created in > /current//current, although all block data file and block meta data > file are hard-linked with /current//previous after upgrade. Since there > will be a lot of new directories created, this will use some disk space on > each disk volume. > > Second there is a potential bug when picking a disk volume to write a new > block file(replica). By default, Hadoop uses RoundRobinVolumeChoosingPolicy, > The code to select a disk will check whether the available space on the > selected disk is more than the size bytes of block file to store > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/RoundRobinVolumeChoosingPolicy.java#L86) > But when creating a new block, there will be two files created: one is the > block file blk_, the other is block metadata file blk__.meta, > this is the code when finalizing a block, both block file size and meta data > file size will be updated: >
[jira] [Work logged] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?focusedWorklogId=628832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628832 ] ASF GitHub Bot logged work on HDFS-16111: - Author: ASF GitHub Bot Created on: 28/Jul/21 02:19 Start Date: 28/Jul/21 02:19 Worklog Time Spent: 10m Work Description: jojochuang merged pull request #3175: URL: https://github.com/apache/hadoop/pull/3175 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628832) Time Spent: 2h 40m (was: 2.5h) > Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes > at datanodes. > --- > > Key: HDFS-16111 > URL: https://issues.apache.org/jira/browse/HDFS-16111 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Zhihai Xu >Assignee: Zhihai Xu >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > When we upgraded our hadoop cluster from hadoop 2.6.0 to hadoop 3.2.2, we got > failed volume on a lot of datanodes, which cause some missing blocks at that > time. Although later on we recovered all the missing blocks by symlinking the > path (dfs/dn/current) on the failed volume to a new directory and copying all > the data to the new directory, we missed our SLA and it delayed our upgrading > process on our production cluster for several hours. 
> When this issue happened, we saw a lot of this exceptions happened before the > volumed failed on the datanode: > [DataXceiver for client at /[XX.XX.XX.XX:XXX|http://10.104.103.159:33986/] > [Receiving block BP-XX-XX.XX.XX.XX-XX:blk_X_XXX]] > datanode.DataNode (BlockReceiver.java:(289)) - IOException in > BlockReceiver constructor :Possible disk error: Failed to create > /XXX/dfs/dn/current/BP-XX-XX.XX.XX.XX-X/tmp/blk_XX. Cause > is > java.io.IOException: No space left on device > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.FileIoProvider.createFile(FileIoProvider.java:302) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createFileWithExistsCheck(DatanodeUtil.java:69) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createTmpFile(BlockPoolSlice.java:292) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTmpFile(FsVolumeImpl.java:532) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTemporary(FsVolumeImpl.java:1254) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1598) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:212) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1314) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:768) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:291) > at java.lang.Thread.run(Thread.java:748) > > We found this issue happened due to the following two reasons: > First the upgrade process added some extra disk storage on the each disk > volume of 
the data node: > BlockPoolSliceStorage.doUpgrade > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java#L445) > is the main upgrade function in the datanode; it adds some extra > storage usage. The extra storage usage is all the new directories created in > /current//current, although all block data files and block metadata > files are hard-linked into /current//previous after the upgrade. Since a lot > of new directories are created, this uses some disk space on > each disk volume. > > Second, there is a potential bug when picking a disk volume to write a new > block file (replica). By default, Hadoop uses RoundRobinVolumeChoosingPolicy. > The code to select a disk will check whether the
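The volume-selection behavior described above can be sketched as follows. This is a simplified, illustrative Java model of round-robin volume choosing with a reserved-space margin of the kind HDFS-16111 proposes; the class and method names are hypothetical, not Hadoop's actual RoundRobinVolumeChoosingPolicy.

```java
import java.util.List;

// Hypothetical sketch: pick volumes round-robin, but skip any volume whose
// free space does not cover the block size plus a configurable margin.
// Without the margin, a nearly-full volume can be chosen and fail mid-write.
class RoundRobinChooser {
    private int curVolume = 0;

    /**
     * Returns the index of the next volume whose free space covers
     * blockSize + reserved; throws if no volume qualifies.
     */
    int chooseVolume(List<Long> freeBytes, long blockSize, long reserved) {
        int start = curVolume;
        while (true) {
            int candidate = curVolume;
            curVolume = (curVolume + 1) % freeBytes.size();
            if (freeBytes.get(candidate) >= blockSize + reserved) {
                return candidate;
            }
            if (curVolume == start) {
                // Every volume was inspected once and none had enough room.
                throw new IllegalStateException("Out of space on all volumes");
            }
        }
    }
}
```

With `reserved = 0` this degrades to the plain availability check; a positive margin makes the chooser skip volumes that the upgrade's extra directory usage has pushed close to full.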
[jira] [Work logged] (HDFS-16137) Improve the comments related to FairCallQueue#queues
[ https://issues.apache.org/jira/browse/HDFS-16137?focusedWorklogId=628826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628826 ] ASF GitHub Bot logged work on HDFS-16137: - Author: ASF GitHub Bot Created on: 28/Jul/21 01:51 Start Date: 28/Jul/21 01:51 Worklog Time Spent: 10m Work Description: jianghuazhu removed a comment on pull request #3226: URL: https://github.com/apache/hadoop/pull/3226#issuecomment-887946612 @virajjasani, I submitted some new code; if possible, I need your help reviewing it. Thank you very much. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628826) Time Spent: 1h 40m (was: 1.5h) > Improve the comments related to FairCallQueue#queues > > > Key: HDFS-16137 > URL: https://issues.apache.org/jira/browse/HDFS-16137 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ipc >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > The comments on FairCallQueue#queues are too simple: >/* The queues */ >private final ArrayList<BlockingQueue<E>> queues; > One cannot readily see the meaning of FairCallQueue#queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
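The kind of comment improvement the issue asks for might look like the following sketch. The class, constructor, and comment wording here are hypothetical illustrations, not the committed Hadoop change; the point is documenting what the per-priority-level sub-queues are for.

```java
import java.util.ArrayList;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative stand-in for FairCallQueue, showing a richer field comment.
class FairCallQueueSketch<E> {
    /**
     * Sub-queues making up the fair call queue, one per priority level.
     * Index 0 holds the highest-priority calls; a multiplexer drains
     * lower indices more frequently than higher ones, so heavy users
     * scheduled into higher indices cannot starve everyone else.
     */
    private final ArrayList<BlockingQueue<E>> queues = new ArrayList<>();

    FairCallQueueSketch(int numLevels, int capacityPerLevel) {
        for (int i = 0; i < numLevels; i++) {
            queues.add(new LinkedBlockingQueue<>(capacityPerLevel));
        }
    }

    int numLevels() { return queues.size(); }
}
```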
[jira] [Work logged] (HDFS-16137) Improve the comments related to FairCallQueue#queues
[ https://issues.apache.org/jira/browse/HDFS-16137?focusedWorklogId=628823=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628823 ] ASF GitHub Bot logged work on HDFS-16137: - Author: ASF GitHub Bot Created on: 28/Jul/21 01:47 Start Date: 28/Jul/21 01:47 Worklog Time Spent: 10m Work Description: jianghuazhu commented on pull request #3226: URL: https://github.com/apache/hadoop/pull/3226#issuecomment-887946612 @virajjasani, I submitted some new code; if possible, I need your help reviewing it. Thank you very much. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628823) Time Spent: 1.5h (was: 1h 20m) > Improve the comments related to FairCallQueue#queues > > > Key: HDFS-16137 > URL: https://issues.apache.org/jira/browse/HDFS-16137 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ipc >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > The comments on FairCallQueue#queues are too simple: >/* The queues */ >private final ArrayList<BlockingQueue<E>> queues; > One cannot readily see the meaning of FairCallQueue#queues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628812=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628812 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 28/Jul/21 00:55 Start Date: 28/Jul/21 00:55 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887929587 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 46s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 33m 54s | | trunk passed | | +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 17s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 1s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 31s | | trunk passed | | +1 :green_heart: | javadoc | 0m 59s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 35s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 24s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 4s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 14s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 14s | | the patch passed | | +1 :green_heart: | compile | 1m 8s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 8s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 51s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 14s | | the patch passed | | +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 33s | | the patch passed | | +1 :green_heart: | shadedclient | 18m 54s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 457m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 47s | | The patch does not generate ASF License warnings. 
| | | | 550m 29s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestViewDistributedFileSystemContract | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand | | | hadoop.hdfs.TestHDFSFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/9/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux ed07561dd26b 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 3a731fa518ac7a3b242e9099c0945ddf76a1eb47 | | Default Java | Private
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628797 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 28/Jul/21 00:02 Start Date: 28/Jul/21 00:02 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887910880 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 16s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 40m 24s | | trunk passed | | +1 :green_heart: | compile | 1m 52s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 30s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 8s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 44s | | trunk passed | | +1 :green_heart: | javadoc | 1m 10s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 41s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 4m 13s | | trunk passed | | +1 :green_heart: | shadedclient | 21m 39s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 25s | | the patch passed | | +1 :green_heart: | compile | 1m 37s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 37s | | the patch passed | | +1 :green_heart: | compile | 1m 26s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 25s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 1m 8s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 41s | | the patch passed | | +1 :green_heart: | javadoc | 1m 7s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 45s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 4m 45s | | the patch passed | | +1 :green_heart: | shadedclient | 22m 23s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 385m 30s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | | The patch does not generate ASF License warnings. 
| | | | 496m 15s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestHDFSFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/10/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux cdeacbb7419c 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 3a731fa518ac7a3b242e9099c0945ddf76a1eb47 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/10/testReport/ | | Max. process+thread count | 1999 (vs.
[jira] [Commented] (HDFS-16144) Revert HDFS-15372 (Files in snapshots no longer see attribute provider permissions)
[ https://issues.apache.org/jira/browse/HDFS-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388361#comment-17388361 ] Hadoop QA commented on HDFS-16144: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 28s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 38s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 21m 19s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 3m 7s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 49s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/688/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 
new + 64 unchanged - 0 fixed = 66 total (was 64) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/688/artifact/out/whitespace-tabs.txt{color} | {color:red} The patch 4 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 9s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green}
[jira] [Commented] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog
[ https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388275#comment-17388275 ] Stephen O'Donnell commented on HDFS-15175: -- This is a tricky problem. If I understand correctly, the issue is that with async edit logging, some number of edits are pending in memory waiting to be flushed to disk. Those edit Ops hold a reference to a block object which is mutable and can change before they are flushed. If that block object changes, then the information in all the pending edits will change too. This goes further than just the block objects, as even the list of blocks for a given file is a mutable list and the edit op holds a reference to it. Therefore if the list of blocks changes, what is written to the edit log will change too. One solution (which is not really feasible) is to make the Block object immutable, so that a new instance is created whenever it is changed, but that would be difficult, since it is used in many places. It would be more efficient than deep copying every object for edits, but perhaps not in other parts of the NN. I don't see a better way than making a copy of the mutable data when it is stored into the pending edit op. 
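The aliasing problem described in the comment can be reproduced in miniature. The following is an illustrative sketch, not Hadoop code: a mutable "block" enqueued by reference is flushed with whatever state it has at flush time, while a defensive copy preserves the state it had at enqueue time (the block sizes reuse the numbers from this issue for flavor).

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Minimal mutable block: only the field that matters for the bug.
class Block {
    long numBytes;
    Block(long numBytes) { this.numBytes = numBytes; }
    Block copy() { return new Block(numBytes); }
}

// Toy async edit log: ops wait in a queue before being "flushed".
class AsyncEditLog {
    final Queue<Block> pending = new ArrayDeque<>();

    // Buggy variant: the queue aliases the caller's mutable object.
    void logByReference(Block b) { pending.add(b); }

    // Fixed variant: snapshot the mutable data at enqueue time.
    void logByCopy(Block b) { pending.add(b.copy()); }

    // "Flush" one pending op and return the block size it records.
    long flushOne() { return pending.remove().numBytes; }
}
```

Usage showing both behaviors: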
> Multiple CloseOp shared block instance causes the standby namenode to crash > when rolling editlog > > > Key: HDFS-15175 > URL: https://issues.apache.org/jira/browse/HDFS-15175 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.9.2 >Reporter: Yicong Cai >Assignee: Wan Chang >Priority: Critical > Labels: NameNode > Attachments: HDFS-15175-trunk.1.patch > > > > {panel:title=Crash exception} > 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log > tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp > [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, > atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], > permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, > clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, > txid=32625024993] > java.io.IOException: File is not under construction: .. > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479) > at > org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361) > {panel}
>
> {panel:title=Editlog}
> <RECORD>
>   <OPCODE>OP_REASSIGN_LEASE</OPCODE>
>   <DATA>
>     <TXID>32625021150</TXID>
>     <LEASEHOLDER>DFSClient_NONMAPREDUCE_-969060727_197760</LEASEHOLDER>
>     <PATH>..</PATH>
>     <NEWHOLDER>DFSClient_NONMAPREDUCE_1000868229_201260</NEWHOLDER>
>   </DATA>
> </RECORD>
> <RECORD>
>   <OPCODE>OP_CLOSE</OPCODE>
>   <DATA>
>     <TXID>32625023743</TXID>
>     <LENGTH>0</LENGTH>
>     <INODEID>0</INODEID>
>     <PATH>..</PATH>
>     <REPLICATION>3</REPLICATION>
>     <MTIME>1581816135883</MTIME>
>     <ATIME>1581814760398</ATIME>
>     <BLOCKSIZE>536870912</BLOCKSIZE>
>     <CLIENT_NAME></CLIENT_NAME>
>     <CLIENT_MACHINE></CLIENT_MACHINE>
>     <OVERWRITE>false</OVERWRITE>
>     <BLOCK>
>       <BLOCK_ID>5568434562</BLOCK_ID>
>       <NUM_BYTES>185818644</NUM_BYTES>
>       <GENSTAMP>4495417845</GENSTAMP>
>     </BLOCK>
>     <PERMISSION_STATUS>
>       <USERNAME>da_music</USERNAME>
>       <GROUPNAME>hdfs</GROUPNAME>
>       <MODE>416</MODE>
>     </PERMISSION_STATUS>
>   </DATA>
> </RECORD>
> <RECORD>
>   <OPCODE>OP_TRUNCATE</OPCODE>
>   <DATA>
>     <TXID>32625024049</TXID>
>     <SRC>..</SRC>
>     <CLIENTNAME>DFSClient_NONMAPREDUCE_1000868229_201260</CLIENTNAME>
>     <CLIENTMACHINE>..</CLIENTMACHINE>
>     <NEWLENGTH>185818644</NEWLENGTH>
>     <TIMESTAMP>1581816136336</TIMESTAMP>
>     <BLOCK>
>       <BLOCK_ID>5568434562</BLOCK_ID>
>       <NUM_BYTES>185818648</NUM_BYTES>
>       <GENSTAMP>4495417845</GENSTAMP>
>     </BLOCK>
>   </DATA>
> </RECORD>
> <RECORD>
>   <OPCODE>OP_CLOSE</OPCODE>
>   <DATA>
>     <TXID>32625024993</TXID>
>     <LENGTH>0</LENGTH>
>     <INODEID>0</INODEID>
>     <PATH>..</PATH>
>     <REPLICATION>3</REPLICATION>
>     <MTIME>1581816138774</MTIME>
>     <ATIME>1581814760398</ATIME>
>     <BLOCKSIZE>536870912</BLOCKSIZE>
>     <CLIENT_NAME></CLIENT_NAME>
>     <CLIENT_MACHINE></CLIENT_MACHINE>
>     <OVERWRITE>false</OVERWRITE>
>     <BLOCK>
>       <BLOCK_ID>5568434562</BLOCK_ID>
>       <NUM_BYTES>185818644</NUM_BYTES>
>       <GENSTAMP>4495417845</GENSTAMP>
>     </BLOCK>
>     <PERMISSION_STATUS>
>       <USERNAME>da_music</USERNAME>
>       <GROUPNAME>hdfs</GROUPNAME>
>       <MODE>416</MODE>
>     </PERMISSION_STATUS>
>   </DATA>
> </RECORD>
> {panel}
>
> The block size should be 185818648 in the first CloseOp. When truncate is
> used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp are
> synchronized to the JournalNode in the same batch. The block used by the two
> CloseOps is the same instance, which causes the first CloseOp to have the wrong
> block size. When the SNN rolls the editlog, TruncateOp does not make the file to
[jira] [Work logged] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?focusedWorklogId=628673=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628673 ] ASF GitHub Bot logged work on HDFS-16145: - Author: ASF GitHub Bot Created on: 27/Jul/21 18:49 Start Date: 27/Jul/21 18:49 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3234: URL: https://github.com/apache/hadoop/pull/3234#issuecomment-887750622 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 39s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 31m 21s | | trunk passed | | +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 0m 31s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 27s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 36s | | trunk passed | | +1 :green_heart: | javadoc | 0m 30s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 0m 49s | | trunk passed | | +1 :green_heart: | shadedclient | 14m 4s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | | the patch passed | | +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 0m 25s | | the patch passed | | +1 :green_heart: | compile | 0m 23s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 0m 23s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 15s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 25s | | the patch passed | | +1 :green_heart: | javadoc | 0m 20s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 18s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 0m 49s | | the patch passed | | +1 :green_heart: | shadedclient | 13m 53s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 21m 49s | | hadoop-distcp in the patch passed. | | +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. 
| | | | 91m 33s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3234 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 7f8a044133d6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / f86f59cd02898af1bd3f78004a73440229de67ae | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/7/testReport/ | | Max. process+thread count | 545 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/7/console | | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 | | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org | This message was
[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17388217#comment-17388217 ] Renukaprasad C commented on HDFS-14703: --- Thanks [Daryn Sharp|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=daryn] for the review & comments. Thanks [Xing Lin|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=xinglin] for the quick update. # Was the entry point for the calls via the rpc server, fsn, fsdir, etc? Relevant since end-to-end benchmarking rarely matches microbenchmarks. We have run the benchmarking tool in standalone mode with the file:// schema. With this we were able to achieve 50k-60k throughput. # What is “30-40%” improvement? How many ops/sec before and after? When we tested in standalone mode, we found an average of 30% improvement with the mkdir op. https://issues.apache.org/jira/browse/HDFS-14703?focusedCommentId=17346002=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17346002 # What impact did it have on gc/min and gc time? These are often hidden killers of performance when not taken into consideration. We have noticed that there is no CPU bottleneck with the patch. We still need to capture these metrics; we shall check further and publish any impact of the patch on GC. We would like [~shv] to clarify further. > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. 
Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16137) Improve the comments related to FairCallQueue#queues
[ https://issues.apache.org/jira/browse/HDFS-16137?focusedWorklogId=628629=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628629 ] ASF GitHub Bot logged work on HDFS-16137: - Author: ASF GitHub Bot Created on: 27/Jul/21 17:27 Start Date: 27/Jul/21 17:27 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3226: URL: https://github.com/apache/hadoop/pull/3226#issuecomment-887695402 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 50s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | -1 :x: | mvninstall | 31m 14s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3226/3/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. | | -1 :x: | compile | 0m 30s | [/branch-compile-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3226/3/artifact/out/branch-compile-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) | root in trunk failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. 
| | +1 :green_heart: | compile | 24m 25s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 9s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 49s | | trunk passed | | +1 :green_heart: | javadoc | 1m 18s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 40s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 2m 53s | | trunk passed | | +1 :green_heart: | shadedclient | 18m 3s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 55s | | the patch passed | | +1 :green_heart: | compile | 21m 13s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | -1 :x: | javac | 21m 13s | [/results-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3226/3/artifact/out/results-compile-javac-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) | root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1919 new + 0 unchanged - 0 fixed = 1919 total (was 0) | | +1 :green_heart: | compile | 18m 54s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 18m 54s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. 
| | +1 :green_heart: | checkstyle | 1m 4s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 38s | | the patch passed | | +1 :green_heart: | javadoc | 1m 6s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 45s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 2m 49s | | the patch passed | | +1 :green_heart: | shadedclient | 17m 21s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 17m 52s | | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 1m 0s | | The patch does not generate ASF License warnings. | | | | 169m 44s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3226/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3226 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 22d87aba7968 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh
[jira] [Resolved] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid
[ https://issues.apache.org/jira/browse/HDFS-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina resolved HDFS-16119. --- Fix Version/s: 3.4.0 Resolution: Fixed > start balancer with parameters -hotBlockTimeInterval xxx is invalid > --- > > Key: HDFS-16119 > URL: https://issues.apache.org/jira/browse/HDFS-16119 > Project: Hadoop HDFS > Issue Type: Improvement > Reporter: jiaguodong > Assignee: jiaguodong > Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > > When the balancer is started with the parameter -hotBlockTimeInterval xxx, the value has no effect, but setting it in hdfs-site.xml works: > > <property> > <name>dfs.balancer.getBlocks.hot-time-interval</name> > <value>3600</value> > </property> > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
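The root cause the issue describes, a CLI flag whose value never reaches the configuration, can be sketched as follows. This is a hypothetical, simplified illustration: BalancerCli, parseHotBlockTimeInterval, and the Map-based conf are stand-ins, not the actual Hadoop Balancer code; only the property name dfs.balancer.getBlocks.hot-time-interval comes from the issue above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: for "-hotBlockTimeInterval <value>" to win over the
// hdfs-site.xml default, the parsed CLI value must be written back into the
// configuration before the balancer reads the property.
class BalancerCli {
    static final String HOT_TIME_INTERVAL_KEY =
        "dfs.balancer.getBlocks.hot-time-interval";

    static long parseHotBlockTimeInterval(String[] args, Map<String, String> conf) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-hotBlockTimeInterval".equalsIgnoreCase(args[i])) {
                conf.put(HOT_TIME_INTERVAL_KEY, args[i + 1]); // CLI overrides xml
            }
        }
        return Long.parseLong(conf.getOrDefault(HOT_TIME_INTERVAL_KEY, "0"));
    }
}
```

Without the conf.put step, the method would fall through to the xml value and the flag would appear "invalid", which matches the reported behaviour.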
[jira] [Assigned] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid
[ https://issues.apache.org/jira/browse/HDFS-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina reassigned HDFS-16119: - Assignee: jiaguodong
[jira] [Commented] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid
[ https://issues.apache.org/jira/browse/HDFS-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388184#comment-17388184 ] Hemanth Boyina commented on HDFS-16119: --- committed to trunk, thanks for the contribution [~jiaguodong]
[jira] [Work logged] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid
[ https://issues.apache.org/jira/browse/HDFS-16119?focusedWorklogId=628611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628611 ] ASF GitHub Bot logged work on HDFS-16119: - Author: ASF GitHub Bot Created on: 27/Jul/21 16:55 Start Date: 27/Jul/21 16:55 Worklog Time Spent: 10m Work Description: hemanthboyina commented on pull request #3185: URL: https://github.com/apache/hadoop/pull/3185#issuecomment-887674189 LGTM Issue Time Tracking --- Worklog Id: (was: 628611) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid
[ https://issues.apache.org/jira/browse/HDFS-16119?focusedWorklogId=628612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628612 ] ASF GitHub Bot logged work on HDFS-16119: - Author: ASF GitHub Bot Created on: 27/Jul/21 16:55 Start Date: 27/Jul/21 16:55 Worklog Time Spent: 10m Work Description: hemanthboyina merged pull request #3185: URL: https://github.com/apache/hadoop/pull/3185 Issue Time Tracking --- Worklog Id: (was: 628612) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628601 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 27/Jul/21 16:45 Start Date: 27/Jul/21 16:45 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887667760 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 8s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 30m 38s | | trunk passed | | +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 18s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 2s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 27s | | trunk passed | | +1 :green_heart: | javadoc | 0m 57s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 27s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 5s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 16s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 12s | | the patch passed | | +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 15s | | the patch passed | | +1 :green_heart: | compile | 1m 6s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 6s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 51s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 11s | | the patch passed | | +1 :green_heart: | javadoc | 0m 47s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 7s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 14s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 457m 11s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. 
| | | | 541m 54s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestViewDistributedFileSystemContract | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux c44ea80f4045 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 3a731fa518ac7a3b242e9099c0945ddf76a1eb47 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | |
[jira] [Commented] (HDFS-16144) Revert HDFS-15372 (Files in snapshots no longer see attribute provider permissions)
[ https://issues.apache.org/jira/browse/HDFS-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388168#comment-17388168 ] Shashikant Banerjee commented on HDFS-16144: --- The patch v3 looks good. +1 > Revert HDFS-15372 (Files in snapshots no longer see attribute provider > permissions) > --- > > Key: HDFS-16144 > URL: https://issues.apache.org/jira/browse/HDFS-16144 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > Attachments: HDFS-16144.001.patch, HDFS-16144.002.patch, > HDFS-16144.003.patch > > > In HDFS-15372, I noted a change in behaviour between Hadoop 2 and Hadoop 3. > When a user accesses a file in a snapshot and an attribute provider is > configured, the provider would see the original file path (i.e. no .snapshot > folder) in Hadoop 2, but the snapshot path in Hadoop 3. > HDFS-15372 changed this back, but I noted at the time it may make sense for > the provider to see the actual snapshot path instead. > Recently we discovered HDFS-16132, where the HDFS-15372 change does not work > correctly. At this stage I believe it is better to revert HDFS-15372, as the > fix to this issue is probably not trivial, and to allow providers to see the > actual path the user accessed.
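The behaviour difference described above can be illustrated with a small hypothetical helper (SnapshotPaths is not a Hadoop class; it only shows the path mapping): in Hadoop 2 the attribute provider saw the original path with the .snapshot components stripped, while in Hadoop 3, and again after this revert, it sees the actual snapshot path the user accessed.

```java
// Hypothetical illustration of the two views an attribute provider gets for
// a file read through a snapshot. Not Hadoop code.
class SnapshotPaths {
    /** Hadoop 2 style view: strip the "/.snapshot/<snapshotName>" components. */
    static String providerViewHadoop2(String path) {
        return path.replaceFirst("/\\.snapshot/[^/]+", "");
    }

    /** Hadoop 3 style view (restored by this revert): the path as accessed. */
    static String providerViewHadoop3(String path) {
        return path;
    }
}
```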
[jira] [Updated] (HDFS-16144) Revert HDFS-15372 (Files in snapshots no longer see attribute provider permissions)
[ https://issues.apache.org/jira/browse/HDFS-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDFS-16144: - Attachment: HDFS-16144.003.patch
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628493=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628493 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 27/Jul/21 14:23 Start Date: 27/Jul/21 14:23 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887557032 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 21m 54s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 36m 13s | | trunk passed | | +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 22s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 56s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 27s | | trunk passed | | +1 :green_heart: | javadoc | 0m 58s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 30s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 44s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 50s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 26s | | the patch passed | | +1 :green_heart: | compile | 1m 31s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 31s | | the patch passed | | +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 20s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 55s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 25s | | the patch passed | | +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 48s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 12s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 271m 59s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 15s | | The patch does not generate ASF License warnings. 
| | | | 393m 34s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand | | | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.server.blockmanagement.TestBlockInfoStriped | | | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/8/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 18685fa855ee 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 3a731fa518ac7a3b242e9099c0945ddf76a1eb47 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions |
[jira] [Work logged] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?focusedWorklogId=628374=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628374 ] ASF GitHub Bot logged work on HDFS-16111: - Author: ASF GitHub Bot Created on: 27/Jul/21 11:26 Start Date: 27/Jul/21 11:26 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3175: URL: https://github.com/apache/hadoop/pull/3175#issuecomment-887432654 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 47s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 30m 40s | | trunk passed | | +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 18s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 4s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 25s | | trunk passed | | +1 :green_heart: | javadoc | 0m 57s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 32s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 12s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 21s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 11s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | -1 :x: | javac | 1m 14s | [/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3175/3/artifact/out/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 467 unchanged - 1 fixed = 468 total (was 468) | | +1 :green_heart: | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | -1 :x: | javac | 1m 9s | [/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3175/3/artifact/out/results-compile-javac-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 451 unchanged - 1 fixed = 452 total (was 452) | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 54s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3175/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 338 unchanged - 0 fixed = 341 total (was 338) | | +1 :green_heart: | mvnsite | 1m 14s | | the patch passed | | +1 :green_heart: | xml | 0m 1s | | The patch has no ill-formed XML file. 
| | +1 :green_heart: | javadoc | 0m 49s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 11s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 13s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 237m 2s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. | | | | 321m 31s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base:
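The HDFS-16111 title above describes letting RoundRobinVolumeChoosingPolicy avoid failed volumes. A minimal sketch of that idea, round-robin selection that skips unhealthy volumes, might look like the following; RoundRobinChooser and its boolean health flags are illustrative stand-ins, not the Hadoop implementation:

```java
// Hypothetical sketch: round-robin over volumes, but skip any volume
// currently marked unhealthy instead of handing writes to it.
class RoundRobinChooser {
    private int next = 0; // cursor for the round-robin rotation

    /** Returns the index of the next healthy volume, or -1 if all have failed. */
    int choose(boolean[] healthy) {
        for (int tried = 0; tried < healthy.length; tried++) {
            int i = next;
            next = (next + 1) % healthy.length; // advance cursor regardless
            if (healthy[i]) {
                return i;
            }
        }
        return -1; // every volume failed
    }
}
```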
[jira] [Work logged] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?focusedWorklogId=628266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628266 ] ASF GitHub Bot logged work on HDFS-16145: - Author: ASF GitHub Bot Created on: 27/Jul/21 08:46 Start Date: 27/Jul/21 08:46 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3234: URL: https://github.com/apache/hadoop/pull/3234#issuecomment-887330242 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 16m 24s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 30m 47s | | trunk passed | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 0m 31s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 27s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 37s | | trunk passed | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 0m 50s | | trunk passed | | +1 :green_heart: | shadedclient | 14m 4s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | | the patch passed | | +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 0m 25s | | the patch passed | | +1 :green_heart: | compile | 0m 24s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 0m 24s | | the patch passed | | +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 17s | [/results-checkstyle-hadoop-tools_hadoop-distcp.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/6/artifact/out/results-checkstyle-hadoop-tools_hadoop-distcp.txt) | hadoop-tools/hadoop-distcp: The patch generated 50 new + 26 unchanged - 0 fixed = 76 total (was 26) | | +1 :green_heart: | mvnsite | 0m 26s | | the patch passed | | +1 :green_heart: | javadoc | 0m 21s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 18s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 0m 48s | | the patch passed | | +1 :green_heart: | shadedclient | 13m 43s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 41m 0s | | hadoop-distcp in the patch passed. | | +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. 
| | | | 125m 22s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3234 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 14f680336897 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 5ad002a4377fc69f0a4b31c42cd2e9aceab2e942 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3234/6/testReport/ | | Max. process+thread count | 689 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-distcp U:
[jira] [Commented] (HDFS-16144) Revert HDFS-15372 (Files in snapshots no longer see attribute provider permissions)
[ https://issues.apache.org/jira/browse/HDFS-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387869#comment-17387869 ] Shashikant Banerjee commented on HDFS-16144: --- +1
[jira] [Updated] (HDFS-16139) Update BPServiceActor Scheduler's nextBlockReportTime atomically
[ https://issues.apache.org/jira/browse/HDFS-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-16139: --- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~vjasani] for your report and contribution! > Update BPServiceActor Scheduler's nextBlockReportTime atomically > > > Key: HDFS-16139 > URL: https://issues.apache.org/jira/browse/HDFS-16139 > Project: Hadoop HDFS > Issue Type: Task > Reporter: Viraj Jasani > Assignee: Viraj Jasani > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > BPServiceActor#Scheduler's nextBlockReportTime is declared volatile because it > can be assigned/read by testing threads (through BPServiceActor#triggerXXX) > as well as by actor threads, but it is still assigned non-atomically, e.g. > {code:java} > if (factor != 0) { > nextBlockReportTime += factor * blockReportIntervalMs; > } else { > // If the difference between the present time and the scheduled > // time is very small, the factor can be 0, so in that case we can > // ignore that negligible time spent sending the BRs and > // schedule the next BR after the blockReportInterval. > nextBlockReportTime += blockReportIntervalMs; > } > {code} > We should convert it to an AtomicLong so that concurrent assignments are > applied atomically.
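The conversion the issue proposes can be sketched as follows. Scheduler here is a stripped-down stand-in for BPServiceActor.Scheduler (the constructor and method shape are assumptions, not the real API), showing how AtomicLong#addAndGet turns the non-atomic volatile += into a single atomic read-modify-write:

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for BPServiceActor.Scheduler: the volatile long becomes
// an AtomicLong so "nextBlockReportTime += delta" is a single atomic step.
class Scheduler {
    private final AtomicLong nextBlockReportTime;

    Scheduler(long initialMs) {
        nextBlockReportTime = new AtomicLong(initialMs);
    }

    long scheduleNextBlockReport(long factor, long blockReportIntervalMs) {
        // factor == 0 means the drift was negligible: advance one full interval.
        long delta = (factor != 0)
            ? factor * blockReportIntervalMs
            : blockReportIntervalMs;
        return nextBlockReportTime.addAndGet(delta); // atomic replacement for +=
    }
}
```

With the volatile long, two threads could both read the old value and one increment would be lost; addAndGet makes the read, add, and write indivisible.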
[jira] [Work logged] (HDFS-16139) Update BPServiceActor Scheduler's nextBlockReportTime atomically
[ https://issues.apache.org/jira/browse/HDFS-16139?focusedWorklogId=628194=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628194 ] ASF GitHub Bot logged work on HDFS-16139: - Author: ASF GitHub Bot Created on: 27/Jul/21 06:57 Start Date: 27/Jul/21 06:57 Worklog Time Spent: 10m Work Description: Hexiaoqiao merged pull request #3228: URL: https://github.com/apache/hadoop/pull/3228 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628194) Time Spent: 0.5h (was: 20m) > Update BPServiceActor Scheduler's nextBlockReportTime atomically > > > Key: HDFS-16139 > URL: https://issues.apache.org/jira/browse/HDFS-16139 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > BPServiceActor#Scheduler's nextBlockReportTime is declared volatile and it > can be assigned/read by testing threads (through BPServiceActor#triggerXXX) > as well as by actor threads. Hence it is declared volatile but it is still > assigned non-atomically > e.g > {code:java} > if (factor != 0) { > nextBlockReportTime += factor * blockReportIntervalMs; > } else { > // If the difference between the present time and the scheduled > // time is very less, the factor can be 0, so in that case, we can > // ignore that negligible time, spent while sending the BRss and > // schedule the next BR after the blockReportInterval. > nextBlockReportTime += blockReportIntervalMs; > } > {code} > We should convert it to AtomicLong to take care of concurrent assignments > while making sure that it is assigned atomically. 
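The fix described above, converting the volatile long to an AtomicLong so that the compound `+=` update becomes a single atomic read-modify-write, could be sketched as follows. The class and method names here are illustrative, not the actual BPServiceActor.Scheduler code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: illustrative names, not the actual BPServiceActor.Scheduler.
public class BlockReportScheduler {
  private final AtomicLong nextBlockReportTime;
  private final long blockReportIntervalMs;

  public BlockReportScheduler(long initialTimeMs, long intervalMs) {
    this.nextBlockReportTime = new AtomicLong(initialTimeMs);
    this.blockReportIntervalMs = intervalMs;
  }

  /**
   * Advance the schedule by {@code factor} intervals, or by one interval
   * when the factor is 0 (the time spent sending the BRs was negligible).
   * getAndAdd performs the read-modify-write as one atomic step, unlike
   * "nextBlockReportTime += delta" on a volatile long, which is a
   * separate read and write and can lose updates under concurrency.
   */
  public long scheduleNextBlockReport(long factor) {
    long delta = (factor != 0 ? factor : 1) * blockReportIntervalMs;
    return nextBlockReportTime.getAndAdd(delta) + delta;
  }

  public long getNextBlockReportTime() {
    return nextBlockReportTime.get();
  }
}
```

With this shape, testing threads and actor threads can both call `scheduleNextBlockReport` without an external lock, which is the point of the change.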
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628181 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 27/Jul/21 06:27 Start Date: 27/Jul/21 06:27 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887246943 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 9s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 31m 52s | | trunk passed | | +1 :green_heart: | compile | 1m 32s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 32s | | trunk passed | | +1 :green_heart: | javadoc | 0m 59s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 31s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 19s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 32s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 18s | | the patch passed | | +1 :green_heart: | compile | 1m 14s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 14s | | the patch passed | | +1 :green_heart: | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 9s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 48s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 17s | | the patch passed | | +1 :green_heart: | javadoc | 0m 47s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 15s | | the patch passed | | +1 :green_heart: | shadedclient | 15m 52s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 508m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. 
| | | | 595m 14s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestViewDistributedFileSystemContract | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeWithHdfsScheme | | | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand | | | hadoop.hdfs.TestHDFSFileSystemContract | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | |
[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
[ https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=628180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628180 ] ASF GitHub Bot logged work on HDFS-16143: - Author: ASF GitHub Bot Created on: 27/Jul/21 06:23 Start Date: 27/Jul/21 06:23 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3235: URL: https://github.com/apache/hadoop/pull/3235#issuecomment-887245255 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 28s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 31m 52s | | trunk passed | | +1 :green_heart: | compile | 1m 33s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 31s | | trunk passed | | +1 :green_heart: | javadoc | 1m 1s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 29s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 21s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 12s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 19s | | the patch passed | | +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 20s | | the patch passed | | +1 :green_heart: | compile | 1m 9s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 9s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 49s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 17s | | the patch passed | | +1 :green_heart: | javadoc | 0m 50s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 14s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 7s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 505m 25s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. 
| | | | 592m 10s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestViewDistributedFileSystemContract | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.TestLeaseRecovery | | | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeWithHdfsScheme | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand | | | hadoop.hdfs.TestHDFSFileSystemContract | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3235/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3235 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux f75047459ee6 4.15.0-65-generic
[jira] [Work logged] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?focusedWorklogId=628177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628177 ] ASF GitHub Bot logged work on HDFS-16145: - Author: ASF GitHub Bot Created on: 27/Jul/21 06:18 Start Date: 27/Jul/21 06:18 Worklog Time Spent: 10m Work Description: szetszwo commented on a change in pull request #3234: URL: https://github.com/apache/hadoop/pull/3234#discussion_r677149758 ## File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java ## @@ -59,6 +60,9 @@ // private EnumMap> diffMap; private DiffInfo[] renameDiffs; + // entries which are marked deleted because of rename to a excluded target + // path + private DiffInfo[] deletedByExclusionDiffs; Review comment: Let's use List. Then we don't have to convert to an array in the code below. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628177) Time Spent: 2h 50m (was: 2h 40m) > CopyListing fails with FNF exception with snapshot diff > --- > > Key: HDFS-16145 > URL: https://issues.apache.org/jira/browse/HDFS-16145 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > Distcp with snapshotdiff and with filters marks a Rename as a delete > operation on the target if the rename target is a directory which is > excluded by the filter. But in cases where files/subdirs were created or modified > after the old snapshot but prior to the Rename, they will still be present as > modified/created entries in the final copy list. 
Since the parent directory > is marked for deletion, these subsequent create/modify entries should be > ignored while building the final copy list. > In such cases, when the final copy list is built, distcp tries to do a > lookup for each created/modified file in the newer snapshot, which will fail, > as the parent dir has already moved to a new location in the later snapshot. > > {code:java} > sudo -u kms hadoop key create testkey > hadoop fs -mkdir -p /data/gcgdlknnasg/ > hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/ > hadoop fs -mkdir -p /dest/gcgdlknnasg > hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg > hdfs dfs -mkdir /data/gcgdlknnasg/dir1 > hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ > hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/ > drwxrwxrwt - hdfs supergroup 0 2021-07-16 14:05 > /data/gcgdlknnasg/.Trash > drwxr-xr-x - hdfs supergroup 0 2021-07-16 13:07 > /data/gcgdlknnasg/dir1 > [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# > hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/ > hdfs dfs -rm -r /data/gcgdlknnasg/dir1/ > hdfs dfs -mkdir /data/gcgdlknnasg/dir1/ > ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the > replication schedule. You get into below error and failure of the BDR job. > 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - > java.io.FileNotFoundException: File does not exist: > /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487) > …….. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
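The pruning the issue describes, dropping create/modify entries whose parent directory is already marked for deletion because its rename target was excluded, could be sketched as below. The method name and the plain-string paths are simplifications for illustration, not the actual DistCpSync/DiffInfo types:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: filters out diff entries that fall under a parent path
// already marked for deletion, so they never reach the final copy list.
public class DiffFilterSketch {
  static List<String> pruneUnderDeleted(List<String> createdOrModified,
                                        List<String> deletedParents) {
    List<String> result = new ArrayList<>();
    for (String path : createdOrModified) {
      boolean underDeleted = false;
      for (String parent : deletedParents) {
        // The parent is gone on the target, so entries under it would
        // trigger a FileNotFoundException at copy-list build time.
        if (path.equals(parent) || path.startsWith(parent + "/")) {
          underDeleted = true;
          break;
        }
      }
      if (!underDeleted) {
        result.add(path);
      }
    }
    return result;
  }
}
```

In the reproduction above, an entry like `/data/gcgdlknnasg/dir1/hosts` would be pruned once `/data/gcgdlknnasg/dir1` is marked deleted, avoiding the failed snapshot lookup.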
[jira] [Comment Edited] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387797#comment-17387797 ] Xing Lin edited comment on HDFS-14703 at 7/27/21, 6:07 AM: --- [~daryn] Thanks for your comments. I will address your last question and leave other questions to [~shv]. :) Regarding the results, we used the standard NNThroughputBenchmark, with commands like the following. {code:java} ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs file:/// -op mkdirs -threads 200 -dirs 1000 -dirsPerDir 512{code} Here are a result from [~prasad-acit], since his QPS numbers are higher than what I got. {code:java} BASE: common/hadoop-hdfs-32021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: — mkdirs inputs — 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: — mkdirs stats — 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Elapsed Time: 17718 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Ops per sec: 56439.77875606727 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Average Time: 3 2021-05-17 11:17:36,973 INFO namenode.FSEditLog: Ending log segment 1, 1031254 PATCH: 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: — mkdirs inputs — 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: — mkdirs stats — 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:11:09,321 
INFO namenode.NNThroughputBenchmark: Elapsed Time: 15010 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Ops per sec: 66622.25183211193 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Average Time: 2 2021-05-17 11:11:09,331 INFO namenode.FSEditLog: Ending log segment 1, 1031254 {code} was (Author: xinglin): [~daryn] Thanks for your comments. I will address your last question and leave other questions to [~shv]. :) Regarding the results, we used the standard NNThroughputBenchmark, with commands like the following. ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark *-fs* [*file:///*|file:///*] -op mkdirs -threads 200 -dirs 1000 -dirsPerDir 512 Here are a result from [~prasad-acit], since his QPS numbers are higher than what I got. BASE: common/hadoop-hdfs-32021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs --- 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: --- mkdirs stats --- 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Elapsed Time: 17718 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Ops per sec: 56439.77875606727 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Average Time: 3 2021-05-17 11:17:36,973 INFO namenode.FSEditLog: Ending log segment 1, 1031254 PATCH: 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs --- 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: 
--- mkdirs stats --- 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Elapsed Time: 15010 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Ops per sec: 66622.25183211193 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Average Time: 2 2021-05-17 11:11:09,331 INFO namenode.FSEditLog: Ending log segment 1, 1031254 > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >
[jira] [Work logged] (HDFS-16111) Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes at datanodes.
[ https://issues.apache.org/jira/browse/HDFS-16111?focusedWorklogId=628175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-628175 ] ASF GitHub Bot logged work on HDFS-16111: - Author: ASF GitHub Bot Created on: 27/Jul/21 06:07 Start Date: 27/Jul/21 06:07 Worklog Time Spent: 10m Work Description: zhihaixu2012 commented on pull request #3175: URL: https://github.com/apache/hadoop/pull/3175#issuecomment-887237138 Updated the patch to fix the above test failures. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 628175) Time Spent: 2h 20m (was: 2h 10m) > Add a configuration to RoundRobinVolumeChoosingPolicy to avoid failed volumes > at datanodes. > --- > > Key: HDFS-16111 > URL: https://issues.apache.org/jira/browse/HDFS-16111 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Zhihai Xu >Assignee: Zhihai Xu >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > When we upgraded our hadoop cluster from hadoop 2.6.0 to hadoop 3.2.2, we got > failed volumes on a lot of datanodes, which caused some missing blocks at that > time. Although later on we recovered all the missing blocks by symlinking the > path (dfs/dn/current) on the failed volume to a new directory and copying all > the data to the new directory, we missed our SLA and it delayed our upgrading > process on our production cluster for several hours. 
> When this issue happened, we saw a lot of this exceptions happened before the > volumed failed on the datanode: > [DataXceiver for client at /[XX.XX.XX.XX:XXX|http://10.104.103.159:33986/] > [Receiving block BP-XX-XX.XX.XX.XX-XX:blk_X_XXX]] > datanode.DataNode (BlockReceiver.java:(289)) - IOException in > BlockReceiver constructor :Possible disk error: Failed to create > /XXX/dfs/dn/current/BP-XX-XX.XX.XX.XX-X/tmp/blk_XX. Cause > is > java.io.IOException: No space left on device > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.FileIoProvider.createFile(FileIoProvider.java:302) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createFileWithExistsCheck(DatanodeUtil.java:69) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createTmpFile(BlockPoolSlice.java:292) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTmpFile(FsVolumeImpl.java:532) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createTemporary(FsVolumeImpl.java:1254) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1598) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:212) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.getBlockReceiver(DataXceiver.java:1314) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:768) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:291) > at java.lang.Thread.run(Thread.java:748) > > We found this issue happened due to the following two reasons: > First the upgrade process added some extra disk storage on the each disk > volume of 
the data node: > BlockPoolSliceStorage.doUpgrade > (https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java#L445) > is the main upgrade function in the datanode, it will add some extra > storage. The extra storage added is all new directories created in > /current//current, although all block data file and block meta data > file are hard-linked with /current//previous after upgrade. Since there > will be a lot of new directories created, this will use some disk space on > each disk volume. > > Second there is a potential bug when picking a disk volume to write a new > block file(replica). By default, Hadoop uses
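A round-robin volume choice that skips volumes lacking enough free space, in the spirit of the configuration this issue proposes, could be sketched as below. The class name, the reserve parameter, and the free-space array are illustrative assumptions, not the actual RoundRobinVolumeChoosingPolicy API:

```java
// Sketch only: a round-robin pick that skips volumes whose free space
// cannot hold the replica plus a configurable reserve, instead of
// writing until "No space left on device" fails the volume.
public class RoundRobinWithReserveSketch {
  private int curVolume = 0;        // round-robin cursor
  private final long reservedBytes; // hypothetical reserve configuration

  public RoundRobinWithReserveSketch(long reservedBytes) {
    this.reservedBytes = reservedBytes;
  }

  /**
   * volumesFree[i] is the free space of volume i.
   * Returns the chosen volume index, or -1 if no volume qualifies.
   */
  public int chooseVolume(long[] volumesFree, long blockSize) {
    int start = curVolume;
    for (int i = 0; i < volumesFree.length; i++) {
      int idx = (start + i) % volumesFree.length;
      if (volumesFree[idx] >= blockSize + reservedBytes) {
        curVolume = (idx + 1) % volumesFree.length; // advance the cursor
        return idx;
      }
    }
    return -1; // caller surfaces a DiskOutOfSpaceException-style error
  }
}
```

The design point is that a nearly full volume is skipped early rather than failed later, at the cost of slightly uneven block placement.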
[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387797#comment-17387797 ] Xing Lin commented on HDFS-14703: - [~daryn] Thanks for your comments. I will address your last question and leave other questions to [~shv]. :) Regarding the results, we used the standard NNThroughputBenchmark, with commands like the following. ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark *-fs* [*file:///*|file:///*] -op mkdirs -threads 200 -dirs 1000 -dirsPerDir 512 Here are a result from [~prasad-acit], since his QPS numbers are higher than what I got. BASE: common/hadoop-hdfs-32021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs --- 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: --- mkdirs stats --- 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Elapsed Time: 17718 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Ops per sec: 56439.77875606727 2021-05-17 11:17:36,973 INFO namenode.NNThroughputBenchmark: Average Time: 3 2021-05-17 11:17:36,973 INFO namenode.FSEditLog: Ending log segment 1, 1031254 PATCH: 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs --- 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirs = 100 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrThreads = 200 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 32 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: --- mkdirs stats --- 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: # operations: 100 2021-05-17 11:11:09,321 INFO 
namenode.NNThroughputBenchmark: Elapsed Time: 15010 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Ops per sec: 66622.25183211193 2021-05-17 11:11:09,321 INFO namenode.NNThroughputBenchmark: Average Time: 2 2021-05-17 11:11:09,331 INFO namenode.FSEditLog: Ending log segment 1, 1031254 > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org