[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802619#comment-17802619 ] Shilun Fan commented on HDFS-16000:
---

Bulk update: moved all 3.4.0 non-blocker issues; please move back if this is a blocker. Retargeted to 3.5.0.

> HDFS : Rename performance optimization
> --
>
>                 Key: HDFS-16000
>                 URL: https://issues.apache.org/jira/browse/HDFS-16000
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>    Affects Versions: 3.1.4, 3.3.1
>            Reporter: Xiangyi Zhu
>            Assignee: Xiangyi Zhu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 20210428-143238.svg, 20210428-171635-lambda.svg, HDFS-16000.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Renaming a large directory takes a long time: moving a directory with 10,000,000 (1000W) entries takes about 40 seconds. When a large amount of data is deleted to the trash, such a large-directory move occurs when the trash emptier makes its checkpoint; a user can also trigger a large-directory move directly. Either way, the NameNode holds its lock for so long that it can be killed by ZKFC. A flame graph shows that most of the time is spent creating EnumCounters objects.
>
> h3. Rename logic optimization:
> * Today, whether or not quotas are involved, a rename computes the quota counts three times: first to check whether the moved directory would exceed the target directory's quota, second to compute the moved directory's usage so it can be subtracted from the source directory's quota, and third to compute the same usage again so it can be added to the target directory's quota.
> * Some of these computations are unnecessary. If no parent directory of either the source or the target has a quota configured, there is no need to compute the quota counts at all. Even when both the source and the target are under quotas, the counts need not be computed three times: the first and third computations are identical, so one computation suffices.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
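The core of the proposed optimization — skip the quota-count computation entirely when no ancestor of either path has a quota configured, and compute it once (not three times) otherwise — can be sketched with a toy directory model. The `Dir` class, `anyAncestorHasQuota`, and `computeUsage` below are hypothetical stand-ins, not the real HDFS INode/QuotaCounts API.

```java
// Toy model of the rename quota optimization; all names are hypothetical
// stand-ins, not the real HDFS INode/QuotaCounts classes.
import java.util.ArrayList;
import java.util.List;

public class RenameQuotaSketch {
    static class Dir {
        final Dir parent;
        boolean quotaSet;                      // explicit quota configured here
        final List<Dir> children = new ArrayList<>();
        Dir(Dir parent) {
            this.parent = parent;
            if (parent != null) parent.children.add(this);
        }
    }

    // Walk from d up to the root, looking for any configured quota.
    static boolean anyAncestorHasQuota(Dir d) {
        for (Dir cur = d; cur != null; cur = cur.parent) {
            if (cur.quotaSet) return true;
        }
        return false;
    }

    // Expensive subtree walk, the analogue of computeQuotaUsage().
    static int computeUsage(Dir d) {
        int n = 1;
        for (Dir c : d.children) n += computeUsage(c);
        return n;
    }

    // Returns how many times usage was computed, to show the saving:
    // 0 when no quota is involved, otherwise 1 (shared by the quota check
    // and both quota updates) -- instead of 3 in the unoptimized path.
    static int rename(Dir src, Dir dstParent) {
        if (!anyAncestorHasQuota(src.parent) && !anyAncestorHasQuota(dstParent)) {
            return 0;                          // no quota anywhere: skip
        }
        computeUsage(src);                     // one computation, reused
        return 1;
    }

    public static void main(String[] args) {
        Dir root = new Dir(null);
        Dir a = new Dir(root);
        Dir b = new Dir(root);
        new Dir(a);                            // a child under a
        assert rename(a, b) == 0 : "no quotas set: computation skipped";
        b.quotaSet = true;
        assert rename(a, b) == 1 : "quota present: computed once, not thrice";
        System.out.println("ok");
    }
}
```

Since the expensive subtree walk dominates rename cost for large directories, skipping or sharing it is what shrinks the NameNode lock hold time described above.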
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800745#comment-17800745 ] ASF GitHub Bot commented on HDFS-16000:
---

lfxy commented on code in PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r1436951488

##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java:
##
@@ -470,17 +475,53 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
       }
     } finally {
       if (undoRemoveSrc) {
-        tx.restoreSource();
+        tx.restoreSource(srcStoragePolicyCounts);
       }
       if (undoRemoveDst) {
         // Rename failed - restore dst
-        tx.restoreDst(bsps);
+        tx.restoreDst(bsps, dstStoragePolicyCounts);
       }
     }
     NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
         + "failed to rename " + src + " to " + dst);
     throw new IOException("rename from " + src + " to " + dst + " failed.");
   }

+  /*
+   * Calculate QuotaCounts based on the parent directories and storage policy:
+   * 1. If the storage policies of src and dst differ,
+   *    calculate the QuotaCounts of src and dst separately.
+   * 2. If no parent node of src or dst has a Quota set,
+   *    there is no need to calculate the QuotaCounts.
+   * 3. If parent nodes of src and dst have a Quota configured,
+   *    the QuotaCounts are calculated once using the storage policy of src.
+   */
+  private static void computeQuotaCounts(
+      QuotaCounts srcStoragePolicyCounts,
+      QuotaCounts dstStoragePolicyCounts,
+      INodesInPath srcIIP,
+      INodesInPath dstIIP,
+      BlockStoragePolicySuite bsps,
+      RenameOperation tx) {
+    INode dstParent = dstIIP.getINode(-2);
+    INode srcParentNode = FSDirectory.getFirstSetQuotaParentNode(srcIIP);
+    INode srcInode = srcIIP.getLastINode();
+    INode dstParentNode = FSDirectory.getFirstSetQuotaParentNode(dstIIP);
+    byte srcStoragePolicyID = FSDirectory.getStoragePolicyId(srcInode);
+    byte dstStoragePolicyID = FSDirectory.getStoragePolicyId(dstParent);
+    if (srcStoragePolicyID != dstStoragePolicyID) {
+      srcStoragePolicyCounts.add(srcIIP.getLastINode().computeQuotaUsage(bsps));
+      dstStoragePolicyCounts.add(srcIIP.getLastINode()
+          .computeQuotaUsage(bsps, dstParent.getStoragePolicyID(), false,
+              Snapshot.CURRENT_STATE_ID));
+    } else if (srcParentNode != dstParentNode || tx.withCount != null) {
+      srcStoragePolicyCounts.add(srcIIP.getLastINode().computeQuotaUsage(bsps));
+      dstStoragePolicyCounts.add(srcStoragePolicyCounts);
+    }

Review Comment:
@Hexiaoqiao @zhuxiangyi Our production cluster ran into rename-related performance issues and needed this optimization. In the (srcStoragePolicyID == dstStoragePolicyID) branch, do we need to compute only when both srcParentNode != dstParentNode and tx.isSrcInSnapshot == false hold? If so, could we use the logic below?

    if (srcStoragePolicyID != dstStoragePolicyID) {
      // compute
    } else if (srcParentNode != dstParentNode && !tx.isSrcInSnapshot) {
      // compute
    }

Looking forward to your reply, thanks.
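The branching under review can be reduced to a small decision sketch. The names below are illustrative, not the real FSDirRenameOp code, and `tx.withCount != null` from the posted hunk is modeled as the boolean `srcInSnapshot`, following the discussion in this thread that it stands in for isSrcInSnapshot.

```java
// Decision sketch of the computeQuotaCounts() branch under review.
// Names are hypothetical; this is not the real FSDirRenameOp code.
public class QuotaBranchSketch {
    /**
     * Number of quota-usage computations performed:
     * 2 when src and dst storage policies differ (one per policy),
     * 1 when they match but the first quota-bearing ancestors differ
     *   or src is in a snapshot,
     * 0 otherwise (the computation can be skipped).
     */
    static int computations(byte srcPolicy, byte dstPolicy,
                            Object srcQuotaParent, Object dstQuotaParent,
                            boolean srcInSnapshot) {
        if (srcPolicy != dstPolicy) {
            return 2;
        } else if (srcQuotaParent != dstQuotaParent || srcInSnapshot) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        Object p = new Object(), q = new Object();
        assert computations((byte) 1, (byte) 2, p, p, false) == 2;
        assert computations((byte) 1, (byte) 1, p, q, false) == 1;
        assert computations((byte) 1, (byte) 1, p, p, true) == 1;
        assert computations((byte) 1, (byte) 1, p, p, false) == 0;
        System.out.println("ok");
    }
}
```

lfxy's question amounts to whether the middle branch should require both conditions (`srcQuotaParent != dstQuotaParent && !srcInSnapshot`), which would turn the snapshot cases above into skips.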
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773296#comment-17773296 ] ASF GitHub Bot commented on HDFS-16000:
---

Hexiaoqiao commented on PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1752982944

Great! will wait for this feature to be ready!
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773105#comment-17773105 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1752373446

> @zhuxiangyi Hi, do you still work on this? Thanks.

Hi, I took a long vacation and apologize for not replying in a timely manner. I am still working on this.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773091#comment-17773091 ] ASF GitHub Bot commented on HDFS-16000:
---

Hexiaoqiao commented on PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1752298163

@zhuxiangyi Hi, do you still work on this? Thanks.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766304#comment-17766304 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1723082950

> Thanks @zhuxiangyi for your works. It is great idea and improvement. Almost LGTM. Leave some comments inline. Will give my +1 once correct. Thanks.

@Hexiaoqiao Thank you very much for your review. I have fixed the problems and resubmitted the code.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766299#comment-17766299 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on code in PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328469427

##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java:
##
@@ -470,17 +475,53 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
(hunk identical to the one quoted earlier in this digest; the comment targets the line:)
+    } else if (srcParentNode != dstParentNode || tx.withCount != null) {

Review Comment:
If this is the case, it can be understood as: src and dst both have a quota configured, and src has the isSrcInSnapshot attribute.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766298#comment-17766298 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on code in PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328465474

##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:
##
@@ -1468,6 +1475,30 @@ static Collection normalizePaths(Collection paths,
     return normalized;
   }

+  /**
+   * Get the first Node that sets Quota.
+   */
+  static INode getFirstSetQuotaParentNode(INodesInPath iip) {
+    for (int i = iip.length() - 1; i > 0; i--) {
+      INode currNode = iip.getINode(i);
+      if (currNode == null) {

Review Comment:
No null node is expected here.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766297#comment-17766297 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on code in PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328464878

##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:
##
@@ -1468,6 +1475,30 @@ static Collection normalizePaths(Collection paths,
(hunk identical to the one quoted in the previous comment; the comment targets these lines:)
+    for (int i = iip.length() - 1; i > 0; i--) {
+      INode currNode = iip.getINode(i);
+      if (currNode == null) {

Review Comment:
Here we traverse from the last INode down to the second node; the root node (index 0) is excluded.
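The traversal just described can be sketched as follows. `PathNode` and `quotaSet` are hypothetical stand-ins for the real INodesInPath/INode API, with index 0 playing the role of the root that the loop excludes.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for getFirstSetQuotaParentNode(): scan the path
// components from the deepest inode upward, stopping before the root.
public class FirstQuotaParentSketch {
    static class PathNode {
        final boolean quotaSet;
        PathNode(boolean quotaSet) { this.quotaSet = quotaSet; }
    }

    // path.get(0) is the root; like the real loop (i > 0), it is skipped.
    static PathNode firstSetQuotaParent(List<PathNode> path) {
        for (int i = path.size() - 1; i > 0; i--) {
            PathNode cur = path.get(i);
            if (cur != null && cur.quotaSet) {
                return cur;
            }
        }
        return null;  // no non-root ancestor has a quota configured
    }

    public static void main(String[] args) {
        PathNode root = new PathNode(false);
        PathNode a = new PathNode(true);     // quota set on /a
        PathNode b = new PathNode(false);    // /a/b, no quota of its own
        assert firstSetQuotaParent(Arrays.asList(root, a, b)) == a;
        assert firstSetQuotaParent(Arrays.asList(root, b)) == null;
        System.out.println("ok");
    }
}
```

A null return is what lets the caller conclude that no quota is involved and skip the usage computation entirely.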
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766295#comment-17766295 ] ASF GitHub Bot commented on HDFS-16000:
---

zhuxiangyi commented on code in PR #2964:
URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328454449

##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java:
##
@@ -470,17 +475,53 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
(hunk identical to the one quoted earlier in this digest; the comment targets the line:)
+    } else if (srcParentNode != dstParentNode || tx.withCount != null) {

Review Comment:
This condition checks whether the inode is in a snapshot (isSrcInSnapshot); if it is, we calculate the quotaCount. I will change this to test isSrcInSnapshot directly.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766292#comment-17766292 ] ASF GitHub Bot commented on HDFS-16000: --- zhuxiangyi commented on code in PR #2964: URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328445332

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java:

@@ -470,17 +475,53 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
       }
     } finally {
       if (undoRemoveSrc) {
-        tx.restoreSource();
+        tx.restoreSource(srcStoragePolicyCounts);
       }
       if (undoRemoveDst) {
         // Rename failed - restore dst
-        tx.restoreDst(bsps);
+        tx.restoreDst(bsps, dstStoragePolicyCounts);
       }
     }
     NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
         + "failed to rename " + src + " to " + dst);
     throw new IOException("rename from " + src + " to " + dst + " failed.");
   }

+  /*
+   * Calculate QuotaCounts based on parent directory and storage policy.
+   * 1. If the storage policies of src and dst differ,
+   *    calculate the QuotaCounts of src and dst separately.
+   * 2. If no parent node of src or dst has a quota set,
+   *    there is no need to calculate the QuotaCount.
+   * 3. If parent nodes of src and dst have quotas configured,
+   *    the QuotaCount is calculated once using the storage policy of src.
+   */
+  private static void computeQuotaCounts(
+      QuotaCounts srcStoragePolicyCounts,
+      QuotaCounts dstStoragePolicyCounts,
+      INodesInPath srcIIP,
+      INodesInPath dstIIP,
+      BlockStoragePolicySuite bsps,
+      RenameOperation tx) {
+    INode dstParent = dstIIP.getINode(-2);
+    INode srcParentNode = FSDirectory.getFirstSetQuotaParentNode(srcIIP);
+    INode srcInode = srcIIP.getLastINode();
+    INode dstParentNode = FSDirectory.getFirstSetQuotaParentNode(dstIIP);
+    byte srcStoragePolicyID = FSDirectory.getStoragePolicyId(srcInode);
+    byte dstStoragePolicyID = FSDirectory.getStoragePolicyId(dstParent);
+    if (srcStoragePolicyID != dstStoragePolicyID) {
+      srcStoragePolicyCounts.add(srcIIP.getLastINode()
+          .computeQuotaUsage(bsps));
+      dstStoragePolicyCounts.add(srcIIP.getLastINode()

Review Comment: Thanks for finding this problem. If the inode sets a StoragePolicy, we should use the inode's StoragePolicy in the calculation. I will fix it.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766194#comment-17766194 ] ASF GitHub Bot commented on HDFS-16000: --- Hexiaoqiao commented on code in PR #2964: URL: https://github.com/apache/hadoop/pull/2964#discussion_r1328244599

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:

@@ -1468,6 +1475,30 @@ static Collection normalizePaths(Collection paths,
     return normalized;
   }

+  /**
+   * Get the first Node that sets Quota.
+   */
+  static INode getFirstSetQuotaParentNode(INodesInPath iip) {
+    for (int i = iip.length() - 1; i > 0; i--) {
+      INode currNode = iip.getINode(i);
+      if (currNode == null) {

Review Comment: Will it meet null here? If that is not expected, we should throw an exception, IMO.

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java:

@@ -470,17 +475,53 @@ static RenameResult unprotectedRenameTo(FSDirectory fsd,
       }
     } finally {
       if (undoRemoveSrc) {
-        tx.restoreSource();
+        tx.restoreSource(srcStoragePolicyCounts);
       }
       if (undoRemoveDst) {
         // Rename failed - restore dst
-        tx.restoreDst(bsps);
+        tx.restoreDst(bsps, dstStoragePolicyCounts);
       }
     }
     NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: "
         + "failed to rename " + src + " to " + dst);
     throw new IOException("rename from " + src + " to " + dst + " failed.");
   }

+  /*
+   * Calculate QuotaCounts based on parent directory and storage policy.
+   * 1. If the storage policies of src and dst differ,
+   *    calculate the QuotaCounts of src and dst separately.
+   * 2. If no parent node of src or dst has a quota set,
+   *    there is no need to calculate the QuotaCount.
+   * 3. If parent nodes of src and dst have quotas configured,
+   *    the QuotaCount is calculated once using the storage policy of src.
+   */
+  private static void computeQuotaCounts(
+      QuotaCounts srcStoragePolicyCounts,
+      QuotaCounts dstStoragePolicyCounts,
+      INodesInPath srcIIP,
+      INodesInPath dstIIP,
+      BlockStoragePolicySuite bsps,
+      RenameOperation tx) {
+    INode dstParent = dstIIP.getINode(-2);
+    INode srcParentNode = FSDirectory.getFirstSetQuotaParentNode(srcIIP);
+    INode srcInode = srcIIP.getLastINode();
+    INode dstParentNode = FSDirectory.getFirstSetQuotaParentNode(dstIIP);
+    byte srcStoragePolicyID = FSDirectory.getStoragePolicyId(srcInode);
+    byte dstStoragePolicyID = FSDirectory.getStoragePolicyId(dstParent);
+    if (srcStoragePolicyID != dstStoragePolicyID) {
+      srcStoragePolicyCounts.add(srcIIP.getLastINode()
+          .computeQuotaUsage(bsps));
+      dstStoragePolicyCounts.add(srcIIP.getLastINode()

Review Comment: IIUC, this result will be used for the next verification and for the storage-used addition/subtraction on the src and dst inodes, right? But I am confused whether it could run into an issue here: given directory /a/b (whose storage policy is HDD) and /c/d (whose storage policy is SSD), when renaming from /a/b/r1 (assume 1GB used) to /c/d/r2, the total HDD storage used will decrease by 1GB and the SSD storage used will increase by 1GB — won't this differ from the facts?
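The reviewer's cross-policy example can be made concrete with a small model of per-storage-type quota charging. This is only an illustration of how the charged usage shifts on such a rename; the enum and method here are simplified stand-ins, not the Hadoop StorageType/QuotaCounts API, and a real policy maps to a per-replica list of types rather than a single type:

```java
import java.util.EnumMap;
import java.util.Map;

final class TypeQuotaSketch {
  enum StorageType { DISK, SSD }

  // Usage charged against per-storage-type quotas for a subtree of the
  // given size under the given (simplified, single-type) policy.
  static Map<StorageType, Long> charge(StorageType policy, long bytes,
                                       int replication) {
    Map<StorageType, Long> m = new EnumMap<>(StorageType.class);
    m.put(StorageType.DISK, 0L);
    m.put(StorageType.SSD, 0L);
    m.put(policy, bytes * replication);  // all replicas charged to one type
    return m;
  }
}
```

In the /a/b/r1 (DISK policy, 1 GB) to /c/d/r2 (SSD policy) example, the charged usage moves from DISK to SSD immediately at rename time, which is exactly the discrepancy the review comment is probing: the quota bookkeeping changes even though the replicas themselves have not been migrated yet.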
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764470#comment-17764470 ] ASF GitHub Bot commented on HDFS-16000: --- Hexiaoqiao commented on PR #2964: URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1716891085 @zhuxiangyi Thanks for your contributions. Will review this week.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764469#comment-17764469 ] ASF GitHub Bot commented on HDFS-16000: --- zhuxiangyi commented on PR #2964: URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1716889059 @Hexiaoqiao @jojochuang Can you review it for me?
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764242#comment-17764242 ] ASF GitHub Bot commented on HDFS-16000: --- hadoop-yetus commented on PR #2964: URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1715903240

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:--------|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 32m 19s | | trunk passed |
| +1 :green_heart: | compile | 0m 57s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 0m 50s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | checkstyle | 0m 45s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 56s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 48s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 10s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 1m 59s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 3s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 0m 46s | | the patch passed |
| +1 :green_heart: | compile | 0m 46s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 0m 46s | | the patch passed |
| +1 :green_heart: | compile | 0m 43s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | javac | 0m 43s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 35s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 46s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 38s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 7s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 1m 50s | | the patch passed |
| +1 :green_heart: | shadedclient | 21m 42s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| +1 :green_heart: | unit | 194m 52s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 39s | | The patch does not generate ASF License warnings. |
| | | 288m 5s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2964 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 0f8c921cbb81 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / b4ac1fa18a1bac8fc5b31bd7eb46093b1315c8ff |
| Default Java | Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/6/testReport/ |
| Max. process+thread count | 3576 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/6/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763839#comment-17763839 ] ASF GitHub Bot commented on HDFS-16000: --- hadoop-yetus commented on PR #2964: URL: https://github.com/apache/hadoop/pull/2964#issuecomment-1714295665

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:--------|
| +0 :ok: | reexec | 0m 28s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 32m 28s | | trunk passed |
| +1 :green_heart: | compile | 0m 55s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 0m 49s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | checkstyle | 0m 44s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 56s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 18s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 2m 2s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 28s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 0m 46s | | the patch passed |
| +1 :green_heart: | compile | 0m 46s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 0m 46s | | the patch passed |
| +1 :green_heart: | compile | 0m 42s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | javac | 0m 42s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 35s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 49s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 38s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 10s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 1m 52s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 8s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| -1 :x: | unit | 195m 49s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 39s | | The patch does not generate ASF License warnings. |
| | | 290m 23s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshotRename |
| | hadoop.hdfs.server.namenode.TestFSImageWithSnapshot |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2964 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux bfbb4eff844e 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / b3ebd6c491dac676e6d68f4f72737557e492a49a |
| Default Java | Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2964/5/testReport/ |
| Max. process+thread count | 3402 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419086#comment-17419086 ] JiangHua Zhu commented on HDFS-16000: - Hello [~zhuxiangyi], I am also paying attention to this improvement. Could you share some relevant test reports? Thank you very much.

> HDFS : Rename performance optimization
> --
>
> Key: HDFS-16000
> URL: https://issues.apache.org/jira/browse/HDFS-16000
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Affects Versions: 3.1.4, 3.3.1
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: 20210428-143238.svg, 20210428-171635-lambda.svg, HDFS-16000.patch
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Moving a large directory with rename takes a long time. For example, it takes about 40 seconds to move a directory with 10 million (1000W) entries. When a large amount of data is deleted to the trash, such a large-directory move happens whenever the trash makes a checkpoint. A user may also trigger a large-directory move directly, which holds the NameNode lock so long that the NameNode can be killed by ZKFC. The flame graph shows that most of the time is spent creating EnumCounters objects.
> h3. I think the following two points can optimize the efficiency of rename execution
> h3. QuotaCount calculation optimization:
> * Create one QuotaCounts object for the whole directory quotaCount calculation and pass it to each nested calculation as a parameter, so that an EnumCounters object is not created for every calculation.
> * In addition, the flame graph shows that updating QuotaCounts through a lambda takes longer than a plain method call, so a plain method is used to update the QuotaCounts count.
> h3. Rename logic optimization:
> * Regardless of the source and target directories involved, a rename currently computes the quota count three times: first to check whether the moved directory exceeds the target directory's quota, second to compute the moved directory's quota in order to update the source directory's quota, and third to compute the moved directory's quota to update the target directory.
> * Some of these three quota calculations are unnecessary. For example, if no parent directory of either the source or the target has a quota configured, there is no need to compute the quotaCount at all. Even when both sides use quotas, the quota does not need to be computed three times: the first and third calculations are identical, so the count only needs to be computed once.
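The EnumCounters point in the description — pass one mutable accumulator down the recursion instead of allocating a counters object per inode — can be sketched with a toy tree model (these classes are illustrative stand-ins, not the real INode/QuotaCounts code):

```java
import java.util.ArrayList;
import java.util.List;

final class CountersSketch {
  // Simplified stand-in for QuotaCounts: namespace and storage-space counts.
  static final class Counters {
    long nsCount;
    long ssCount;
  }

  // Simplified stand-in for an inode with children.
  static final class Dir {
    final long fileSize;
    final List<Dir> children = new ArrayList<>();
    Dir(long fileSize) { this.fileSize = fileSize; }
  }

  // Original style: each recursive call allocates and returns a fresh
  // Counters, so a walk over N inodes allocates N objects.
  static Counters computeAllocating(Dir d) {
    Counters c = new Counters();
    c.nsCount = 1;
    c.ssCount = d.fileSize;
    for (Dir child : d.children) {
      Counters sub = computeAllocating(child);  // fresh object per call
      c.nsCount += sub.nsCount;
      c.ssCount += sub.ssCount;
    }
    return c;
  }

  // Optimized style: the caller allocates one accumulator and the
  // recursion mutates it in place, so the whole walk allocates once.
  static void computeInto(Dir d, Counters acc) {
    acc.nsCount += 1;
    acc.ssCount += d.fileSize;
    for (Dir child : d.children) {
      computeInto(child, acc);
    }
  }
}
```

Both walks produce identical totals; the second simply trades the per-node allocation (the hot spot in the flame graphs attached to this issue) for a single object created by the caller.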
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339562#comment-17339562 ] zhu commented on HDFS-16000: [~daryn] Thanks for your comments. Please review [rename performance optimization|https://github.com/apache/hadoop/pull/2964].
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337540#comment-17337540 ] Daryn Sharp commented on HDFS-16000: Nice. Quota optimization has been on my to-do list for years esp. since storage types made it much more expensive. Quota calculations have historically been buggy so this needs to be carefully reviewed. I'm currently consumed with finishing a 50-100X block placement performance optimization so please wait to commit until mid next week so I can hopefully carve out some time.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337014#comment-17337014 ] zhu commented on HDFS-16000: [~hexiaoqiao] Thank you for your comments and suggestions. I will address these warnings and add tests this week.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335450#comment-17335450 ] Xiaoqiao He commented on HDFS-16000: [~zhuxiangyi] Thanks for your report and contribution. This is a good idea and a good improvement. BTW, I just noticed that some unit tests failed and there are some checkstyle/javadoc warnings. Would you mind taking another look? Also, it is enough to submit the patch either here or on GitHub only; there is no need to submit to both. Thanks again.
[jira] [Commented] (HDFS-16000) HDFS : Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17334703#comment-17334703 ] Hadoop QA commented on HDFS-16000: -1 overall

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 21m 49s | | Docker mode activated. |
|| || Prechecks || || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|| || trunk Compile Tests || || ||
| +1 | mvninstall | 33m 6s | | trunk passed |
| +1 | compile | 1m 20s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 5s | | trunk passed |
| +1 | mvnsite | 1m 22s | | trunk passed |
| +1 | shadedclient | 18m 0s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 51s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 1m 25s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 23m 30s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 3m 16s | | trunk passed |
|| || Patch Compile Tests || || ||
| -1 | mvninstall | 0m 41s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/592/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch failed. |
| -1 | compile | 0m 46s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/592/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. |
| -1 | javac | 0m 46s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/592/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. |
| -1 | compile | 0m 44s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/592/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10. |
| -1 | javac | 0m 44s | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/592/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10. |
| -0 | checkstyle | 1m 1s |