errose28 commented on code in PR #7314:
URL: https://github.com/apache/ozone/pull/7314#discussion_r1844180851
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java:
##########
@@ -449,13 +453,21 @@ public long optimizeDirDeletesAndSubmitRequest(long remainNum,
deletedDirsCount.addAndGet(dirNum + subdirDelNum);
movedDirsCount.addAndGet(subDirNum - subdirDelNum);
movedFilesCount.addAndGet(subFileNum);
+ long timeTakenInIteration = Time.monotonicNow() - startTime;
LOG.info("Number of dirs deleted: {}, Number of sub-dir " +
"deleted: {}, Number of sub-files moved:" +
" {} to DeletedTable, Number of sub-dirs moved {} to " +
- "DeletedDirectoryTable, iteration elapsed: {}ms," +
+ "DeletedDirectoryTable, limit per iteration: {}, iteration elapsed: {}ms, " +
" totalRunCount: {}",
- dirNum, subdirDelNum, subFileNum, (subDirNum - subdirDelNum),
- Time.monotonicNow() - startTime, getRunCount());
+ dirNum, subdirDelNum, subFileNum, (subDirNum - subdirDelNum), limit,
+ timeTakenInIteration, getRunCount());
+ if (remainNum <= 0) {
+ LOG.warn("Limit for no. of directory+files that can be deleted in one iteration is reached. " +
+ "Current limit: {} = {}",
+ OMConfigKeys.OZONE_PATH_DELETING_LIMIT_PER_TASK, ozoneManager.getConfiguration()
+ .getInt(OMConfigKeys.OZONE_PATH_DELETING_LIMIT_PER_TASK,
+ OMConfigKeys.OZONE_PATH_DELETING_LIMIT_PER_TASK_DEFAULT));
+ }
Review Comment:
I guess the Jira description doesn't currently specify what to log when the
deletion limits are hit:
> At the end of each run, an INFO log should display how many items were
processed, what the current limit of items per run is, and how long it took.
If the run time of the iteration exceeded the configured interval, the
message should highlight this and be elevated to WARN.
This should also log the config keys that can remedy this for batch size,
interval, or number of threads.
However, based on past experience troubleshooting slow deletion in busy
clusters with many other events going on, I would be in favor of also
elevating this to WARN when the limits are hit, to indicate that the system
is under load and deletion may be under-provisioned. Per the Jira, though,
this should be a single summary message that carries all of this information
and is logged at the level that matches its content.
Also worth noting: once the multi-threaded implementation is complete, it
should take a lot more to hit the deletion limit, and this message will not
seem so spurious if 10 threads are being maxed out trying to process the
current deletions.
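As a rough sketch of what I mean by a single summary message whose level
depends on its content: the class, method, and parameter names below are
hypothetical and not existing Ozone code, and the config key names are passed
in as plain strings rather than assumed. It only illustrates the level
selection and message assembly, not how it would be wired into the service.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Ozone): one summary line per deletion run. */
final class DeletionRunSummary {
  enum Level { INFO, WARN }

  private DeletionRunSummary() { }

  /** WARN if the run hit its item limit or ran longer than the configured interval. */
  static Level level(long processed, long limit, long elapsedMs, long intervalMs) {
    boolean limitHit = processed >= limit;
    boolean overran = elapsedMs > intervalMs;
    return (limitHit || overran) ? Level.WARN : Level.INFO;
  }

  /** Builds the summary text; remediation hints are appended only when relevant. */
  static String message(long processed, long limit, long elapsedMs, long intervalMs,
      String limitKey, String intervalKey) {
    StringBuilder sb = new StringBuilder(String.format(Locale.ROOT,
        "Processed %d items (limit per run: %d) in %dms (interval: %dms).",
        processed, limit, elapsedMs, intervalMs));
    if (processed >= limit) {
      sb.append(" Limit reached; consider raising ").append(limitKey).append('.');
    }
    if (elapsedMs > intervalMs) {
      sb.append(" Run exceeded the interval; consider tuning ")
          .append(intervalKey).append('.');
    }
    return sb.toString();
  }
}
```

The caller would then pick `LOG.warn` or `LOG.info` based on the returned
level and emit the message once per run, instead of logging a separate WARN
alongside the INFO summary.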
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]