devmadhuu commented on code in PR #8945:
URL: https://github.com/apache/ozone/pull/8945#discussion_r2280772305
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/NSSummaryTask.java:
##########
@@ -189,14 +189,13 @@ public TaskResult process(
@Override
public TaskResult reprocess(OMMetadataManager omMetadataManager) {
// Unified control for all NSS tree rebuild operations
- RebuildState currentState = REBUILD_STATE.get();
- if (currentState == RebuildState.RUNNING) {
- LOG.info("NSSummary tree rebuild is already in progress, skipping duplicate request.");
- return buildTaskResult(false);
- }
+ // Use a single atomic operation to prevent race conditions
+ RebuildState previousState = REBUILD_STATE.getAndSet(RebuildState.RUNNING);
- if (!REBUILD_STATE.compareAndSet(currentState, RebuildState.RUNNING)) {
- LOG.info("Failed to acquire rebuild lock, another thread may have started rebuild.");
+ if (previousState == RebuildState.RUNNING) {
+ LOG.info("NSSummary tree rebuild is already in progress, skipping duplicate request.");
+ // Restore the previous state since we didn't actually start
+ REBUILD_STATE.set(RebuildState.RUNNING);
Review Comment:
Thanks @adoroszlai for the review. Here are the flaky-test workflow run
results on the master branch and on the HDDS-13573 branch, which contains the fix.
master branch flaky test workflow run results:
https://github.com/devmadhuu/ozone/actions/runs/16983139469
HDDS-13573 branch flaky test workflow run results:
https://github.com/devmadhuu/ozone/actions/runs/17018339113
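For context, here is a minimal standalone sketch of the guard pattern in the hunk above. It is not the Recon code: `RebuildGuard`, `tryStart`, `finish`, and the `IDLE`/`FAILED` states are hypothetical names used only for illustration. It shows that `getAndSet(RUNNING)` lets exactly one caller see a non-`RUNNING` previous value. When the previous value was already `RUNNING`, the swap leaves the state unchanged, so no restore step is needed.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the rebuild guard; only RUNNING appears in the hunk,
// the other state names are assumptions made for this sketch.
final class RebuildGuard {
  enum RebuildState { IDLE, RUNNING, FAILED }

  private final AtomicReference<RebuildState> state =
      new AtomicReference<>(RebuildState.IDLE);

  /** Returns true only for the caller that moved the state to RUNNING. */
  boolean tryStart() {
    // One atomic swap replaces the separate get() and compareAndSet() calls.
    RebuildState previous = state.getAndSet(RebuildState.RUNNING);
    // If previous was RUNNING, the value is still RUNNING: nothing to restore.
    return previous != RebuildState.RUNNING;
  }

  /** Releases the guard once the rebuild has finished. */
  void finish(boolean success) {
    state.set(success ? RebuildState.IDLE : RebuildState.FAILED);
  }

  RebuildState current() {
    return state.get();
  }
}
```

Under these assumptions, the first `tryStart()` returns `true`, any call made while the rebuild is running returns `false`, and a call after `finish(...)` succeeds again.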
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]