ArafatKhan2198 commented on code in PR #9243:
URL: https://github.com/apache/ozone/pull/9243#discussion_r2511024902
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ContainerKeyMapperHelper.java:
##########
@@ -99,21 +110,38 @@ public static boolean reprocess(OMMetadataManager
omMetadataManager,
// Get the appropriate table based on BucketLayout
Table<String, OmKeyInfo> omKeyInfoTable =
omMetadataManager.getKeyTable(bucketLayout);
- // Iterate through the table and process keys
- try (TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
keyIter = omKeyInfoTable.iterator()) {
- while (keyIter.hasNext()) {
- Table.KeyValue<String, OmKeyInfo> kv = keyIter.next();
- handleKeyReprocess(kv.getKey(), kv.getValue(), containerKeyMap,
containerKeyCountMap,
- reconContainerMetadataManager);
- omKeyCount++;
-
- // Check and flush data if it reaches the batch threshold
- if (!checkAndCallFlushToDB(containerKeyMap,
containerKeyFlushToDBMaxThreshold,
- reconContainerMetadataManager)) {
- LOG.error("Failed to flush container key data for {}", taskName);
- return false;
+ ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
Review Comment:
**Evaluated but decided against due to performance impact.** We initially
implemented fair locking (`new ReentrantReadWriteLock(true)`), but observed a
**60%+ performance degradation** in testing (throughput dropped from ~60K
keys/sec to ~7K keys/sec).
**Alternative mitigation implemented:**
```
AtomicBoolean isFlushingInProgress = new AtomicBoolean(false);
if (map.size() >= threshold
    && isFlushingInProgress.compareAndSet(false, true)) {
  // Only ONE thread attempts flush
}
```
This flag-based coordination prevents the mass queueing scenario while
maintaining performance. In our 8.7M-key workload, a flush happens only ~58
times in total (once every 150K keys), which makes starvation statistically
unlikely. The performance cost of fair locking was not justified by that risk.
Open to revisiting if starvation is observed in production.
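For reviewers who want to see the whole pattern in one place, here is a minimal, self-contained sketch of the flag-based coordination described above. It is not the code from this PR: the class `FlushGuardSketch`, its fields, and the in-memory "flush" (which only clears the map and counts entries) are hypothetical stand-ins, and it assumes that workers insert under the read lock while the flush takes the write lock of a default (non-fair) `ReentrantReadWriteLock`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration of a CAS-guarded flush; not the PR's implementation. */
class FlushGuardSketch {
  private final Map<String, Integer> map = new ConcurrentHashMap<>();
  // Default constructor gives a non-fair lock (the fair variant is `new ReentrantReadWriteLock(true)`).
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final AtomicBoolean isFlushingInProgress = new AtomicBoolean(false);
  private final AtomicInteger flushCount = new AtomicInteger();
  private final AtomicInteger flushedEntries = new AtomicInteger();
  private final int threshold;

  FlushGuardSketch(int threshold) {
    this.threshold = threshold;
  }

  /** Worker path: insert under the read lock, then maybe trigger a flush. */
  void put(String key, int value) {
    lock.readLock().lock();
    try {
      map.put(key, value);
    } finally {
      // Release before flushing: a read lock cannot be upgraded to a write lock.
      lock.readLock().unlock();
    }
    // Only the thread that wins the compare-and-set goes on to flush;
    // the others skip the flush instead of queueing for the write lock.
    if (map.size() >= threshold && isFlushingInProgress.compareAndSet(false, true)) {
      try {
        flush();
      } finally {
        isFlushingInProgress.set(false); // always reopen the gate
      }
    }
  }

  /** Flush path: exclusive access while the map is drained. */
  void flush() {
    lock.writeLock().lock();
    try {
      if (map.isEmpty()) {
        return;
      }
      // Stand-in for a real DB write: record how many entries were drained.
      flushedEntries.addAndGet(map.size());
      map.clear();
      flushCount.incrementAndGet();
    } finally {
      lock.writeLock().unlock();
    }
  }

  int pending() { return map.size(); }
  int flushCount() { return flushCount.get(); }
  int flushedEntries() { return flushedEntries.get(); }
}
```

With this shape, at most one thread at a time ever requests the write lock because of the threshold check, so the non-fair lock never has a crowd of writers competing with readers. A final explicit `flush()` after all workers finish drains whatever is left below the threshold.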
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]