ArafatKhan2198 commented on code in PR #9243:
URL: https://github.com/apache/ozone/pull/9243#discussion_r2509201307
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ContainerKeyMapperHelper.java:
##########
@@ -99,21 +110,38 @@ public static boolean reprocess(OMMetadataManager
omMetadataManager,
// Get the appropriate table based on BucketLayout
Table<String, OmKeyInfo> omKeyInfoTable =
omMetadataManager.getKeyTable(bucketLayout);
- // Iterate through the table and process keys
- try (TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>>
keyIter = omKeyInfoTable.iterator()) {
- while (keyIter.hasNext()) {
- Table.KeyValue<String, OmKeyInfo> kv = keyIter.next();
- handleKeyReprocess(kv.getKey(), kv.getValue(), containerKeyMap,
containerKeyCountMap,
- reconContainerMetadataManager);
- omKeyCount++;
-
- // Check and flush data if it reaches the batch threshold
- if (!checkAndCallFlushToDB(containerKeyMap,
containerKeyFlushToDBMaxThreshold,
- reconContainerMetadataManager)) {
- LOG.error("Failed to flush container key data for {}", taskName);
- return false;
+ ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
+ // Use parallel table iteration
+ Function<Table.KeyValue<String, OmKeyInfo>, Void> kvOperation = kv -> {
+ try {
+ try {
+ lock.readLock().lock();
+ handleKeyReprocess(kv.getKey(), kv.getValue(), containerKeyMap,
containerKeyCountMap,
+ reconContainerMetadataManager);
+ } finally {
+ lock.readLock().unlock();
+ }
+ omKeyCount.incrementAndGet();
+ if (containerKeyMap.size() >= containerKeyFlushToDBMaxThreshold) {
Review Comment:
Let me know if I have this right:
Many threads can see `size ≥ threshold` at the same time and queue up for the
write lock, but only the first one actually needs to flush.
By the time a waiting thread acquires the write lock, the first thread may
have already flushed and cleared the map. The waiting thread then performs a
redundant check and releases the lock, which causes lock contention and
duplicate flush attempts.
Is this understanding correct, @devmadhuu?
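
To make the scenario concrete, here is a minimal standalone sketch of the
pattern as I understand it. It is not the PR's code: the class
`FlushGuardSketch`, its counters, and `flushIfStillNeeded` are hypothetical
names, and the "flush" is just a `clear()` on an in-memory map. It assumes the
flush path re-checks the threshold after taking the write lock, so a late
thread pays for the lock but skips the flush.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; names do not come from the PR.
public class FlushGuardSketch {
  private final Map<String, Integer> buffer = new ConcurrentHashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int threshold;

  public final AtomicInteger flushes = new AtomicInteger();
  public final AtomicInteger redundantAttempts = new AtomicInteger();
  public final AtomicInteger flushedEntries = new AtomicInteger();

  public FlushGuardSketch(int threshold) {
    this.threshold = threshold;
  }

  // Writers share the read lock, so many threads can add concurrently.
  public void add(String key, int value) {
    lock.readLock().lock();
    try {
      buffer.put(key, value);
    } finally {
      lock.readLock().unlock();
    }
    // Unlocked check: several threads may all observe size >= threshold.
    if (buffer.size() >= threshold) {
      flushIfStillNeeded();
    }
  }

  // Re-checks the threshold under the write lock. A thread that arrives
  // after another thread already flushed only counts a redundant attempt.
  public boolean flushIfStillNeeded() {
    lock.writeLock().lock();
    try {
      if (buffer.size() < threshold) {
        redundantAttempts.incrementAndGet();
        return false;
      }
      flushedEntries.addAndGet(buffer.size());
      buffer.clear(); // stand-in for the real flush to the DB
      flushes.incrementAndGet();
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  public int buffered() {
    return buffer.size();
  }

  public static void main(String[] args) throws InterruptedException {
    FlushGuardSketch sketch = new FlushGuardSketch(100);
    Thread[] workers = new Thread[4];
    for (int t = 0; t < workers.length; t++) {
      final int id = t;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < 1000; i++) {
          sketch.add("key-" + id + "-" + i, i);
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      w.join();
    }
    // Counts vary from run to run; redundantAttempts shows the pile-up.
    System.out.println("flushes=" + sketch.flushes.get()
        + " redundantAttempts=" + sketch.redundantAttempts.get()
        + " accounted=" + (sketch.flushedEntries.get() + sketch.buffered()));
  }
}
```

With the re-check in place the duplicate attempts are harmless for
correctness, but each one still serializes on the write lock and briefly
blocks every reader. A non-zero `redundantAttempts` in a run of this sketch
would be the contention described above.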
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]