[GitHub] [ozone] ivandika3 commented on a diff in pull request #4696: HDDS-8580. Reduce memory usage in ContainerKeyMapperTask#reprocess

via GitHub Wed, 17 May 2023 07:39:44 -0700


ivandika3 commented on code in PR #4696:
URL: https://github.com/apache/ozone/pull/4696#discussion_r1196626237



##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ContainerKeyMapperTask.java:
##########
@@ -383,4 +443,64 @@ private void handlePutOMKeyEvent(String key, OmKeyInfo omKeyInfo,
     }
   }
 
+  /**
+   * Write an OM key to container DB and update containerID -> no. of keys
+   * count to the Global Stats table.
+   *
+   * @param key key String
+   * @param omKeyInfo omKeyInfo value
+   * @param containerKeyMap we keep the added containerKeys in this map
+   *                        to allow incremental batching to containerKeyTable
+   * @param containerKeyCountMap we keep the containerKey counts in this map 
+   *                             to allow batching to containerKeyCountTable
+   *                             after reprocessing is done
+   * @throws IOException if unable to write to recon DB.
+   */
+  private void handleKeyReprocess(String key,

Review Comment:
   Thank you for the review.
   > if it does not have it we just easily set the key count to zero
   
   If the container does not exist during the reprocess (when 
`handleKeyReprocess` is called), it won't be set in the `containerKeyCountMap` 
at all and therefore won't be persisted to the Recon container DB. I think the 
comment "keyCount will be 0 if containerID is not found" in 
`handleKeyReprocess` is a bit misleading, since it was copied from 
`handlePutOMKeyEvent`, where 
`ReconContainerMetadataManager#getKeyCountForContainer` returns 0 if the 
container is not there. I will remove the comment to clear things up.
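
A minimal sketch of that distinction, assuming plain `HashMap`s as stand-ins for the Recon table and the in-memory map. The class and method names here are hypothetical and are not the actual Recon code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real ContainerKeyMapperTask.
class MissingContainerSketch {

  // Event-path stand-in: a lookup that reports 0 for an unknown container,
  // which is the behaviour the comment attributes to
  // ReconContainerMetadataManager#getKeyCountForContainer.
  static long keyCountOrZero(Map<Long, Long> table, long containerId) {
    return table.getOrDefault(containerId, 0L);
  }

  // Reprocess-path stand-in: only containers referenced by at least one key
  // ever get an entry, so an unseen container is absent rather than 0.
  static Map<Long, Long> countDuringReprocess(long[] containerIdsOfKeys) {
    Map<Long, Long> containerKeyCountMap = new HashMap<>();
    for (long containerId : containerIdsOfKeys) {
      containerKeyCountMap.merge(containerId, 1L, Long::sum);
    }
    return containerKeyCountMap;
  }
}
```

With this shape, a container that no key refers to never appears in the map, so nothing is written for it when the map is flushed.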
   
   > we don't ask from the reconContainerMetadataManager if the container 
exists, we rather ask if the containerKeyCountMap
   
   This is correct.
   
   To summarize, the major differences between `handleKeyReprocess` and 
`handlePutOMKeyEvent` are:
   
   1. It doesn't need to keep track of `deletedContainerKeyList`, since reprocess 
only inserts (PUT) container info.
   2. All `(containerId -> key count)` operations are done in memory 
(no need to access `containerKeyCountTable`). The counts are flushed only after 
the whole reprocess is done.
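
The second point can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: `ReprocessCountSketch`, `persistedCounts`, and `flush` are hypothetical names, a `HashMap` stands in for `containerKeyCountTable`, and each call is assumed to represent one new (container, key) pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real ContainerKeyMapperTask.
class ReprocessCountSketch {

  // containerId -> number of keys seen so far during this reprocess.
  private final Map<Long, Long> containerKeyCountMap = new HashMap<>();

  // Stand-in for the persisted containerKeyCountTable.
  private final Map<Long, Long> persistedCounts = new HashMap<>();

  // Called once per new (container, key) pair; touches only the in-memory map.
  void handleKeyReprocess(long containerId) {
    containerKeyCountMap.merge(containerId, 1L, Long::sum);
  }

  // Called once, after the whole reprocess is done.
  void flush() {
    persistedCounts.putAll(containerKeyCountMap);
    containerKeyCountMap.clear();
  }

  // Returns null when nothing has been persisted for the container.
  Long persistedCount(long containerId) {
    return persistedCounts.get(containerId);
  }
}
```

Until `flush()` runs, the persisted stand-in stays empty, which mirrors the idea that the count table is written only once at the end of the reprocess.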
   
   Hope this clarifies.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
