pratyakshsharma commented on code in PR #6662:
URL: https://github.com/apache/hudi/pull/6662#discussion_r972062851


##########
hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncClient.java:
##########
@@ -83,18 +87,24 @@ public boolean isBootstrap() {
     return metaClient.getTableConfig().getBootstrapBasePath().isPresent();
   }
 
-  public boolean isDropPartition() {
+  /**
+   * Get the set of dropped partitions based on the latest commit metadata.
+   * Returns empty set if the latest commit was not due to DELETE_PARTITION operation.
+   */
+  public Set<String> getDroppedPartitions() {
     try {
-      Option<HoodieCommitMetadata> hoodieCommitMetadata = HoodieTableMetadataUtil.getLatestCommitMetadata(metaClient);
+      Option<HoodieCommitMetadata> hoodieCommitMetadata = getLatestCommitMetadata(metaClient);

Review Comment:
   I believe this is still a problem. Consider a scenario where three commits happen in the following order, with no sync to the metastore in between:
   1. upsert
   2. drop_partition
   3. drop_partition
   
   If we only look at the latest commit metadata here, we will miss the partitions dropped in commit 2, and they will never be removed from the metastore. I think we should check the metadata of every commit since the last sync time with the metastore and collect the dropped partitions from all of them.
   
   It would also be good to add a test case simulating this scenario so that this behavior stays intact in the future.
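
   To illustrate the idea, here is a minimal, self-contained sketch. It does not use the Hudi API: `Commit`, `droppedPartitionsSince`, and the instant times and partition paths are hypothetical stand-ins, and it assumes instant times are strings that sort lexicographically. It only shows the difference between reading the latest commit and scanning every commit after the last sync.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DroppedPartitionsSketch {

  /** Hypothetical stand-in for one completed commit on the timeline; not a Hudi class. */
  static final class Commit {
    final String instantTime;
    final String operation;
    final List<String> partitions;

    Commit(String instantTime, String operation, List<String> partitions) {
      this.instantTime = instantTime;
      this.operation = operation;
      this.partitions = partitions;
    }
  }

  /**
   * Collects partitions dropped by every drop_partition commit that is
   * strictly newer than lastSyncTime, instead of looking only at the latest commit.
   */
  static Set<String> droppedPartitionsSince(List<Commit> timeline, String lastSyncTime) {
    Set<String> dropped = new LinkedHashSet<>();
    for (Commit commit : timeline) {
      // Skip commits that were already covered by the previous sync.
      if (commit.instantTime.compareTo(lastSyncTime) <= 0) {
        continue;
      }
      if ("drop_partition".equals(commit.operation)) {
        dropped.addAll(commit.partitions);
      }
    }
    return dropped;
  }

  public static void main(String[] args) {
    // The scenario from this comment: upsert, then two drop_partition commits, no sync in between.
    List<Commit> timeline = Arrays.asList(
        new Commit("001", "upsert", Arrays.asList("2022/09/01")),
        new Commit("002", "drop_partition", Arrays.asList("2022/09/01")),
        new Commit("003", "drop_partition", Arrays.asList("2022/09/02")));

    // Scanning everything after the last sync ("000") finds both partitions.
    System.out.println(droppedPartitionsSince(timeline, "000"));
    // Looking only at the latest commit is equivalent to starting after "002".
    System.out.println(droppedPartitionsSince(timeline, "002"));
  }
}
```

   With the three commits above, the first call returns both dropped partitions, while the second returns only the one from commit 3. A test for the real change could follow the same shape: create the three commits without syncing, run the sync once, and assert that both partitions are gone from the metastore.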



