pratyakshsharma opened a new issue, #6578:
URL: https://github.com/apache/hudi/issues/6578

   **Describe the problem you faced**
   
   There may be an issue in the syncPartitions flow in HiveSyncTool. 
The isDropPartition flag is derived from the latest commit's metadata only 
[here](https://github.com/apache/hudi/blob/98c3d88b2f3635199f307c608e10015fffa0df73/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java#L203)
 and is later used to drop all the partitions passed in the variable 
partitionStoragePartitions 
[here](https://github.com/apache/hudi/blob/98c3d88b2f3635199f307c608e10015fffa0df73/hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/HoodieSyncClient.java#L146-L148).
 Consider the scenario where the following actions are performed in consecutive 
commits without syncing to the Hive metastore in between:
   
   - upsert
   - drop_partition
   
   In this case, the next sync would drop all the partitions affected by both 
commits, including the ones written by the upsert, which is not the desired behaviour.
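
The scenario can be illustrated with a small toy model. This is only a sketch of the behaviour as described above, not Hudi code: `LatestCommitFlagSketch`, `Commit`, `Op` and `partitionsToDrop` are hypothetical names, and the partition values are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are Hudi classes.
class LatestCommitFlagSketch {
    enum Op { UPSERT, DROP_PARTITION }

    // One commit on the timeline and the partitions it touched.
    static final class Commit {
        final Op op;
        final List<String> partitions;

        Commit(Op op, String... partitions) {
            this.op = op;
            this.partitions = Arrays.asList(partitions);
        }
    }

    // Mimics the reported behaviour: one flag, taken from the latest commit,
    // is applied to every partition touched since the last sync.
    static Set<String> partitionsToDrop(List<Commit> commitsSinceLastSync) {
        if (commitsSinceLastSync.isEmpty()) {
            return Collections.<String>emptySet();
        }
        Set<String> touched = new LinkedHashSet<>();
        for (Commit c : commitsSinceLastSync) {
            touched.addAll(c.partitions);
        }
        Commit latest = commitsSinceLastSync.get(commitsSinceLastSync.size() - 1);
        boolean isDropPartition = latest.op == Op.DROP_PARTITION;
        return isDropPartition ? touched : Collections.<String>emptySet();
    }

    public static void main(String[] args) {
        List<Commit> timeline = Arrays.asList(
            new Commit(Op.UPSERT, "2022/09/01"),
            new Commit(Op.DROP_PARTITION, "2022/08/01"));
        // The upserted partition 2022/09/01 is in the drop set as well.
        System.out.println(partitionsToDrop(timeline));
    }
}
```

In this model the upserted partition lands in the drop set only because the commit after it happened to be a drop_partition.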
   
   **Expected behavior**
   
   Partitions to drop should be determined by checking the metadata of each 
individual commit since the last sync with the metastore, not only the latest 
commit's metadata.
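
A minimal sketch of what per-commit handling could look like, again as a toy model rather than a proposed patch. `PerCommitSyncSketch`, `Commit`, `Op`, `Action` and `plan` are hypothetical names, and the rule that the most recent operation on a partition wins is an assumption of this sketch, not something Hudi is known to implement.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hudi classes.
class PerCommitSyncSketch {
    enum Op { UPSERT, DROP_PARTITION }
    enum Action { ADD_OR_UPDATE, DROP }

    // One commit on the timeline and the partitions it touched.
    static final class Commit {
        final Op op;
        final List<String> partitions;

        Commit(Op op, String... partitions) {
            this.op = op;
            this.partitions = Arrays.asList(partitions);
        }
    }

    // Replays commits oldest-first and records one action per partition.
    // Assumption: the most recent operation on a partition wins.
    static Map<String, Action> plan(List<Commit> commitsSinceLastSync) {
        Map<String, Action> plan = new LinkedHashMap<>();
        for (Commit c : commitsSinceLastSync) {
            Action action = (c.op == Op.DROP_PARTITION) ? Action.DROP : Action.ADD_OR_UPDATE;
            for (String partition : c.partitions) {
                plan.put(partition, action);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Commit> timeline = Arrays.asList(
            new Commit(Op.UPSERT, "2022/09/01"),
            new Commit(Op.DROP_PARTITION, "2022/08/01"));
        // Only 2022/08/01 is marked DROP; 2022/09/01 is kept.
        System.out.println(plan(timeline));
    }
}
```

With the upsert followed by drop_partition scenario, only the partition named in the drop_partition commit is marked for dropping.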


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.