zhongyujiang commented on PR #6662:
URL: https://github.com/apache/paimon/pull/6662#issuecomment-3570406650

   > if possible, it's best for us to maintain the status quo.
   
   I think that in Paimon's manifest, `partCol = null` and `partCol = ''` are 
actually treated as two distinct partitions, because their `_PARTITION` values 
in the manifest are different. Querying the files table or partitions table 
shows two separate partitions too, for both FileSystemCatalog and HiveCatalog.
   
   In other words, Paimon's metadata currently treats these as two different 
partitions. The only overlap is that the data files of both are stored under 
the same path, `__DEFAULT_PARTITION__`.
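
   To make the distinction concrete, here is a toy Python sketch. It is not 
Paimon code: `partition_dir`, `drop_by_value`, `drop_by_path`, the dict-based 
"manifest", and the `col=value` directory layout are all hypothetical names and 
assumptions used only for illustration. It shows two partition values that stay 
distinct in metadata while sharing one directory, and why a drop keyed on the 
directory would remove both.

```python
from typing import Dict, List, Optional

DEFAULT_PARTITION_NAME = "__DEFAULT_PARTITION__"


def partition_dir(col: str, value: Optional[str]) -> str:
    # Assumed path mapping: null and empty string both fall back to the default name.
    if value is None or value == "":
        return f"{col}={DEFAULT_PARTITION_NAME}"
    return f"{col}={value}"


# Toy stand-in for manifest metadata: partition value -> data files.
# None and "" are separate keys, mirroring the two distinct partitions.
manifest: Dict[Optional[str], List[str]] = {
    None: ["data-0.parquet"],
    "": ["data-1.parquet"],
    "a": ["data-2.parquet"],
}


def drop_by_value(m: Dict[Optional[str], List[str]], value: Optional[str]):
    # Drop only the partition whose recorded value matches exactly.
    return {k: v for k, v in m.items() if k != value}


def drop_by_path(m: Dict[Optional[str], List[str]], col: str, value: Optional[str]):
    # Drop every partition that maps to the same directory (conflates None and "").
    target = partition_dir(col, value)
    return {k: v for k, v in m.items() if partition_dir(col, k) != target}


if __name__ == "__main__":
    print(partition_dir("partCol", None) == partition_dir("partCol", ""))
    print(sorted(map(repr, drop_by_value(manifest, ""))))
    print(sorted(map(repr, drop_by_path(manifest, "partCol", ""))))
```

   In this sketch, dropping by the exact value keeps the null partition intact, 
whereas dropping by path removes the null and empty-string partitions together.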
   
   Are you saying that when using HiveCatalog, these two partitions get merged 
into a single partition when synced to HMS? And when a partition is dropped, 
is the data for both null and the empty string deleted together? I’m not clear 
on this point.
   
   > Can this solve your business problem?
   
   The issue we’re encountering now is that data with `partCol = ''` cannot be 
deleted using either DELETE FROM or ALTER TABLE ... DROP PARTITION. Instead, 
the statement erroneously deletes the data where `partCol = null`, which is not 
the expected behavior. So in this PR I fixed the pushdown of partition values 
to ensure that `truncatePartitions` works correctly.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
