zhongyujiang commented on PR #6662:
URL: https://github.com/apache/paimon/pull/6662#issuecomment-3569384180

   @JingsongLi Thanks for looking at this.
   
   I think this issue occurs for all catalogs. The root cause is not in the catalog implementation itself but in Spark's current partition-dropping logic, which incorrectly transforms partition values using `InternalRowPartitionComputer`. Here's how the issue unfolds:
   
   When executing `ALTER TABLE T DROP PARTITION (pt='')`, `PaimonPartitionManagement#toPaimonPartitions` uses `InternalRowPartitionComputer` to convert the Spark row into a partition spec of type `Map<String, String>`. During this conversion, the empty string (`''`) is replaced with the default partition value, resulting in `(pt='__DEFAULT_PARTITION__')`.
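   Roughly, this first conversion behaves as sketched below. This is a minimal, hypothetical simplification for illustration (the `toPartitionSpec` method, the single `pt` column, and the hard-coded default name are assumptions, not Paimon's actual API):
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   public class SpecConversionSketch {
       // Paimon's default name substituted for "empty" partition values
       // (configurable in real Paimon; hard-coded here for the sketch).
       static final String DEFAULT_PARTITION = "__DEFAULT_PARTITION__";
   
       // Hypothetical simplification of how the partition spec is generated
       // from a row value: empty or null values are replaced with the
       // default partition name.
       static Map<String, String> toPartitionSpec(String ptValue) {
           Map<String, String> spec = new HashMap<>();
           spec.put("pt",
                   (ptValue == null || ptValue.isEmpty()) ? DEFAULT_PARTITION : ptValue);
           return spec;
       }
   
       public static void main(String[] args) {
           // DROP PARTITION(pt='') arrives as an empty string...
           System.out.println(toPartitionSpec(""));
           // ...and leaves as {pt=__DEFAULT_PARTITION__}.
       }
   }
   ```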
   
   This spec is then passed to `FileStoreCommitImpl#dropPartitions` to delete the partition.
   
   Inside `dropPartitions`, the partition spec `(pt='__DEFAULT_PARTITION__')` is converted back into a Paimon internal row via `InternalRowPartitionComputer#convertSpecToInternalRow` to construct the partition predicate. However, at this stage, `'__DEFAULT_PARTITION__'` is converted to `null` (see the code below):
   
   
https://github.com/apache/paimon/blob/4a7cdf445f0a726266ecb3dd36b36c823c79e615/paimon-common/src/main/java/org/apache/paimon/utils/InternalRowPartitionComputer.java#L135-L153
   
   As a result, data with `pt = ''` is not deleted; instead, data with `pt = null` gets deleted (if it exists), leading to incorrect behavior.
   
   This also happens in `DELETE FROM`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
