zhongyujiang commented on PR #6662:
URL: https://github.com/apache/paimon/pull/6662#issuecomment-3569384180
@JingsongLi Thanks for looking at this.
I think this issue occurs for all catalogs. The root cause is in Spark's
current partition-dropping logic, which incorrectly transforms partition values
using `InternalRowPartitionComputer`, rather than in the catalog implementation itself.
Here's how the issue unfolds:
When executing `ALTER TABLE T DROP PARTITION(pt='')`,
`PaimonPartitionManagement#toPaimonPartitions` uses `InternalRowPartitionComputer`
to convert the Spark row into a partition spec of type `Map<String, String>`.
During this conversion, the empty string (`''`) is replaced with the default
partition value, resulting in `(pt='__DEFAULT_PARTITION__')`.
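To make the lossy step concrete, here is a minimal, self-contained sketch of that conversion. The class and method names (`SpecConversionSketch`, `toSpec`) are illustrative stand-ins, not the actual Paimon code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the spec-building step described above;
// names are illustrative, not the real Paimon implementation.
public class SpecConversionSketch {

    // Stand-in for Paimon's default partition name.
    static final String DEFAULT_PARTITION = "__DEFAULT_PARTITION__";

    // Mimics the conversion: an empty (or null) partition value is
    // replaced with the default partition name when building the spec.
    static Map<String, String> toSpec(String field, String value) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put(field, value == null || value.isEmpty() ? DEFAULT_PARTITION : value);
        return spec;
    }

    public static void main(String[] args) {
        // DROP PARTITION(pt='') ends up with the spec {pt=__DEFAULT_PARTITION__}
        System.out.println(toSpec("pt", ""));
    }
}
```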
This spec is then passed to `FileStoreCommitImpl#dropPartitions` to delete the
partition.
Inside `dropPartitions`, the partition spec `(pt='__DEFAULT_PARTITION__')` is
converted back into a Paimon internal row via
`InternalRowPartitionComputer#convertSpecToInternalRow` to construct the
partition predicate. However, at this stage, `'__DEFAULT_PARTITION__'` is
converted to `null` (see code below):
https://github.com/apache/paimon/blob/4a7cdf445f0a726266ecb3dd36b36c823c79e615/paimon-common/src/main/java/org/apache/paimon/utils/InternalRowPartitionComputer.java#L135-L153
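The reverse mapping can be sketched in isolation as follows. Again, the class and method names (`SpecToRowSketch`, `specToRowValue`) are hypothetical; this only mirrors the single behavior at issue, namely that the default partition name maps back to `null` rather than to the original empty string:

```java
// Hypothetical sketch mirroring the behavior described for
// InternalRowPartitionComputer#convertSpecToInternalRow on one value;
// names are illustrative, not the real Paimon implementation.
public class SpecToRowSketch {

    // Stand-in for Paimon's default partition name.
    static final String DEFAULT_PARTITION = "__DEFAULT_PARTITION__";

    // The default partition name is mapped back to null, not to the
    // original empty string, so the round trip '' -> default -> null is lossy.
    static String specToRowValue(String specValue) {
        return DEFAULT_PARTITION.equals(specValue) ? null : specValue;
    }

    public static void main(String[] args) {
        // The predicate therefore targets pt IS NULL instead of pt = ''.
        System.out.println(specToRowValue("__DEFAULT_PARTITION__")); // prints "null"
    }
}
```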
As a result, data with `pt = ''` is not deleted; instead, data with `pt = null`
gets deleted (if it exists), leading to incorrect behavior.
This also happens in `DELETE FROM`.
--
This is an automated message from the Apache Git Service.