zhongyujiang opened a new pull request, #6662: URL: https://github.com/apache/paimon/pull/6662
### Purpose

Currently, when Spark is used to delete Paimon partitions, the deletion fails or removes the wrong data if the partition value is NULL or an empty string. The issue affects `ALTER TABLE ... DROP PARTITION`, `DELETE FROM`, and `TRUNCATE PARTITION`.

**ALTER TABLE T DROP PARTITION and DELETE FROM**

Dropping a partition with `partitionField = ''` erroneously deletes data where `partitionField` is null.

The root cause is that, during deletion, the partition values passed by Spark SQL are replaced by `InternalRowPartitionComputer` with the default partition value. However, the partition filter should match on the original (real) partition values, because the manifest files store the real partition values, not the default ones.

This PR resolves the issue by preserving the original partition values (instead of replacing them with the default partition value) when computing partitions for data deletion: `InternalRowPartitionComputer preserveNullOrEmptyValue`

**TRUNCATE PARTITION**

Throws a NullPointerException (NPE) when the partition value is null. Fixed by handling null values safely.

### Tests

Added in the PR.

### API and Format

No.

### Documentation

No.
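To illustrate the root cause described above, here is a minimal, self-contained sketch. It is not the code from this PR and not Paimon's actual `InternalRowPartitionComputer`. The class `PartitionValueSketch`, the method `compute`, and the constant `DEFAULT_PARTITION_NAME` are hypothetical names, and the placeholder string is an assumption. The boolean parameter borrows the name `preserveNullOrEmptyValue` from the description, not its real signature. The sketch only shows that substituting a default value makes null and `''` indistinguishable, whereas preserving the original values keeps them apart.

```java
import java.util.Objects;

// Hypothetical illustration only; not Paimon's implementation.
class PartitionValueSketch {

    // Assumed placeholder standing in for the configured default partition value.
    static final String DEFAULT_PARTITION_NAME = "__DEFAULT_PARTITION__";

    // Returns the value used to identify a partition.
    // When preserveNullOrEmptyValue is false, null and "" both collapse to the default.
    // When it is true, the original value is returned unchanged.
    static String compute(String value, boolean preserveNullOrEmptyValue) {
        if (preserveNullOrEmptyValue) {
            return value;
        }
        return (value == null || value.isEmpty()) ? DEFAULT_PARTITION_NAME : value;
    }

    public static void main(String[] args) {
        // Without preservation, '' and null map to the same value,
        // so a delete aimed at one can also hit the other.
        System.out.println(Objects.equals(compute("", false), compute(null, false))); // true

        // With preservation, the two stay distinct.
        System.out.println(Objects.equals(compute("", true), compute(null, true))); // false
    }
}
```

In this simplified model, a deletion path would call `compute(value, true)` so that the resulting filter can be compared against the real values stored in the manifest files.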
