zhongyujiang opened a new pull request, #6662:
URL: https://github.com/apache/paimon/pull/6662

   
   ### Purpose
   
   When Spark deletes Paimon partitions whose partition value is NULL or an 
empty string, the deletion either fails or removes data from a different 
partition. The problem affects `ALTER TABLE ... DROP PARTITION`, `DELETE 
FROM`, and `TRUNCATE PARTITION`.
   
   **ALTER TABLE T DROP PARTITION and DELETE FROM**
   Dropping a partition with `partitionField = ''` also deletes data where 
`partitionField` is NULL.
   
   The root cause is that, during deletion, `InternalRowPartitionComputer` 
replaces the partition values passed by Spark SQL with the default partition 
value. The partition filter, however, must match against the original (real) 
partition values, because the manifest files store the real partition values, 
not the default ones.
   
   This PR fixes the issue by preserving the original partition values, 
rather than replacing them with the default partition value, when computing 
partitions for data deletion (see `InternalRowPartitionComputer 
preserveNullOrEmptyValue`).
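
The collision described above can be illustrated with a minimal sketch. This is not Paimon's actual `InternalRowPartitionComputer`: the class `PartitionValueSketch`, its methods, and the placeholder constant are hypothetical, and only the flag name `preserveNullOrEmptyValue` is taken from this PR. It assumes that NULL and empty-string values are both mapped to one default partition name unless the flag is set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not Paimon's real partition computer. */
final class PartitionValueSketch {
    // Assumed placeholder for the default partition name, for illustration only.
    static final String DEFAULT_PARTITION = "__DEFAULT_PARTITION__";

    /**
     * Builds a partition spec from raw partition values.
     * With preserveNullOrEmptyValue == false, null and "" both collapse to the
     * default partition name; with true, the original values are kept.
     */
    static Map<String, String> toSpec(Map<String, String> raw, boolean preserveNullOrEmptyValue) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            String v = e.getValue();
            boolean nullOrEmpty = v == null || v.isEmpty();
            spec.put(e.getKey(), nullOrEmpty && !preserveNullOrEmptyValue ? DEFAULT_PARTITION : v);
        }
        return spec;
    }

    /** Convenience helper: a one-entry map that tolerates a null value. */
    static Map<String, String> single(String key, String value) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }
}
```

Under these assumptions, `toSpec(single("pt", ""), false)` and `toSpec(single("pt", null), false)` produce the same spec, so a filter built from either one cannot tell the two partitions apart. With the flag set to `true` the two specs stay distinct.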
   
   
   
   **TRUNCATE PARTITION**
   Throws a NullPointerException (NPE) when the partition value is null.
   
   Fixed by safely handling null values.
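
The description does not say where the NPE was raised. As a rough sketch, assuming it came from calling a method on a null partition value while converting it to a string, null-safe handling could look like the following. The class and method names here are hypothetical and do not come from the PR.

```java
/** Hypothetical helper for illustration; not code from this PR. */
final class NullSafePartitionValue {
    /**
     * Converts a partition value to its string form.
     * Returns null for a null input instead of calling toString() on it,
     * which is the call that would throw a NullPointerException.
     */
    static String toPartitionString(Object value) {
        return value == null ? null : value.toString();
    }
}
```

Returning null rather than a placeholder keeps the value distinguishable from an empty string, which matches the intent of preserving real partition values described above.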
   
   
   ### Tests
   
   Added in the PR.
   
   ### API and Format
   
   No.
   
   ### Documentation
   
   No.
   

