hantangwangd opened a new issue, #15128:
URL: https://github.com/apache/iceberg/issues/15128

   ### Apache Iceberg version
   
   1.10.1 (latest release)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   Executing the following statements in Spark (on Iceberg) leads to a mismatch 
between the actual and expected query results:
   
   ```sql
   CREATE TABLE test_table (id bigint NOT NULL, data binary) USING iceberg PARTITIONED BY (data);
   INSERT INTO TABLE test_table VALUES (1, X'e3bcd1'), (2, X'bcd1');
   DELETE FROM test_table WHERE data = X'bcd1';
   SELECT * FROM test_table WHERE data = X'e3bcd1';
   ```
   
   The expected result is the remaining data row, but the query returns an empty result. Upon investigation, the partition bounds for the binary column in the newly written manifest file are computed incorrectly, causing the matching data file to be pruned during query planning.
   
   PrestoDB encounters the same issue when using 
`DeleteFiles.deleteFromRowFilter` to support file-level deletion.
   
   Digging deeper, the root cause is that when `DeleteFiles.deleteFromRowFilter` is called, the `PartitionFieldStats` min/max fields directly reference a reusable byte array. That array is reused by the `ManifestReader` as it processes multiple files, so the stored bounds are silently overwritten with values from later files.
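   The aliasing pattern described above can be illustrated in isolation. This is a minimal sketch, not Iceberg's actual classes: `reusable` stands in for the reader's shared buffer, and the two methods contrast keeping a reference versus taking a defensive copy when recording a bound.
   
   ```java
   import java.nio.ByteBuffer;
   import java.util.Arrays;
   
   // Illustrative sketch of the aliasing bug: a bound stored as a view over a
   // reusable array changes retroactively when the array is reused.
   public class BoundAliasing {
       static byte[] reusable = new byte[1];
   
       // Buggy: the returned buffer wraps (aliases) the shared array.
       static ByteBuffer boundByReference() {
           return ByteBuffer.wrap(reusable);
       }
   
       // Fixed: a defensive copy makes the recorded bound independent.
       static ByteBuffer boundByCopy() {
           return ByteBuffer.wrap(Arrays.copyOf(reusable, reusable.length));
       }
   
       public static void main(String[] args) {
           reusable[0] = (byte) 0xbc;               // first file's partition value
           ByteBuffer aliased = boundByReference();
           ByteBuffer copied = boundByCopy();
   
           reusable[0] = (byte) 0xe3;               // reader reuses the array for the next file
   
           System.out.println(aliased.get(0) == (byte) 0xe3); // aliased bound was corrupted
           System.out.println(copied.get(0) == (byte) 0xbc);  // copied bound survived
       }
   }
   ```
   
   With defensive copies at the point where min/max are recorded, reuse of the reader's buffer no longer corrupts earlier bounds.
   
   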
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
