Zoltán Borók-Nagy created IMPALA-13087:
------------------------------------------

             Summary: DML operations on Iceberg tables should not write 
position delete files for data files that are completely removed
                 Key: IMPALA-13087
                 URL: https://issues.apache.org/jira/browse/IMPALA-13087
             Project: IMPALA
          Issue Type: Bug
            Reporter: Zoltán Borók-Nagy


Users sometimes use the DELETE operation even in cases where they should use 
TRUNCATE or DROP PARTITION. In those cases the DELETE writes far more position 
delete records than necessary, which hurts the performance of subsequent 
queries. On top of that, these delete records are unnecessary, because we 
should just remove the corresponding data files from the new snapshot.

We need to make the DML operations smarter so that they only write position 
delete records if they don't delete whole files. The IcebergBufferedDeleteSink 
has the FilePositions type: 
https://github.com/apache/impala/blob/bbfba13ed4d084681b542d7c5e1b5156576a603b/be/src/exec/iceberg-buffered-delete-sink.h#L66
It is a mapping from data files to the positions we are about to delete. After 
SortBufferedRecords() the positions are in order and there are no duplicates. 
Therefore, if pos_vector.back() == pos_vector.size() - 1, we know we are about 
to delete a contiguous range of positions from 0 to N - 1, where 
N = pos_vector.size(). At this point we need to look up the number of records 
in the corresponding data file, and if that number is N, we know we are about 
to delete the whole file.
In this case we shouldn't write the position delete records, but should instead 
register the data file in dml_exec_state_ for deletion.
Then in the IcebergCatalogOpExecutor we should use Iceberg's DeleteFiles to 
remove the registered data files in the same Iceberg transaction as the 
RowDelta operation.
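
The whole-file check described above could be sketched roughly as follows. 
This is only an illustration, not the actual Impala code: the FilePositions 
alias here is a simplified stand-in for the real type, and DeletesWholeFile(), 
FindWholeFileDeletes() and the row_counts map are hypothetical names. How the 
sink would actually obtain the record count of each data file is left open in 
this issue, so it is modelled as a plain lookup table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in (assumption) for the sink's FilePositions mapping:
// data file path -> positions that are about to be deleted.
using FilePositions = std::map<std::string, std::vector<int64_t>>;

// Returns true if 'positions' covers every row of a file with 'num_rows' rows.
// Precondition: positions are non-negative, sorted and free of duplicates,
// which is the state described after SortBufferedRecords().
bool DeletesWholeFile(const std::vector<int64_t>& positions, int64_t num_rows) {
  // Debug-only check that the positions are strictly increasing.
  assert(std::adjacent_find(positions.begin(), positions.end(),
                            std::greater_equal<int64_t>()) == positions.end());
  if (positions.empty()) return false;
  const int64_t n = static_cast<int64_t>(positions.size());
  // Strictly increasing, non-negative values whose last element is n - 1
  // must be exactly 0, 1, ..., n - 1.
  return positions.back() == n - 1 && n == num_rows;
}

// Returns the data files that would be deleted completely. 'row_counts' is a
// hypothetical lookup from data file path to its number of records; files
// without a known count are conservatively treated as partial deletes.
std::vector<std::string> FindWholeFileDeletes(
    const FilePositions& file_positions,
    const std::map<std::string, int64_t>& row_counts) {
  std::vector<std::string> whole_files;
  for (const auto& entry : file_positions) {
    auto it = row_counts.find(entry.first);
    if (it == row_counts.end()) continue;
    if (DeletesWholeFile(entry.second, it->second)) {
      whole_files.push_back(entry.first);
    }
  }
  return whole_files;
}
```

Files returned by such a helper would be registered for removal instead of 
getting position delete records; all other files would still go through the 
normal position delete path.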

The DELETE statement is the most critical here, but UPDATE and MERGE might also 
benefit from this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
