Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/21099 to look at the new patch set (#2). Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations ...................................................................... IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations We must invoke validateDataFilesExist for RowDelta operations (DELETE/ UPDATE/MERGE). Without this a concurrent RewriteFiles (compaction) and RowDelta can corrupt a table. IcebergBufferedDeleteSink now also collects the filenames of the data files that are referenced in the position delete files. It adds them to the DML exec state which is then collected by the Coordinator. The Coordinator passes the file paths to CatalogD which executes Iceberg's RowDelta operation and now invokes validateDataFilesExist() with the file paths. Additionally it also invokes validateDeletedFiles(). This patch set also resolves IMPALA-12640 which is about replacing IcebergDeleteSink with IcebergBufferedDeleteSink, as from now on we use the buffered version for all DML operations that write position delete files. Testing: * adds new stress test with DELETE + UPDATE + OPTIMIZE Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c --- M be/src/exec/CMakeLists.txt M be/src/exec/iceberg-buffered-delete-sink.cc M be/src/exec/iceberg-buffered-delete-sink.h M be/src/exec/iceberg-delete-sink-config.cc D be/src/exec/iceberg-delete-sink.cc D be/src/exec/iceberg-delete-sink.h M be/src/exec/multi-table-sink.cc M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/CatalogService.thrift M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/planner/IcebergBufferedDeleteSink.java D fe/src/main/java/org/apache/impala/planner/IcebergDeleteSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M tests/stress/test_update_stress.py 20 files changed, 163 insertions(+), 580 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/21099/2 -- To view, visit http://gerrit.cloudera.org:8080/21099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c Gerrit-Change-Number: 21099 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>