Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21099 )

Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta 
operations
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h
File be/src/runtime/dml-exec-state.h:

http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@135
PS1, Line 135:   // Returns vector of Iceberg data files referenced by position 
delete records by
> since it returns const ref, can the function also be const?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@170
PS1, Line 170: _;
> nit: data_files_referenced_by_position_deletes
That's a better name indeed. Done.


http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto@115
PS1, Line 115: data_files_referenced_by_position_deletes
> nit: data_files_referenced_by_position_deletes?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift@249
PS1, Line 249: data_files_referenced_by_position_deletes
> nit: data_files_referenced_by_position_deletes maybe?
Done


http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test:

http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test@421
PS2, Line 421: BUFFERED DELETE FROM ICEBERG 
[functional_parquet.iceberg_v2_partitioned_position_deletes-POSITION-DELETE]
> I'm wondering if getting rid of DELETE FROM ICEBERG in this patch is how ti
I think it's smaller risk to use IcebergBufferedDeleteSink:
* IcebergBufferedDeleteSink is already used for UPDATEs
* It was straightforward to add the new logic to IcebergBufferedDeleteSink
* Developing this funcionality in IcebergDeleteSink has the risk of introducing 
bugs
* Developing this funcionality in IcebergDeleteSink is also redundant as we 
plan to get rid of the whole sink (IMPALA-12640)
* In future it's better to only maintain one delete sink implementation
* we haven't done perf measurements of the operators, as DELETEs are not too 
performance critical. But AFAICT IcebergBufferedDeleteSink should be more 
performant (especially if we remove the SORT node completely in the future).


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py
File tests/stress/test_update_stress.py:

http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@269
PS1, Line 269: read also invokes OPTMIZE regularly
> nit: this table name omits optimize. test_concurrent_write_and_optimize?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@277
PS1, Line 277:     num_rows = 10
> Could you add a comment that here you prepare the INSERT query so that thes
It's not crucial for the purpose of the test to do it like this. I only do it 
because this way we spare some time.


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@283
PS1, Line 283:
> I know we've been using this variable name for a while now, but I feel that
Done


http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py
File tests/stress/test_update_stress.py:

http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py@279
PS2, Line 279:     # Prepare INSERT statement of 'num_rows' records.
> nit: I meant to add it to a comment that this code prefers the INSERT in su
Yeah, I was hesitant to add that part as it is not important for the purpose of 
the test. We could also have separate INSERT statements, and the test would do 
its work. Commenting that might make the impression in the reader that it is 
important to do it that way, and otherwise the test won't work. But we only do 
it that way because to save some time.
However, if you still think it's important that comment, I can add it.



--
To view, visit http://gerrit.cloudera.org:8080/21099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Gerrit-Change-Number: 21099
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Mar 2024 08:06:44 +0000
Gerrit-HasComments: Yes

Reply via email to