[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error

Iceberg tables can be identity-partitioned by any type, e.g. int, date and even float. If a table is partitioned, the file path contains the partition value in human-readable form, and this form is expected to be passed to CatalogD. When an UPDATE or DELETE command is executed, we don't transform the integer date value to human-readable format, which causes errors in CatalogD.

With this patch, we transform identity-partitioned date values to human-readable format.

Note on floating point numbers:
When users partition their data via floating point values (users should not do that), the file paths created for delete files might not correspond to the data files (e.g. '1.1' vs '1.10023841858'). However, the values are the same in the Iceberg metadata layer, so this doesn't cause correctness issues.

Testing:
 * added e2e tests for DELETEs
 * added e2e tests for UPDATEs

Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Reviewed-on: http://gerrit.cloudera.org:8080/20954
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M be/src/exec/iceberg-delete-sink-base.cc
M be/src/exec/iceberg-delete-sink-base.h
M be/src/exec/table-sink-base.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/test/java/org/apache/impala/util/IcebergUtilTest.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-delete-partitioned.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-partitions.test
15 files changed, 160 insertions(+), 74 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
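For readers following the thread, the date conversion described in the commit message can be sketched roughly as below. This is only an illustration in Python, not the code from the patch (the actual change is in the C++ backend and the Java frontend). It assumes the integer date value is a count of days since 1970-01-01, and the helper names `date_partition_to_path_value` and `partition_path`, as well as the column name `d`, are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the integer DATE partition value counts days since the Unix epoch.
EPOCH = date(1970, 1, 1)


def date_partition_to_path_value(days_since_epoch: int) -> str:
    """Render an integer date value as a human-readable 'YYYY-MM-DD' string."""
    return (EPOCH + timedelta(days=days_since_epoch)).isoformat()


def partition_path(column: str, days_since_epoch: int) -> str:
    """Build a 'column=value' path segment (hypothetical helper for illustration)."""
    return f"{column}={date_partition_to_path_value(days_since_epoch)}"


if __name__ == "__main__":
    # Without the conversion the segment would carry the raw integer, e.g. 'd=19747'.
    print(partition_path("d", 19747))
```

With the conversion in place, the path segment built for a delete file uses the same textual form as the data file's partition directory, rather than the raw integer.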
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 3: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Sat, 27 Jan 2024 12:27:06 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10200/ DRY_RUN=false

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Sat, 27 Jan 2024 07:56:38 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 3: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Sat, 27 Jan 2024 07:56:37 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 2: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 26 Jan 2024 22:04:33 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15065/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 26 Jan 2024 15:28:10 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 2:

(3 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/20954/1/be/src/exec/iceberg-delete-sink-base.cc
File be/src/exec/iceberg-delete-sink-base.cc:

http://gerrit.cloudera.org:8080/#/c/20954/1/be/src/exec/iceberg-delete-sink-base.cc@91
PS1, Line 91: if (IsTimestamp(scalar_type) || IsDateTime(scalar_type)) {
> I don't think we need these DCHECKs here since you give an error from the b
Done

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java:

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java@21
PS1, Line 21: import org.apache.impala.catalog.ScalarType;
> Is this used?
Done

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@562
PS1, Line 562: transformParam),
> nit: For me it was a bit misleading that this param got into the same line
Done

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 26 Jan 2024 15:02:12 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Hello Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20954

to look at the new patch set (#2).

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error

Iceberg tables can be identity-partitioned by any type, e.g. int, date and even float. If a table is partitioned, the file path contains the partition value in human-readable form, and this form is expected to be passed to CatalogD. When an UPDATE or DELETE command is executed, we don't transform the integer date value to human-readable format, which causes errors in CatalogD.

With this patch, we transform identity-partitioned date values to human-readable format.

Note on floating point numbers:
When users partition their data via floating point values (users should not do that), the file paths created for delete files might not correspond to the data files (e.g. '1.1' vs '1.10023841858'). However, the values are the same in the Iceberg metadata layer, so this doesn't cause correctness issues.

Testing:
 * added e2e tests for DELETEs
 * added e2e tests for UPDATEs

Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
---
M be/src/exec/iceberg-delete-sink-base.cc
M be/src/exec/iceberg-delete-sink-base.h
M be/src/exec/table-sink-base.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/test/java/org/apache/impala/util/IcebergUtilTest.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-delete-partitioned.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-partitions.test
15 files changed, 160 insertions(+), 74 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/20954/2

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 1: Code-Review+1

(3 comments)

Thanks for the quick fix, Zoltan! In general the patch seems fine, I only have some nits.

http://gerrit.cloudera.org:8080/#/c/20954/1/be/src/exec/iceberg-delete-sink-base.cc
File be/src/exec/iceberg-delete-sink-base.cc:

http://gerrit.cloudera.org:8080/#/c/20954/1/be/src/exec/iceberg-delete-sink-base.cc@91
PS1, Line 91: DCHECK(!IsTimestamp(scalar_type));
I don't think we need these DCHECKs here since you give an error from the below if anyway.

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java:

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java@21
PS1, Line 21: import org.apache.impala.catalog.PrimitiveType;
Is this used?

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/20954/1/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@562
PS1, Line 562: transformParam), Type.fromTScalarType(field.getType(;
nit: For me it was a bit misleading that this param got into the same line as the end of IcebergPartitionTransform constructor call. Maybe in a new line it would express better for the reader that it belongs to the IcebergPartitionField creation.

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Comment-Date: Fri, 26 Jan 2024 11:37:13 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 1: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Comment-Date: Thu, 25 Jan 2024 22:07:53 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15055/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Comment-Date: Thu, 25 Jan 2024 17:57:38 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error

Iceberg tables can be identity-partitioned by any type, e.g. int, date and even float. If a table is partitioned, the file path contains the partition value in human-readable form, and this form is expected to be passed to CatalogD. When an UPDATE or DELETE command is executed, we don't transform the integer date value to human-readable format, which causes errors in CatalogD.

With this patch, we transform identity-partitioned date values to human-readable format.

Note on floating point numbers:
When users partition their data via floating point values (users should not do that), the file paths created for delete files might not correspond to the data files (e.g. '1.1' vs '1.10023841858'). However, the values are the same in the Iceberg metadata layer, so this doesn't cause correctness issues.

Testing:
 * added e2e tests for DELETEs
 * added e2e tests for UPDATEs

Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
---
M be/src/exec/iceberg-delete-sink-base.cc
M be/src/exec/iceberg-delete-sink-base.h
M be/src/exec/table-sink-base.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/test/java/org/apache/impala/util/IcebergUtilTest.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-delete-partitioned.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-partitions.test
15 files changed, 162 insertions(+), 74 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/20954/1

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
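The floating point note in the commit message can be illustrated with a small Python sketch. It is not taken from the patch. It assumes a 32-bit float partition column and simply shows that two writers printing the same stored value with different precision produce different path strings that still parse back to the same value. The helper `to_float32` is hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# The value as a 32-bit float column would store it (assumption for this sketch).
stored = to_float32(1.1)

# Two textual renderings of the same stored value: a short one and a
# higher-precision one, as two different writers might produce.
short_text = "1.1"
long_text = repr(stored)

# The path strings differ...
paths_differ = short_text != long_text
# ...but both parse back to the same 32-bit value.
same_value = to_float32(float(short_text)) == to_float32(float(long_text))

if __name__ == "__main__":
    print(short_text, long_text, paths_differ, same_value)
```

This mirrors the point made in the commit message: the directory names may disagree, while the value compared at the metadata layer is the same.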
[Impala-ASF-CR] IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20954 )

Change subject: IMPALA-12742: DELETE/UPDATE Iceberg table partitioned by DATE fails with error
..

Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10197/ DRY_RUN=true

--
To view, visit http://gerrit.cloudera.org:8080/20954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506f95527e741efe18c71706e2cdea51b45958b8
Gerrit-Change-Number: 20954
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Noemi Pap-Takacs
Gerrit-Comment-Date: Thu, 25 Jan 2024 17:36:06 +
Gerrit-HasComments: No