Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20711 )
Change subject: IMPALA-12557: DELETE throws DateTimeParseException when deleting from time-partitioned table ...................................................................... Patch Set 3: (6 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/exec/iceberg-delete-sink.cc File be/src/exec/iceberg-delete-sink.cc: http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/exec/iceberg-delete-sink.cc@321 PS2, Line 321: if (!transform_status.ok()) return transform_status; > If HumanReadablePartitionValue returned Status then we could use RETURN_IF_ I started with RETURN_IF_ERROR, but the parameters were a bit awkward as we had an input parameter 'partition_str' and output parameter 'human_readable_str', but in this case we would pass 'value_str' in both, which would make the implementation a bit errorprone due to the aliasing. Or, we could have a single in/out parameter but I didn't like that. This signature made more sense to me, also it is not completely unprecedented in Impala, e.g. StringParser uses similar signatures. http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions-test.cc File be/src/util/iceberg-utility-functions-test.cc: http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions-test.cc@28 PS2, Line 28: EXPECT_EQ("1969", HumanReadableYear(-1)); > All of these new functions get an int64 as param. Would it make sense to te We could, but then we would get a signed overflow which is undefined behavior, so not really appropriate for testing. We could have DCHECKs to limit the accepted values, but I don't think they would have real added value as they would only check DEBUG mode, and invalid values can be observed anytime. I think it would probably make sense to have such extreme values in actual Iceberg metadata, and have e2e tests for them, ensuring that at least they don't crash Impala. It's not easy to write such data without manually modifying the manifest files, as the date/timestamp range is much narrower. We could probably open a Jira about corrupt Iceberg table testing. http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h File be/src/util/iceberg-utility-functions.h: http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h@37 PS2, Line 37: std::string HumanReadableYear(int32_t part_value); > nit: could you extend these comments with examples of these human readable Done http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h@52 PS2, Line 52: > Don't we usually return Status as a return value and the actual return valu See my other comment. http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc File be/src/util/iceberg-utility-functions.cc: http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc@33 PS2, Line 33: int32_t > An int32 would be enough here I think (also for the addition in L55 should Yeah we can do that for all transforms actually, Iceberg also uses 4-byte integers for these transforms. http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc@76 PS2, Line 76: if (parse_result != StringParser::ParseResult::PARSE_SUCCESS) { > I think the status should be the return value of this function, but if we d Added DCHECK. -- To view, visit http://gerrit.cloudera.org:8080/20711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1cfeaed6409289663eb0f65b1ee2ecebd93e6118 Gerrit-Change-Number: 20711 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 22 Nov 2023 13:33:14 +0000 Gerrit-HasComments: Yes
