Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20711 )

Change subject: IMPALA-12557: DELETE throws DateTimeParseException when 
deleting from time-partitioned table
......................................................................


Patch Set 3:

(6 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/exec/iceberg-delete-sink.cc
File be/src/exec/iceberg-delete-sink.cc:

http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/exec/iceberg-delete-sink.cc@321
PS2, Line 321:     if (!transform_status.ok()) return transform_status;
> If HumanReadablePartitionValue returned Status then we could use RETURN_IF_
I started with RETURN_IF_ERROR, but the parameters were a bit awkward as we had 
an input parameter 'partition_str' and output parameter 'human_readable_str', 
but in this case we would pass 'value_str' in both, which would make the 
implementation a bit errorprone due to the aliasing. Or, we could have a single 
in/out parameter but I didn't like that.

This signature made more sense to me, also it is not completely unprecedented 
in Impala, e.g. StringParser uses similar signatures.


http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions-test.cc
File be/src/util/iceberg-utility-functions-test.cc:

http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions-test.cc@28
PS2, Line 28:   EXPECT_EQ("1969", HumanReadableYear(-1));
> All of these new functions get an int64 as param. Would it make sense to te
We could, but then we would get a signed overflow which is undefined behavior, 
so not really appropriate for testing.
We could have DCHECKs to limit the accepted values, but I don't think they 
would have real added value as they would only check DEBUG mode, and invalid 
values can be observed anytime.

I think it would probably make sense to have such extreme values in actual 
Iceberg metadata, and have e2e tests for them, ensuring that at least they 
don't crash Impala. It's not easy to write such data without manually modifying 
the manifest files, as the date/timestamp range is much narrower. We could 
probably open a Jira about corrupt Iceberg table testing.


http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h
File be/src/util/iceberg-utility-functions.h:

http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h@37
PS2, Line 37: std::string HumanReadableYear(int32_t part_value);
> nit: could you extend these comments with examples of these human readable
Done


http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.h@52
PS2, Line 52:
> Don't we usually return Status as a return value and the actual return valu
See my other comment.


http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc
File be/src/util/iceberg-utility-functions.cc:

http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc@33
PS2, Line 33: int32_t
> An int32 would be enough here I think (also for the addition in L55 should
Yeah we can do that for all transforms actually, Iceberg also uses 4-byte 
integers for these transforms.


http://gerrit.cloudera.org:8080/#/c/20711/2/be/src/util/iceberg-utility-functions.cc@76
PS2, Line 76:   if (parse_result != StringParser::ParseResult::PARSE_SUCCESS) {
> I think the status should be the return value of this function, but if we d
Added DCHECK.



--
To view, visit http://gerrit.cloudera.org:8080/20711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cfeaed6409289663eb0f65b1ee2ecebd93e6118
Gerrit-Change-Number: 20711
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 22 Nov 2023 13:33:14 +0000
Gerrit-HasComments: Yes

Reply via email to