Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu
......................................................................


Patch Set 3:

(14 comments)

Thanks for the feedback!

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG@16
PS3, Line 16: To be able to read back Kudu tables written by Impala correctly
            : convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
            : have the same value.
> Ack.
I would prefer to think through the query options later and assume for now the user knows exactly what they are doing if using these options.


http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift@883
PS6, Line 883: conversiyon
> nit: conversion
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@853
PS6, Line 853: boolean convert_to_utc =
> nit: could be camel case
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@906
PS6, Line 906: Expr expr = tmpPartitionKeyExprs.get(j);
> I think this conversion should take place in the planner instead: during th
As we have discussed on chat, I couldn't find any convenient place to do this (though I agree that the current solution is not great).


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java@98
PS6, Line 98: for (Pair<SlotRef, Expr> valueAssignment : modifyStmt_.assignments_) {
> nit: could be camel case
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java@104
PS6, Line 104: Preconditions.checkArgument(timestampExpr.isConstant());
> Is this Precondition and another one in L86 correct? I see some examples wh
Those are added here: https://github.com/apache/impala/blob/5d1bd80623324f829aca604b25d97ace21f51417/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L493


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@753
PS6, Line 753: evel(TExplainLevel.EXTENDED)
> Do you mind adding 1 SELECT test where CONVERT_KUDU_UTC_TIMESTAMPS affect p
The only case where CONVERT_KUDU_UTC_TIMESTAMPS affects the plan is bloom filters in runtime filters.
There are already tests that check this in the profile: https://gerrit.cloudera.org/#/c/20681/22/testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # DST changes in the 'America/Los_Angeles' time zone.
             : table_name:timestamp_at_dst_changes, constraint:restr
> Please add in comment:
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # DST changes in the 'America/Los_Angeles' time zone.
             : table_name:timestamp_at_dst_changes, constraint:restr
> It is also good note to mention that during DST, PDT = UTC-7. While outside
Done


http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test@20
PS5, Line 20: INT
> question: This was BIGINT before and it does not break test when timestamp_
These tests were not verified before this change. The verification comes from the changes in https://gerrit.cloudera.org/#/c/21492/5/tests/common/test_result_verifier.py
The last paragraph of the commit message also mentions this.

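For reference, the DST note above (PDT = UTC-7) and the commit message remark that both query options need the same value can be illustrated with a small sketch. This is only an illustration of the intended semantics using Python's standard `zoneinfo` module, not Impala's implementation; the helper names `utc_offset_hours`, `local_to_utc` and `utc_to_local` are made up for this example, and the dates are arbitrary ones chosen well away from the DST transitions.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Time zone used by the timestamp_at_dst_changes test data.
LA = ZoneInfo("America/Los_Angeles")


def utc_offset_hours(local_naive: datetime) -> float:
    """UTC offset in hours that applies to a local wall-clock time in LA."""
    return local_naive.replace(tzinfo=LA).utcoffset().total_seconds() / 3600


def local_to_utc(local_naive: datetime) -> datetime:
    """Hypothetical stand-in for the write-side local -> UTC conversion."""
    aware = local_naive.replace(tzinfo=LA)
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


def utc_to_local(utc_naive: datetime) -> datetime:
    """Hypothetical stand-in for the read-side UTC -> local conversion."""
    aware = utc_naive.replace(tzinfo=timezone.utc)
    return aware.astimezone(LA).replace(tzinfo=None)


summer = datetime(2024, 7, 1, 12, 0)   # during DST
winter = datetime(2024, 1, 15, 12, 0)  # outside DST

print(utc_offset_hours(summer), utc_offset_hours(winter))

# Both conversions enabled: the written value is read back unchanged.
print(utc_to_local(local_to_utc(summer)) == summer)

# Only the write-side conversion enabled: the reader sees the UTC value.
print(local_to_utc(summer) == summer)
```

With only one of the two conversions applied, the value read back differs from the value written by the UTC offset in effect at that time, which is why the two options are expected to be set consistently.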

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@31
PS6, Line 31:   tkey timestamp primary key, t timestamp,
> nit: maybe following name is better for clarity:
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@66
PS6, Line 66: 
> Add "order by id" for readability?
Changed the results in other tests to be always ordered by id. I can also add order by if you think that it would make it even cleaner.


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@129
PS6, Line 129: 
> Do you mind changing this from select count star to select star for clarity
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py@108
PS6, Line 108:     cls.ImpalaTestMatrix.add_mandatory_exec_option('write_kudu_utc_timestamps', 'true')
> There is another query option that control timestamp conversion: use_local_
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:40:27 +0000
Gerrit-HasComments: Yes
