Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15222 )
Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix ...................................................................... Patch Set 7: (10 comments) http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@13 PS6, Line 13: is was was no ? http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@26 PS6, Line 26: that nit: not necessary http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h File be/src/common/global-types.h: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h@31 PS6, Line 31: #define UTCPTR nullptr Why no define a const Timezone* instead? http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc File be/src/exec/data-source-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc@352 PS7, Line 352: // TODO The timezone depends on flag use_local_tz_for_unix_timestamp_conversions. : // Check if this is the intended behaviour. : RETURN_IF_ERROR(MaterializeNextRow( : state->time_zone_for_unix_time_conversions(), tuple_pool, tuple)); You're raising a good point in the comment. I think here we should just pass state->local_time_zone(). http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc@170 PS6, Line 170: Returning const Timezone* here might simplify things a bit. http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h@102 PS6, Line 102: 'unix_time' is assumed to be UTC nit: The comment is a bit confusing. 'unix_time' is always in UTC (by definition) not just when 'local_tz' is set to non-UTC. http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test File testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test: http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test@3 PS6, Line 3: This test is also called with convert_legacy_hive_parquet_utc_timestamps=true. Comment is confusing: I think the test is only called with convert_legacy_hive_parquet_utc_timestamps=true. http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test File testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test: http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@18 PS6, Line 18: ==== : ---- QUERY Move the new sections to a separate .test file or rename this file to something more appropriate. Originally this test file was meant for testing UTC timestamp functions only. http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@45 PS6, Line 45: ---- QUERY : SET timezone=CET; : select min(timestamp_col) from functional_avro.alltypestiny; : ---- TYPES : STRING : ---- RESULTS : '2009-01-01 00:00:00' Since functional_avro.alltypestiny.timestamp_col is a string so probably you can remove this section. http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py File tests/custom_cluster/test_hive_parquet_timestamp_conversion.py: http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py@74 PS6, Line 74: self.check_sanity(True) : # Test with UTC too to check the optimizations added in IMPALA-9385. : for tz_name in ["PST8PDT", "UTC"]: : # The value read from the Hive table should be the same as reading a UTC converted : # value from the Impala table. : data = self.execute_query_expect_success(self.client, """ : SELECT h.id, h.day, h.timestamp_col, i.timestamp_col : FROM functional_parquet.alltypesagg_hive_13_1 h : JOIN functional_parquet.alltypesagg : i ON i.id = h.id AND i.day = h.day -- serves as a unique key : WHERE : (h.timestamp_col IS NULL AND i.timestamp_col IS NOT NULL) : OR (h.timestamp_col IS NOT NULL AND i.timestamp_col IS NULL) : OR h.timestamp_col != FROM_UTC_TIMESTAMP(i.timestamp_col, '%s') : """ % tz_name, query_options={"timezone": tz_name})\ : .get_data() : assert len(data) == 0 Any reason you moved this block after L73? I think, check_sanity() should be called at the beginning of the test. -- To view, visit http://gerrit.cloudera.org:8080/15222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14e2a7e512ccd013d5d9fe480a5467ed4c46b76e Gerrit-Change-Number: 15222 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Norbert Luksa <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 18 Feb 2020 18:33:12 +0000 Gerrit-HasComments: Yes
