Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15222 )

Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix
......................................................................


Patch Set 6:

(10 comments)

Thanks for the review!
I fixed the ones I were sure about, will return to some of the more complex 
ones.

http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@13
PS6, Line 13: is was
> was no ?
Done


http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@26
PS6, Line 26: that
> nit: not necessary
Done


http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h
File be/src/common/global-types.h:

http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h@31
PS6, Line 31: #define UTCPTR nullptr
> Why no define a const Timezone* instead?
It led to compile error for some reason. I will try it again, probably it 
should be const Timezone * const


http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc
File be/src/exec/data-source-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc@352
PS7, Line 352:         // TODO The timezone depends on flag 
use_local_tz_for_unix_timestamp_conversions.
             :         //      Check if this is the intended behaviour.
             :         RETURN_IF_ERROR(MaterializeNextRow(
             :             state->time_zone_for_unix_time_conversions(), 
tuple_pool, tuple));
> You're raising a good point in the comment. I think here we should just pas
I was thinking about UTCPTR instead. Using local_time_zone() would mean that we 
will do a timezone conversion by default, which is not how Impala used to work.

Note that I am not sure whether anyone cares about the correctness of 
DataSource.


http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc@170
PS6, Line 170: const Timezone&
> Returning const Timezone* here might simplify things a bit.
This cannot return nullptr, as it is also used in UtcToLocal() which expects a 
Timezone reference, so I think that it is clearer to keep it this way.


http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h@102
PS6, Line 102: 'unix_time' is assumed to be UTC
> nit: The comment is a bit confusing. 'unix_time' is always in UTC (by defin
I didn't find a good name for this offset when used in timezone agnostic 
context. E.g. both Parquet and Orc store such offsets that behave exactly like 
Unix time, while assume the resulting timestamp to be in local time. We 
consider this the "normal" case and pass a pointer to a Timezone when the 
unix_time is real UTC Unix time and we need to convert it to local time.


http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test:

http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test@3
PS6, Line 3: This test is also called with 
convert_legacy_hive_parquet_utc_timestamps=true.
> Comment is confusing: I think the test is only called with convert_legacy_h
You are right, I confused this test with another one.


http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test
File 
testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test:

http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@18
PS6, Line 18: ====
            : ---- QUERY
> Move the new sections to a separate .test file or rename this file to somet
Done


http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@45
PS6, Line 45: ---- QUERY
            : SET timezone=CET;
            : select min(timestamp_col) from functional_avro.alltypestiny;
            : ---- TYPES
            : STRING
            : ---- RESULTS
            : '2009-01-01 00:00:00'
> Since functional_avro.alltypestiny.timestamp_col is a string so probably yo
I kept the test - if we add timestamp support for avro one day, it will be easy 
to forget about checking this.


http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py
File tests/custom_cluster/test_hive_parquet_timestamp_conversion.py:

http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py@74
PS6, Line 74:     self.check_sanity(True)
            :     # Test with UTC too to check the optimizations added in 
IMPALA-9385.
            :     for tz_name in ["PST8PDT", "UTC"]:
            :       # The value read from the Hive table should be the same as 
reading a UTC converted
            :       # value from the Impala table.
            :       data = self.execute_query_expect_success(self.client, """
            :           SELECT h.id, h.day, h.timestamp_col, i.timestamp_col
            :           FROM functional_parquet.alltypesagg_hive_13_1 h
            :           JOIN functional_parquet.alltypesagg
            :             i ON i.id = h.id AND i.day = h.day  -- serves as a 
unique key
            :           WHERE
            :             (h.timestamp_col IS NULL AND i.timestamp_col IS NOT 
NULL)
            :             OR (h.timestamp_col IS NOT NULL AND i.timestamp_col 
IS NULL)
            :             OR h.timestamp_col != 
FROM_UTC_TIMESTAMP(i.timestamp_col, '%s')
            :           """ % tz_name, query_options={"timezone": tz_name})\
            :           .get_data()
            :       assert len(data) == 0
> Any reason you moved this block after L73?
Done



--
To view, visit http://gerrit.cloudera.org:8080/15222
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14e2a7e512ccd013d5d9fe480a5467ed4c46b76e
Gerrit-Change-Number: 15222
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 18 Feb 2020 20:43:07 +0000
Gerrit-HasComments: Yes

Reply via email to