Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11057 )

Change subject: IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
......................................................................


Patch Set 12:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/exprs/timezone_db.h@79
PS12, Line 79:
> Add WARN_UNUSED_RESULT.
Done


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/exprs/timezone_db.h@78
PS12, Line 78: /// Public proxy for LoadZoneInfo. Should be only used in BE 
tests.
             :   static Status LoadZoneInfoBeTestOnly(const std::string& 
zone_info_dir) {
             :     return LoadZoneInfo(zone_info_dir);
             :   }
> Maybe instead of this function add a "friend class TimestampTest;" after L8
I prefer the current solution. I plan to add a similar function that sets the 
local time zone and to call it from expr-test.cc. Adding these test classes as 
friends of TimezoneDatabase seems "less natural" to me (a higher level class 
would be referenced in a low level utility class). Befriending the TimzoneDb 
test classes is ok for me, because they are deeply coupled to this class by 
design.


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-test.cc@909
PS12, Line 909: CET
> CET is a deprecated name and its usage is not recommended.
See README.


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-test.cc@928
PS12, Line 928: imestamps that fall to UTC+2->UTC+1 DST change.
> More detailed information would be nice:
Done, thanks for the nice comment. :)


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-test.cc@950
PS12, Line 950: Timestamp that falls to UTC+1->UTC+2 DST change
> More detailed information would be nice:
Done


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-value.h@238
PS12, Line 238: any timestamp that is earlier in UTC
> This is a little confusing. Earlier than what?
I hope that it is clearer now.


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-value.h@239
PS12, Line 239: and any
              :   /// timestamp later in UTC
> Later than what?
See line 238.


http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/11057/12/be/src/runtime/timestamp-value.cc@137
PS12, Line 137: simple
> typo: simply
Done


http://gerrit.cloudera.org:8080/#/c/11057/12/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/11057/12/testdata/data/README@251
PS12, Line 251: CET
> Again, using a real IANA name would be better.
My current opinion is that CET is less likely to cause trouble.
Europe/Budapest is canonical, but is interpreted differently by Hive and Impala 
for historical dates:
select from_utc_timestamp("1800-07-01", "Europe/Budapest");
result in Impala: 1800-07-01 01:16:20
result in Hive: 1800-07-01 01:00:00

Both return 1800-07-01 01:00:00 for CET (and start using DST from 1977).

I do not know about the reason - maybe it is Hive only, or maybe old rule 
changes are removed from the timezone files in some cases. The latter could 
mean that using Europe/Budapest in tests without custom timezone can lead to 
different results depending on the OS. Europe/Budapest is also more likely to 
change in the future (abandoning DST change).


http://gerrit.cloudera.org:8080/#/c/11057/12/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test:

http://gerrit.cloudera.org:8080/#/c/11057/12/testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test@4
PS12, Line 4: CET
> It would be better to use an IANA timezone name here and below. CET is a de
See README.



--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 12 Oct 2018 14:17:05 +0000
Gerrit-HasComments: Yes

Reply via email to