Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15222 )
Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix ...................................................................... Patch Set 1: (7 comments) Thanks for looking at the patch! Aside from fixing some of the comments, I also did several other changes (they are also listed in the commit message). http://gerrit.cloudera.org:8080/#/c/15222/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15222/2//COMMIT_MSG@7 PS2, Line 7: Orc > nit: ORC (here and below) Done http://gerrit.cloudera.org:8080/#/c/15222/2//COMMIT_MSG@36 PS2, Line 36: Further planned changes > Is it possible to gain some performance with this change? If yes, adding so The performance impact of this change is that if use_local_tz_for_unix_timestamp_conversions or convert_legacy_hive_parquet_utc_timestamps are true and the query option timezone is UTC, then affected functions should be much faster. convert-timestamp-benchmark.cc can give some hints about the difference between different conversions' UTC and non-UTC versions (it is in the range of 3x-20x). The difference during Parquet scanning can be even larger. I plan to update the benchmark when I remove the UTC versions of the From/To functions in TimestampValue. http://gerrit.cloudera.org:8080/#/c/15222/2/be/src/exec/parquet/parquet-common.h File be/src/exec/parquet/parquet-common.h: http://gerrit.cloudera.org:8080/#/c/15222/2/be/src/exec/parquet/parquet-common.h@743 PS2, Line 743: is > nit: typo Done http://gerrit.cloudera.org:8080/#/c/15222/4/be/src/exprs/timezone_db.h File be/src/exprs/timezone_db.h: http://gerrit.cloudera.org:8080/#/c/15222/4/be/src/exprs/timezone_db.h@77 PS4, Line 77: > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/15222/4/be/src/exprs/timezone_db.h@79 PS4, Line 79: static Status LoadZoneInfoBeTestOnly( > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/15222/2/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: http://gerrit.cloudera.org:8080/#/c/15222/2/be/src/runtime/runtime-state.cc@150 PS2, Line 150: nullptr > nit: maybe it would be worth to mention here as a side note that UTCPTR is Done http://gerrit.cloudera.org:8080/#/c/15222/2/be/src/runtime/runtime-state.cc@152 PS2, Line 152: local_time_zone_ = tz == &TimezoneDatabase::GetUtcTimezone() ? UTCPTR : tz; > Not related to this point, but as I see, the only other place (outside of s Using nullptr instead of a real Timezone object is only useful in performance critical parts - mainly those function that can run for every row of a query. I don't want to remove TimezoneDatabase::GetUtcTimezone(), as it can be useful to have the Timezone object, e.g. to get it's name. -- To view, visit http://gerrit.cloudera.org:8080/15222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14e2a7e512ccd013d5d9fe480a5467ed4c46b76e Gerrit-Change-Number: 15222 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Norbert Luksa <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Mon, 17 Feb 2020 15:19:59 +0000 Gerrit-HasComments: Yes
