Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22293 )
Change subject: IMPALA-13627: Handle legacy Hive timezone conversion ...................................................................... Patch Set 17: (5 comments) http://gerrit.cloudera.org:8080/#/c/22293/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22293/14//COMMIT_MSG@35 PS14, Line 35: Adds a new CLI and query option - use_legacy_hive_timestamp_conversion - : to select what conversion method to use in the 3rd case above, when : Impala de > This only applies the third case in the previous paragraph, right? Can you Done http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc@115 PS11, Line 115: FromUnixTimeNanos: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile : (relative) (relative) (relative) : --------------------------------------------------------------------------------------------------------- : (sec split (old)) 36.8 37.5 39 1X 1X 1X : (sec split (new)) 319 323 324 8.68X 8.61X 8.33X : : FromSubsecondUnixTime: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile : (relative) (relative) (relative) : --------------------------------------------------------------------------------------------------------- : (old) 37.1 38 38.7 1X 1X 1X : (new) 175 175 177 4.71X 4.6X 4.59X > Thanks for the investigation! It seems that I messed up my own optimization I switched &TimezoneDatabase::GetUtcTimezone to UTCPTR and moved FromUnixTime to allow inlining. That restores comparable benchmark results, which I've updated. http://gerrit.cloudera.org:8080/#/c/22293/14/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/22293/14/be/src/exec/parquet/hdfs-parquet-scanner.cc@3185 PS14, Line 3185: writer_time_zone = kv.value; > Probably not in this patch, but it could be useful to detect the case when I wondered about that, but I think I'd want to investigate Hive's behavior a little more before doing anything with writer.time.zone. http://gerrit.cloudera.org:8080/#/c/22293/14/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/22293/14/fe/src/main/java/org/apache/impala/service/JniFrontend.java@822 PS14, Line 822: // TimeZone.getTimeZone defaults to GMT if it doe > Shouldn't we return an error in this case? Java doesn't really provide a good way to check that. I've updated the comment to be a little clearer about what's happening. http://gerrit.cloudera.org:8080/#/c/22293/14/testdata/workloads/functional-query/queries/QueryTest/timestamp-conversion-hive-313.test File testdata/workloads/functional-query/queries/QueryTest/timestamp-conversion-hive-313.test: http://gerrit.cloudera.org:8080/#/c/22293/14/testdata/workloads/functional-query/queries/QueryTest/timestamp-conversion-hive-313.test@24 PS14, Line 24: SET timezone=PST; > Can you check in one of the test files for the timezone=UTC case, as that w Done -- To view, visit http://gerrit.cloudera.org:8080/22293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1271ed1da0b74366ab8315e7ec2d4ee47111e067 Gerrit-Change-Number: 22293 Gerrit-PatchSet: 17 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Wed, 12 Feb 2025 18:48:42 +0000 Gerrit-HasComments: Yes
