Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22293 )
Change subject: IMPALA-13627: Handle legacy Hive timezone conversion ...................................................................... Patch Set 11: (7 comments) http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc File be/src/benchmarks/convert-timestamp-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc@115 PS11, Line 115: FromUnixTimeNanos: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile : (relative) (relative) (relative) : --------------------------------------------------------------------------------------------------------- : (sec split (old)) 31.2 32 32.4 1X 1X 1X : (sec split (new)) 16.3 16.5 16.6 0.523X 0.517X 0.513X : : FromSubsecondUnixTime: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile : (relative) (relative) (relative) : --------------------------------------------------------------------------------------------------------- : (old) 32 32.5 33 1X 1X 1X : (new) 14.7 14.9 15.1 0.46X 0.458X 0.458X Why the number looks regress here? Relative of (new) < Relative of (old). Did they switch place? http://gerrit.cloudera.org:8080/#/c/22293/11/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/22293/11/common/thrift/ImpalaService.thrift@981 PS11, Line 981: convert_legacy_hive_parquet_utc_timestamps nit: Please uppercase to highlight that this is an Impala query option. http://gerrit.cloudera.org:8080/#/c/22293/11/docs/topics/impala_timestamp.xml File docs/topics/impala_timestamp.xml: http://gerrit.cloudera.org:8080/#/c/22293/11/docs/topics/impala_timestamp.xml@292 PS11, Line 292: hive.parquet.timestamp.legacy.conversion.enable hive.parquet.timestamp.legacy.conversion.enabled http://gerrit.cloudera.org:8080/#/c/22293/11/tests/query_test/test_hive_timestamp_conversion.py File tests/query_test/test_hive_timestamp_conversion.py: http://gerrit.cloudera.org:8080/#/c/22293/11/tests/query_test/test_hive_timestamp_conversion.py@18 PS11, Line 18: # Tests for Hive-IMPALA compression codec interoperability nit: This is better as code comment for TestHiveParquetTimestampConversion class. Please also mention that convert_legacy_hive_parquet_utc_timestamps, use_legacy_hive_timestamp_conversion, and timezone options are important to exercise, and they are tested at each .test file in this test. http://gerrit.cloudera.org:8080/#/c/22293/11/tests/query_test/test_hive_timestamp_conversion.py@32 PS11, Line 32: def add_test_dimensions(cls): : super(TestHiveParquetTimestampConversion, cls).add_test_dimensions() Individual .test file is short. Therefore, tests in this class seems better to evaluate by querying directly and then validate the rows in the py.test file rather than using run_test_case. The possible results are only 2 set, depending on whether both convert_legacy_hive_parquet_utc_timestamps and use_legacy_hive_timestamp_conversion is all True or not. You can exercise 4 permutations by adding: add_exec_option_dimension(cls, 'convert_legacy_hive_parquet_utc_timestamps', [False, True]) add_exec_option_dimension(cls, 'use_legacy_hive_timestamp_conversion', [False, True]) http://gerrit.cloudera.org:8080/#/c/22293/11/tests/query_test/test_hive_timestamp_conversion.py@55 PS11, Line 55: writer.time.zone What is the value of this metadata in employee_hive_3_1_3_us_pacific.parquet? http://gerrit.cloudera.org:8080/#/c/22293/11/tests/query_test/test_hive_timestamp_conversion.py@67 PS11, Line 67: """Test table containing Hive legacy timestamps written with Hive prior to 3.1.3. : : Test files from : https://github.com/apache/hive/tree/rel/release-4.0.1/data/files/tbl_parq1.""" Can you mention something about timezone="Asia/Singapore" inside timestamp-conversion-hive-3m.test ? -- To view, visit http://gerrit.cloudera.org:8080/22293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1271ed1da0b74366ab8315e7ec2d4ee47111e067 Gerrit-Change-Number: 22293 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Thu, 06 Feb 2025 19:52:08 +0000 Gerrit-HasComments: Yes
