Vincenz Priesnitz created HIVE-24353:
----------------------------------------
Summary: performance: Refactor TimestampTZ parsing
Key: HIVE-24353
URL: https://issues.apache.org/jira/browse/HIVE-24353
Project: Hive
Issue Type: Improvement
Reporter: Vincenz Priesnitz
I found that for datasets that contain a lot of timestamps (without time zones),
Hive spends the majority of its time in TimestampTZUtil.parse, in particular
constructing stack traces for the exceptions caught in its try-catch blocks.
When parsing a TimestampTZ we currently use a fallback chain with several
try-catch blocks. For a common timestamp string without a time zone, we
throw and catch two exceptions and parse the string twice.
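
To illustrate the cost, here is a minimal sketch of such a fallback chain. It
is not the actual Hive code: the class name, the simplified formatter, and the
exceptionsCaught counter are assumptions made for illustration only.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.TemporalAccessor;

// Hypothetical illustration of an exception-driven fallback chain.
final class FallbackChainSketch {
    // Assumed, simplified format: date, optional time, optional zone.
    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_LOCAL_DATE)
        .optionalStart().appendLiteral(' ').append(DateTimeFormatter.ISO_LOCAL_TIME).optionalEnd()
        .optionalStart().appendLiteral(' ').appendZoneOrOffsetId().optionalEnd()
        .toFormatter();

    // Instrumentation for this sketch only: counts exceptions caught in the chain.
    static int exceptionsCaught = 0;

    static ZonedDateTime parse(String text, ZoneId defaultZone) {
        try {
            // First parse: succeeds only if date, time and zone are all present.
            return ZonedDateTime.parse(text, FORMATTER);
        } catch (DateTimeParseException e) {
            exceptionsCaught++;
            // Second parse of the same string, this time into a generic accessor.
            TemporalAccessor parsed = FORMATTER.parse(text);
            LocalDate date = LocalDate.from(parsed);
            LocalTime time;
            try {
                time = LocalTime.from(parsed);
            } catch (DateTimeException noTime) {
                exceptionsCaught++;
                time = LocalTime.MIDNIGHT;
            }
            ZoneId zone;
            try {
                zone = ZoneId.from(parsed);
            } catch (DateTimeException noZone) {
                exceptionsCaught++;
                zone = defaultZone;
            }
            return ZonedDateTime.of(date, time, zone);
        }
    }
}
```

In this sketch, an input with a date and time but no zone fails the first
parse, is parsed again, and then fails the zone lookup, so two exceptions are
caught (each carrying a stack trace) before the default zone is applied.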
I propose a refactor that parses the string once and then expresses the
fallback chain as queries against the parsed TemporalAccessor.
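
A rough sketch of what the parse-once approach could look like, using only
standard java.time calls. The class name and the simplified formatter are
assumptions for illustration; the real Hive formatter and the final patch may
differ.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

// Hypothetical sketch: parse once, then query the result instead of catching exceptions.
final class ParseOnceSketch {
    // Assumed, simplified format: date, optional time, optional zone.
    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_LOCAL_DATE)
        .optionalStart().appendLiteral(' ').append(DateTimeFormatter.ISO_LOCAL_TIME).optionalEnd()
        .optionalStart().appendLiteral(' ').appendZoneOrOffsetId().optionalEnd()
        .toFormatter();

    static ZonedDateTime parse(String text, ZoneId defaultZone) {
        // Single parse; still throws DateTimeParseException for invalid input.
        TemporalAccessor parsed = FORMATTER.parse(text);

        // The date is mandatory in this format, so this query is non-null here.
        LocalDate date = parsed.query(TemporalQueries.localDate());

        // These queries return null when the component is absent,
        // so the fallbacks need no exception handling.
        LocalTime time = parsed.query(TemporalQueries.localTime());
        ZoneId zone = parsed.query(TemporalQueries.zone());

        return ZonedDateTime.of(
            date,
            time != null ? time : LocalTime.MIDNIGHT,
            zone != null ? zone : defaultZone);
    }
}
```

For example, ParseOnceSketch.parse("2020-11-03 10:15:30", ZoneId.of("UTC"))
takes the zone fallback through a null check rather than a caught exception,
so no stack trace is constructed for well-formed input that simply lacks a
time zone.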