[
https://issues.apache.org/jira/browse/HIVE-27199?focusedWorklogId=857366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-857366
]
ASF GitHub Bot logged work on HIVE-27199:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Apr/23 12:54
Start Date: 17/Apr/23 12:54
Worklog Time Spent: 10m
Work Description: zabetak commented on code in PR #4170:
URL: https://github.com/apache/hive/pull/4170#discussion_r1168643738
##########
common/src/java/org/apache/hive/common/util/TimestampParser.java:
##########
@@ -199,6 +205,19 @@ public Timestamp parseTimestamp(final String text) {
   }
+  public TimestampTZ parseTimestamp(String text, ZoneId defaultTimeZone) {
+    Objects.requireNonNull(text);
+    for (DateTimeFormatter f : dtFormatters) {
+      try {
+        return TimestampTZUtil.parse(text, defaultTimeZone, f);
+      } catch (DateTimeException e) {
Review Comment:
The `catch` captures the `DateTimeException` and deliberately ignores it.
Without the `catch`, the exception would propagate, which is not what we want
here; the intention is to try all available parsers until we find one that can
parse the value or until we run out of options.
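
For illustration only, the try-every-parser idea can be sketched with plain
`java.time`, outside of Hive. The class name `FallbackTimestampParser`, its
`parse` signature, and the `Optional` return type are assumptions made for this
sketch; they are not the code in the PR, which delegates to
`TimestampTZUtil.parse`.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical helper (not Hive code): tries each formatter in order and
// returns the first successful result.
final class FallbackTimestampParser {
  private FallbackTimestampParser() {}

  static Optional<ZonedDateTime> parse(String text, ZoneId defaultZone,
      List<DateTimeFormatter> formatters) {
    Objects.requireNonNull(text);
    for (DateTimeFormatter f : formatters) {
      try {
        // Assumes the patterns carry no zone, so the default zone is applied.
        return Optional.of(LocalDateTime.parse(text, f).atZone(defaultZone));
      } catch (DateTimeException e) {
        // Expected when this pattern does not match; move on to the next one.
      }
    }
    // No formatter matched the text.
    return Optional.empty();
  }
}
```

In this sketch an empty `Optional` stands in for the case where no pattern
matches. Removing the `catch` would make the first non-matching formatter abort
the loop, so later patterns would never be tried.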
Issue Time Tracking
-------------------
Worklog Id: (was: 857366)
Time Spent: 1h 20m (was: 1h 10m)
> Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom
> formats
> --------------------------------------------------------------------------------
>
> Key: HIVE-27199
> URL: https://issues.apache.org/jira/browse/HIVE-27199
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 4.0.0-alpha-2
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Timestamp values come in many flavors and formats, and there is no single
> representation that can satisfy everyone, especially when such values are
> stored in plain text/csv files.
> HIVE-9298 added a special SERDE property, {{timestamp.formats}}, that
> allows users to provide custom timestamp patterns to correctly parse TIMESTAMP
> values coming from files.
> However, when the column type is TIMESTAMP WITH LOCAL TIME ZONE (LTZ), it is
> not possible to use a custom pattern; thus, when the built-in Hive parser does
> not match the expected format, a NULL value is returned.
> Consider a text file, F1, with the following values:
> {noformat}
> 2016-05-03 12:26:34
> 2016-05-03T12:26:34
> {noformat}
> and a table with a column declared as LTZ.
> {code:sql}
> CREATE TABLE ts_table (ts TIMESTAMP WITH LOCAL TIME ZONE);
> LOAD DATA LOCAL INPATH './F1' INTO TABLE ts_table;
> SELECT * FROM ts_table;
> 2016-05-03 12:26:34.0 US/Pacific
> NULL
> {code}
> To give more flexibility to users relying on the TIMESTAMP WITH
> LOCAL TIME ZONE datatype, and to align its behavior with the TIMESTAMP type,
> this JIRA aims to reuse the {{timestamp.formats}} property for both TIMESTAMP
> types.
> The work here focuses exclusively on simple text files, but the same could be
> done for other SERDEs, such as JSON.