Stamatis Zampetakis created HIVE-27199:
------------------------------------------
Summary: Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats
Key: HIVE-27199
URL: https://issues.apache.org/jira/browse/HIVE-27199
Project: Hive
Issue Type: Improvement
Components: Serializers/Deserializers
Affects Versions: 4.0.0-alpha-2
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis

Timestamp values come in many flavors and formats, and no single representation can satisfy everyone, especially when such values are stored in plain text/CSV files.

HIVE-9298 added a special SERDE property, {{timestamp.formats}}, that lets users provide custom timestamp patterns so that TIMESTAMP values read from files are parsed correctly.

However, when the column type is TIMESTAMP WITH LOCAL TIME ZONE (LTZ), it is not possible to use a custom pattern. As a result, when the built-in Hive parser does not match the format of the data, a NULL value is returned.

Consider a text file, F1, with the following values:
{noformat}
2016-05-03 12:26:34
2016-05-03T12:26:34
{noformat}
and a table with a column declared as LTZ.
{code:sql}
CREATE TABLE ts_table (ts TIMESTAMP WITH LOCAL TIME ZONE);
LOAD DATA LOCAL INPATH './F1' INTO TABLE ts_table;

SELECT * FROM ts_table;
2016-05-03 12:26:34.0 US/Pacific
NULL
{code}
To give more flexibility to users relying on the TIMESTAMP WITH LOCAL TIME ZONE datatype, and to align its behavior with the TIMESTAMP type, this JIRA aims to reuse the {{timestamp.formats}} property for both TIMESTAMP types.

The work here focuses exclusively on simple text files, but the same could be done for other SerDes such as JSON.
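For illustration, the intended usage might look like the sketch below. It assumes the change proposed here is in place and that the property is set on an LTZ table the same way as for TIMESTAMP columns; the pattern chosen is only an example matching the second line of F1, and the exact behavior depends on the final implementation.

```sql
-- Assumption: timestamp.formats is honored for TIMESTAMP WITH LOCAL TIME ZONE
-- columns once this improvement is available.
ALTER TABLE ts_table SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss");

-- The intent is that the row stored as 2016-05-03T12:26:34 is now parsed
-- rather than returned as NULL.
SELECT * FROM ts_table;
```

Because the property is read by the SerDe at query time, the data file itself does not need to be rewritten or reloaded for the new pattern to apply.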