[
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stamatis Zampetakis updated HIVE-25576:
---------------------------------------
Summary: Configurable datetime formatter for unix_timestamp, from_unixtime
(was: Add config to parse date with older date format)
> Configurable datetime formatter for unix_timestamp, from_unixtime
> -----------------------------------------------------------------
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
> Reporter: Ashish Sharma
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h
> Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* -
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* -
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> The official documentation
> (https://docs.oracle.com/javase/8/docs/api/java/util/Date.html) notes that
> "Unfortunately, the API for these functions was not amenable to
> internationalization" and that "The corresponding methods in Date are
> deprecated." As a result, this implementation produces a wrong result.
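For reference, the legacy path quoted above can be reproduced outside Hive with a small standalone sketch. The class and method names (`LegacyUnixTimestamp`, `toEpochSeconds`, `fromEpochSeconds`) are made up for illustration and are not Hive code. The sketch also assumes the time zone is set on the formatter explicitly and pins the locale to `Locale.US`, instead of relying on the VM defaults.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper mirroring the SimpleDateFormat-based snippet; not Hive code.
class LegacyUnixTimestamp {

    // Parse text with the given pattern and return seconds since the epoch.
    static long toEpochSeconds(String text, String pattern, String zoneId) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return parser.parse(text).getTime() / 1000;
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    // Render epoch seconds as yyyy-MM-dd HH:mm:ss in the given zone.
    static String fromEpochSeconds(long epochSeconds, String zoneId) {
        SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        printer.setTimeZone(TimeZone.getTimeZone(zoneId));
        return printer.format(new Date(epochSeconds * 1000L));
    }

    public static void main(String[] args) {
        long seconds = toEpochSeconds(
            "1800-01-01 00:00:00 UTC", "yyyy-MM-dd HH:mm:ss z", "Asia/Bangkok");
        System.out.println(seconds);
        // The issue reports 1800-01-01 07:00:00 for this zone on Hive 1.2.
        System.out.println(fromEpochSeconds(seconds, "Asia/Bangkok"));
    }
}
```

Because the input carries an explicit "UTC" suffix, the parsed instant does not depend on the session zone; the zone only affects how the instant is rendered back to text.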
> *Master branch* -
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* -
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime =
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
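The java.time path can be tried the same way. This is only a sketch built around the snippet quoted above: the class and method names (`ModernUnixTimestamp`, `toEpochSeconds`, `fromEpochSeconds`) are hypothetical, and the fixed `Locale.US` and the output pattern are assumptions made to keep the example deterministic.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

// Hypothetical helper mirroring the DateTimeFormatter-based snippet; not Hive code.
class ModernUnixTimestamp {

    // Parse zoned text with the given pattern and return seconds since the epoch.
    static long toEpochSeconds(String text, String pattern, String zoneId) {
        DateTimeFormatter formatter = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern(pattern)
            .toFormatter(Locale.US);
        return ZonedDateTime.parse(text, formatter)
            .withZoneSameInstant(ZoneId.of(zoneId))
            .toInstant()
            .getEpochSecond();
    }

    // Render epoch seconds as yyyy-MM-dd HH:mm:ss using the zone's historical offset.
    static String fromEpochSeconds(long epochSeconds, String zoneId) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss", Locale.US)
            .format(Instant.ofEpochSecond(epochSeconds).atZone(ZoneId.of(zoneId)));
    }

    public static void main(String[] args) {
        long seconds = toEpochSeconds(
            "1800-01-01 00:00:00 UTC", "yyyy-MM-dd HH:mm:ss z", "Asia/Bangkok");
        System.out.println(seconds);
        // java.time applies the local mean time offset (+06:42:04) that
        // Asia/Bangkok had in 1800, which matches the result reported above.
        System.out.println(fromEpochSeconds(seconds, "Asia/Bangkok"));
    }
}
```

The 18-minute difference from the Hive 1.2 result comes from the offset used when rendering the instant, not from the parsed epoch value, which is the same on both paths for this input.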
> *Problem* -
> *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives
> the correct result but is not backward compatible. This causes issues when
> migrating to the new version, because data written with Hive 1.x or 2.x is
> not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat*
> 2. *False* - use *DateTimeFormatter*
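One way to picture the proposed switch is a single entry point that picks the formatter family from a boolean. This is a rough sketch under stated assumptions, not the actual patch: `TimeParserPolicySketch`, `roundTrip`, and the `legacy` parameter are hypothetical names, and reading the flag from the Hive configuration is left out entirely.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical dispatcher; 'legacy' stands in for the proposed config value.
class TimeParserPolicySketch {
    private static final String OUTPUT_PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Emulates FROM_UNIXTIME(UNIX_TIMESTAMP(text, pattern)) for one session zone.
    static String roundTrip(String text, String pattern, String zoneId, boolean legacy) {
        return legacy
            ? legacyRoundTrip(text, pattern, zoneId)
            : modernRoundTrip(text, pattern, zoneId);
    }

    // True: SimpleDateFormat for both parsing and rendering.
    private static String legacyRoundTrip(String text, String pattern, String zoneId) {
        TimeZone zone = TimeZone.getTimeZone(zoneId);
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setTimeZone(zone);
        SimpleDateFormat printer = new SimpleDateFormat(OUTPUT_PATTERN, Locale.US);
        printer.setTimeZone(zone);
        try {
            long epochSeconds = parser.parse(text).getTime() / 1000;
            return printer.format(new Date(epochSeconds * 1000L));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    // False: DateTimeFormatter for both parsing and rendering.
    private static String modernRoundTrip(String text, String pattern, String zoneId) {
        DateTimeFormatter parser = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern(pattern)
            .toFormatter(Locale.US);
        ZonedDateTime parsed = ZonedDateTime.parse(text, parser)
            .withZoneSameInstant(ZoneId.of(zoneId));
        return DateTimeFormatter.ofPattern(OUTPUT_PATTERN, Locale.US).format(parsed);
    }

    public static void main(String[] args) {
        String text = "1800-01-01 00:00:00 UTC";
        String pattern = "yyyy-MM-dd HH:mm:ss z";
        System.out.println(roundTrip(text, pattern, "Asia/Bangkok", true));
        System.out.println(roundTrip(text, pattern, "Asia/Bangkok", false));
    }
}
```

Keeping both branches behind one method means callers such as unix_timestamp and from_unixtime would not need to know which formatter is active, which is the main appeal of a config-driven switch.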
> Note: Apache Spark also faces the same issue:
> https://issues.apache.org/jira/browse/SPARK-30668