[ https://issues.apache.org/jira/browse/HIVE-24814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296225#comment-17296225 ]
David Mollitor edited comment on HIVE-24814 at 3/5/21, 5:57 PM:
----------------------------------------------------------------
Here is another example:
{quote}
months_between(date1, date2)
Returns number of months between dates date1 and date2 (as of Hive 1.2.0). If
date1 is later than date2, then the result is positive. If date1 is earlier
than date2, then the result is negative. If date1 and date2 are either the same
days of the month or both last days of months, then the result is always an
integer. Otherwise the UDF calculates the fractional portion of the result
based on a 31-day month and considers the difference in time components date1
and date2. date1 and date2 type can be date, timestamp or string in the format
'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The result is rounded to 8 decimal
places. Example: months_between('1997-02-28 10:30:00', '1996-10-30') =
3.94959677
{quote}
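To make the quoted arithmetic concrete, here is a minimal Java sketch of the rule as the documentation describes it. It is not the UDF's actual code: the class and method names are hypothetical, and it assumes both arguments have already been converted to {{LocalDateTime}}.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

// Hypothetical illustration of the documented months_between rule; not Hive code.
class MonthsBetweenSketch {

    static final double SECONDS_PER_31_DAY_MONTH = 31 * 86400.0;

    static double monthsBetween(LocalDateTime d1, LocalDateTime d2) {
        // Whole-month difference from the year and month fields only.
        int monthDiff = (d1.getYear() - d2.getYear()) * 12
                + (d1.getMonthValue() - d2.getMonthValue());

        boolean sameDay = d1.getDayOfMonth() == d2.getDayOfMonth();
        boolean bothLastDay =
                d1.getDayOfMonth() == d1.toLocalDate().lengthOfMonth()
                && d2.getDayOfMonth() == d2.toLocalDate().lengthOfMonth();
        if (sameDay || bothLastDay) {
            // Documented special case: the result is always an integer.
            return monthDiff;
        }

        // Otherwise the fraction is the day-and-time difference over a 31-day month.
        long secs1 = d1.getDayOfMonth() * 86400L + d1.toLocalTime().toSecondOfDay();
        long secs2 = d2.getDayOfMonth() * 86400L + d2.toLocalTime().toSecondOfDay();
        double fraction = (secs1 - secs2) / SECONDS_PER_31_DAY_MONTH;

        // Round to 8 decimal places, as the documentation states.
        return BigDecimal.valueOf(monthDiff + fraction)
                .setScale(8, RoundingMode.HALF_UP)
                .doubleValue();
    }

    public static void main(String[] args) {
        LocalDateTime d1 = LocalDateTime.of(1997, 2, 28, 10, 30, 0);
        LocalDateTime d2 = LocalDateTime.of(1996, 10, 30, 0, 0, 0);
        System.out.println(monthsBetween(d1, d2)); // prints 3.94959677
    }
}
```

For the documented example the month difference is 4, the day-and-time difference is -135000 seconds, and 4 - 135000 / 2678400 rounds to 3.94959677, which matches the quoted result.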
So this one arbitrarily states that the Hive format for a Timestamp will work,
even though the method parameters are DATE. Since I am proposing that this not be
allowed, users should have to cast their String to a Timestamp and then to a Date so
that it actually matches the method parameter. Alternatively, a months_between
that accepts timestamps can be introduced.
There is code specifically in this UDF to handle the TS format. It first tries to
parse the string as a Timestamp and, if that fails, parses it as a Date.
We have seen this before: it is *very* slow, since every record may end up
throwing an Exception to signal that it is not a TS before it is then parsed
correctly as a Date.
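The try-then-fall-back pattern looks roughly like the sketch below. This is an illustration only, not the UDF's actual code: the class, method, and formatter names are hypothetical, and it uses {{DateTimeFormatter}} purely for demonstration. The second method shows one possible way to accept both formats without an exception on the date-only path; it is an assumption for comparison, not what this ticket proposes.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Hypothetical illustration of the parsing pattern; not Hive code.
class ParseFallbackSketch {

    static final DateTimeFormatter TS = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");
    static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    // Timestamp first, Date second: every date-only string pays for a
    // thrown and caught exception before it is parsed successfully.
    static LocalDateTime parseWithFallback(String s) {
        try {
            return LocalDateTime.parse(s, TS);
        } catch (DateTimeParseException e) {
            return LocalDate.parse(s, DATE).atStartOfDay();
        }
    }

    // One formatter with an optional time section; missing time fields
    // default to midnight, so date-only input does not raise an exception.
    static final DateTimeFormatter DATE_WITH_OPTIONAL_TIME = new DateTimeFormatterBuilder()
            .appendPattern("uuuu-MM-dd")
            .optionalStart()
            .appendLiteral(' ')
            .appendPattern("HH:mm:ss")
            .optionalEnd()
            .parseDefaulting(ChronoField.HOUR_OF_DAY, 0)
            .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
            .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
            .toFormatter();

    static LocalDateTime parseSinglePass(String s) {
        return LocalDateTime.parse(s, DATE_WITH_OPTIONAL_TIME);
    }

    public static void main(String[] args) {
        System.out.println(parseWithFallback("1996-10-30"));
        System.out.println(parseSinglePass("1996-10-30"));
        System.out.println(parseSinglePass("1997-02-28 10:30:00"));
    }
}
```

In a column that holds mostly date-only strings, the first method constructs and unwinds an exception per row, which is where the slowdown described above comes from.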
was (Author: belugabehr):
Here is another example:
{quote}
months_between(date1, date2)
Returns number of months between dates date1 and date2 (as of Hive 1.2.0). If
date1 is later than date2, then the result is positive. If date1 is earlier
than date2, then the result is negative. If date1 and date2 are either the same
days of the month or both last days of months, then the result is always an
integer. Otherwise the UDF calculates the fractional portion of the result
based on a 31-day month and considers the difference in time components date1
and date2. date1 and date2 type can be date, timestamp or string in the format
'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The result is rounded to 8 decimal
places. Example: months_between('1997-02-28 10:30:00', '1996-10-30') =
3.94959677
{quote}
So this one arbitrarily states that the Hive format for a Timestamp will work,
even though the method parameters are DATE. This is probably only the case
because it relies on the current Date parsing code, which allows for such
things. Since I am proposing that this not be allowed, users should have to cast
their String to a Timestamp and then to a Date so that it actually matches the method
parameter. Alternatively, a months_between that accepts timestamps can be introduced.
> Harmonize Hive Date-Time Formats
> --------------------------------
>
> Key: HIVE-24814
> URL: https://issues.apache.org/jira/browse/HIVE-24814
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Harmonize Hive on JDK date-time formats courtesy of {{DateTimeFormatter}}