[
https://issues.apache.org/jira/browse/HIVE-25292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shezm reassigned HIVE-25292:
----------------------------
Assignee: shezm
> to_unix_timestamp & unix_timestamp should support ENGLISH format by default
> ---------------------------------------------------------------------------
>
> Key: HIVE-25292
> URL: https://issues.apache.org/jira/browse/HIVE-25292
> Project: Hive
> Issue Type: Improvement
> Components: Clients
> Reporter: shezm
> Assignee: shezm
> Priority: Major
> Fix For: 3.2.0
>
>
> Hi,
> The to_unix_timestamp function is implemented by GenericUDFToUnixTimeStamp,
> which uses SimpleDateFormat to parse string-typed times. The SimpleDateFormat
> is constructed without a Locale argument, so the JVM's default locale is
> used. As a result, on machines with a non-English default locale, queries
> like the following return NULL:
>
> {code:java}
> hive> select to_unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> hive> select unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> {code}
>
> I also found that Spark's to_unix_timestamp and unix_timestamp use
> SimpleDateFormat as well, with Locale.US by default, which makes it
> impossible to parse dates written in the local language. For example, in a
> Chinese environment, Hive parses the following correctly:
>
> {code:java}
> hive> select to_unix_timestamp('16/三月/2017:12:25:01', 'dd/MMMM/yyy:HH:mm:ss');
> OK
> 1489638301
> Time taken: 0.147 seconds, Fetched: 1 row(s)
> OK
> NULL
> {code}
> But Spark returns NULL.
> Because English-format dates are the most common, I think two
> SimpleDateFormats are needed: the existing one, and a new one initialized
> with Locale.ENGLISH.
>
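The two-formatter idea described above could look roughly like the sketch below. It is not the actual Hive patch: the class name LocaleFallbackParser and its constructor parameters are hypothetical, and the locale and time zone are passed in explicitly (rather than read from the JVM defaults) only so the behaviour is reproducible. The input is first parsed with the given default locale; if that fails, it is retried with Locale.ENGLISH.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Hive: tries the default locale first,
// then falls back to Locale.ENGLISH.
final class LocaleFallbackParser {
    // SimpleDateFormat is not thread-safe; use one parser instance per thread.
    private final SimpleDateFormat localFormat;
    private final SimpleDateFormat englishFormat;

    LocaleFallbackParser(String pattern, Locale defaultLocale, TimeZone zone) {
        localFormat = new SimpleDateFormat(pattern, defaultLocale);
        englishFormat = new SimpleDateFormat(pattern, Locale.ENGLISH);
        localFormat.setTimeZone(zone);
        englishFormat.setTimeZone(zone);
    }

    /** Returns epoch seconds, or null if neither formatter can parse the text. */
    Long toUnixTimestamp(String text) {
        try {
            return localFormat.parse(text).getTime() / 1000;
        } catch (ParseException ignored) {
            // Fall through to the English formatter.
        }
        try {
            return englishFormat.parse(text).getTime() / 1000;
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        LocaleFallbackParser parser = new LocaleFallbackParser(
                "dd/MMM/yyy:HH:mm:ss", Locale.CHINA, TimeZone.getTimeZone("UTC"));
        // "Mar" is not a Chinese month name, so the English fallback handles it.
        System.out.println(parser.toUnixTimestamp("16/Mar/2017:12:25:01"));
    }
}
```

With this ordering, strings in the local language keep working as they do today, and English month names no longer yield NULL on non-English machines. The cost is a second parse attempt only when the first one fails.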
--
This message was sent by Atlassian Jira
(v8.3.4#803005)