[jira] [Work logged] (HIVE-25292) to_unix_timestamp & unix_timestamp should support ENGLISH format by default

ASF GitHub Bot (Jira) Mon, 13 Dec 2021 04:33:06 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-25292?focusedWorklogId=695010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695010
 ]


ASF GitHub Bot logged work on HIVE-25292:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Dec/21 12:32
            Start Date: 13/Dec/21 12:32
    Worklog Time Spent: 10m 
      Work Description: shezhiming commented on pull request #2433:
URL: https://github.com/apache/hive/pull/2433#issuecomment-992432397


   @kgyrtkirk 
   
   My initial thoughts were the same as yours, but that approach needs two 
DateTimeFormatters, and it seems the only way to fall back from one to the 
other is a try/catch block (or is there a better way?), which is not very 
elegant.
   
   Another reason is that Apache Spark's to_unix_timestamp UDF uses Locale.US 
by default when parsing the time.
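
For illustration only, the two-formatter fallback discussed above could look roughly like the sketch below. It is not the code from the pull request: the class name TwoFormatterFallback, the method toEpochSeconds, and the explicit locale and zone parameters are all hypothetical, and the pattern is written with yyyy rather than the yyy used in the issue.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper, not part of Hive: tries a locale-specific formatter
// first and falls back to an English one when parsing fails.
public class TwoFormatterFallback {

    /** Returns epoch seconds, or null when neither formatter can parse the text. */
    public static Long toEpochSeconds(String text, String pattern,
                                      Locale primaryLocale, ZoneId zone) {
        DateTimeFormatter primary = DateTimeFormatter.ofPattern(pattern, primaryLocale);
        DateTimeFormatter english = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);

        for (DateTimeFormatter formatter : new DateTimeFormatter[] {primary, english}) {
            try {
                return LocalDateTime.parse(text, formatter).atZone(zone).toEpochSecond();
            } catch (DateTimeParseException e) {
                // This formatter did not match; try the next one.
            }
        }
        return null; // mirrors the NULL that Hive returns for unparseable input
    }
}
```

The try/catch inside the loop is exactly the control flow described as inelegant above; the sketch only makes the trade-off concrete.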


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 695010)
    Time Spent: 1h  (was: 50m)

> to_unix_timestamp & unix_timestamp should support ENGLISH format by default
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-25292
>                 URL: https://issues.apache.org/jira/browse/HIVE-25292
>             Project: Hive
>          Issue Type: Improvement
>          Components: Clients
>            Reporter: shezm
>            Assignee: shezm
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.2.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hi,
> The to_unix_timestamp function is implemented by GenericUDFToUnixTimeStamp, 
> which uses SimpleDateFormat to parse string-typed times.
> However, the SimpleDateFormat is created without a Locale argument, so the 
> JVM's default locale is used. As a result, on machines with a non-English 
> default locale, queries like the following return NULL:
>  
> {code:java}
> hive> select to_unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> hive> select unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> {code}
>  
> I also found that in Spark, to_unix_timestamp & unix_timestamp use 
> SimpleDateFormat as well, with Locale.US by default, but this makes it 
> impossible to parse dates written in the local language. For example, in a 
> Chinese environment, Hive parses the following correctly:
>  
> {code:java}
> hive> select to_unix_timestamp('16/三月/2017:12:25:01', 'dd/MMMM/yyy:HH:mm:ss');
> OK
> 1489638301
> Time taken: 0.147 seconds, Fetched: 1 row(s)
> OK
> {code}
> But Spark returns NULL for the same query.
> Because English dates are the more common form, I think two 
> SimpleDateFormats are needed: the existing one that uses the default 
> locale, and a new one initialized with Locale.ENGLISH.
>  
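
A minimal sketch of the two-SimpleDateFormat idea described above, under stated assumptions: the class EnglishFallbackParser and its method are hypothetical names and not the actual change to GenericUDFToUnixTimeStamp, and the locale and time zone are passed in explicitly so the behaviour does not depend on the JVM defaults. It relies on SimpleDateFormat.parse(String, ParsePosition), which returns null on failure instead of throwing.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration, not Hive code: one format for the local
// language and a second, English one used only as a fallback.
public class EnglishFallbackParser {
    private final SimpleDateFormat localFormat;
    private final SimpleDateFormat englishFormat;

    public EnglishFallbackParser(String pattern, Locale locale, TimeZone zone) {
        localFormat = new SimpleDateFormat(pattern, locale);
        englishFormat = new SimpleDateFormat(pattern, Locale.ENGLISH);
        localFormat.setTimeZone(zone);
        englishFormat.setTimeZone(zone);
    }

    /** Returns epoch seconds, or null when neither format matches. */
    public Long toUnixTimestamp(String text) {
        // parse(String, ParsePosition) signals failure with null, so no try/catch is needed.
        Date date = localFormat.parse(text, new ParsePosition(0));
        if (date == null) {
            date = englishFormat.parse(text, new ParsePosition(0));
        }
        return date == null ? null : date.getTime() / 1000;
    }
}
```

Trying the local format first keeps inputs such as '16/三月/2017:12:25:01' working, while English month names are still accepted on non-English machines. Note that SimpleDateFormat is not thread-safe, so an instance like this should not be shared across threads.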



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
