[ 
https://issues.apache.org/jira/browse/SPARK-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-20166:
---------------------------------
    Description: 
We can use the {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems to be specific to 
{{FastDateFormat}}. Please see 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone
 and 
https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html

{{ZZ}} supports "ISO 8601 extended format time zones", but it seems to be a 
{{FastDateFormat}}-specific option.

It seems better to replace {{ZZ}} with {{XXX}} because they appear to use the 
same strategy - 
https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930.
 

I also checked the code and debugged it manually to be sure. Both cases seem to 
use the same pattern {code}(Z|(?:[+-]\\d{2}(?::)\\d{2})){code}.
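
As a rough illustration of what that pattern accepts, here is a minimal sketch using plain {{java.util.regex}}. It is not taken from the commons-lang source; the object name {{Iso8601ZonePattern}} and the helper {{matchesZone}} are hypothetical names used only for this example.

```scala
import java.util.regex.Pattern

object Iso8601ZonePattern {
  // The pattern quoted in the description, compiled as-is.
  // The object and method names are illustrative, not part of any library.
  val iso8601Pattern: Pattern =
    Pattern.compile("(Z|(?:[+-]\\d{2}(?::)\\d{2}))")

  // True only when the whole string is a zone designator the pattern accepts.
  def matchesZone(s: String): Boolean =
    iso8601Pattern.matcher(s).matches()

  def main(args: Array[String]): Unit = {
    // "Z" and colon-separated offsets match; "-1100" and "GMT" do not,
    // because the colon in the pattern is not optional.
    Seq("Z", "-11:00", "+09:00", "-1100", "GMT").foreach { s =>
      println(s"$s -> ${matchesZone(s)}")
    }
  }
}
```

The pattern requires either a literal {{Z}} or a signed {{hh:mm}} offset, which are the two input shapes exercised in the REPL sessions in this description.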

Note that this is a documentation fix, not a behaviour change, because {{ZZ}} 
seems to be an invalid date format in {{SimpleDateFormat}}, which is the 
reference for custom formats documented in {{DataFrameReader}}:

{quote}
   * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
   * indicates a timestamp format. Custom date formats follow the formats at
   * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>
{quote}


{code}
scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided
{code}

{code}
scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
{code}
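
The REPL output above is printed in the local time zone (KST here), so the same check can also be expressed on epoch milliseconds, which do not depend on the machine's zone. The following is a minimal sketch under that assumption; it uses only {{java.text.SimpleDateFormat}} (no commons-lang dependency), and the object name {{TimestampPatternCheck}} and the helper {{parseMillis}} are hypothetical names for illustration.

```scala
import java.text.SimpleDateFormat
import scala.util.Try

object TimestampPatternCheck {
  // Proposed pattern and the currently documented default.
  val xxxPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
  val zzPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"

  // Parses `text` with SimpleDateFormat and returns epoch milliseconds,
  // or None when SimpleDateFormat throws (e.g. ParseException).
  def parseMillis(pattern: String, text: String): Option[Long] =
    Try(new SimpleDateFormat(pattern).parse(text).getTime).toOption

  def main(args: Array[String]): Unit = {
    val inputs = Seq("2017-03-21T00:00:00.000-11:00", "2017-03-21T00:00:00.000Z")
    for (pattern <- Seq(xxxPattern, zzPattern); input <- inputs) {
      println(s"$pattern / $input -> ${parseMillis(pattern, input)}")
    }
  }
}
```

With {{XXX}}, both inputs should parse and differ by exactly 11 hours; with {{ZZ}}, both should come back as {{None}}, matching the {{ParseException}} shown in the {{SimpleDateFormat}} session.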

  was:
We can use the {{XXX}} format instead of {{ZZ}}. {{ZZ}} seems to be specific to 
{{FastDateFormat}}. Please see 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#iso8601timezone
 and 
https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/time/FastDateFormat.html

{{ZZ}} supports "ISO 8601 extended format time zones", but it seems to be a 
{{FastDateFormat}}-specific option.

It seems better to replace {{ZZ}} with {{XXX}} because they appear to use the 
same strategy - 
https://github.com/apache/commons-lang/blob/8767cd4f1a6af07093c1e6c422dae8e574be7e5e/src/main/java/org/apache/commons/lang3/time/FastDateParser.java#L930.
 

I also checked the code and debugged it manually to be sure. Both cases seem to 
use the same pattern {{"(Z|(?:[+-]\\d{2}(?::)\\d{2}))"}}.

Note that this is a documentation fix, not a behaviour change, because {{ZZ}} 
seems to be an invalid date format in {{SimpleDateFormat}}, which is the 
reference for custom formats documented in {{DataFrameReader}}:

{quote}
   * <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSZZ`): sets the string that
   * indicates a timestamp format. Custom date formats follow the formats at
   * `java.text.SimpleDateFormat`. This applies to timestamp type.</li>
{quote}


{code}
scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res4: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res10: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000-11:00"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided

scala> new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
java.text.ParseException: Unparseable date: "2017-03-21T00:00:00.000Z"
  at java.text.DateFormat.parse(DateFormat.java:366)
  ... 48 elided
{code}

{code}
scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000-11:00")
res7: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse("2017-03-21T00:00:00.000Z")
res1: java.util.Date = Tue Mar 21 09:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000-11:00")
res8: java.util.Date = Tue Mar 21 20:00:00 KST 2017

scala> org.apache.commons.lang3.time.FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ").parse("2017-03-21T00:00:00.000Z")
res2: java.util.Date = Tue Mar 21 09:00:00 KST 2017
{code}


> Use XXX for ISO timezone instead of ZZ which is FastDateFormat specific in 
> CSV/JSON time related options
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20166
>                 URL: https://issues.apache.org/jira/browse/SPARK-20166
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hyukjin Kwon
>            Priority: Trivial


