[ 
https://issues.apache.org/jira/browse/SPARK-31879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reopened SPARK-31879:
---------------------------------
      Assignee: Kent Yao  (was: Wenchen Fan)

> First day of week changed for non-MONDAY_START Locales
> ------------------------------------------------------
>
>                 Key: SPARK-31879
>                 URL: https://issues.apache.org/jira/browse/SPARK-31879
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Blocker
>
> h1. cases
> {code:sql}
> spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
> 2019-12-29 00:00:00
> spark-sql> set spark.sql.legacy.timeParserPolicy=legacy;
> spark.sql.legacy.timeParserPolicy     legacy
> spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
> 2019-12-30 00:00:00
> {code}
> h1. reasons
> These week-based fields need a Locale to express their semantics, because the 
> first day of the week varies from country to country.
> From the Javadoc of WeekFields:
> {code:java}
>     /**
>      * Gets the first day-of-week.
>      * <p>
>      * The first day-of-week varies by culture.
>      * For example, the US uses Sunday, while France and the ISO-8601 standard use Monday.
>      * This method returns the first day using the standard {@code DayOfWeek} enum.
>      *
>      * @return the first day-of-week, not null
>      */
>     public DayOfWeek getFirstDayOfWeek() {
>         return firstDayOfWeek;
>     }
> {code}
> But for SimpleDateFormat, the day-of-week number 'u' is not localized:
> ```
> u     Day number of week (1 = Monday, ..., 7 = Sunday)        Number  1
> ```
> Currently, the default locale we use is US, where the week starts on Sunday, 
> so the result moves one day backward.
> For other countries, please refer to [First Day of the Week in Different 
> Countries|http://chartsbin.com/view/41671]
> h1. solution options
> 1. Use new Locale("en", "GB") as the default locale.
> 2. For JDK 10 and onwards, set the locale Unicode extension 'fw' to 'mon'; 
> this does not work on lower JDKs.
> 3. Forbid 'u', give users a proper exception, and enable and document 'e/c'. 
> Currently, 'u' is internally substituted by 'e', but they are not 
> equivalent.
> Options 1 and 2 fix the behavior under the default locale, but not for 
> functions that accept a custom locale.
> cc [~cloud_fan] [~dongjoon] [~maropu]
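
The locale dependence described in the issue can be reproduced outside Spark with plain java.time. The following is a minimal sketch, not Spark's actual code path: the class name WeekDateDemo and its helper methods are hypothetical, and it assumes the pattern 'YYYY-w-e' is a fair stand-in for 'YYYY-w-u' after the internal 'u' to 'e' substitution mentioned above. Locale.UK is used as the equivalent of new Locale("en", "GB").

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical demo class; not part of Spark.
class WeekDateDemo {

    // Parses week-based-year, week-of-year and localized day-of-week ('e')
    // using the week definition of the given locale.
    static LocalDate parseWeekDate(String text, Locale locale) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("YYYY-w-e", locale);
        return LocalDate.parse(text, fmt);
    }

    // Returns the first day of the week for the given locale.
    static DayOfWeek firstDayOfWeek(Locale locale) {
        return WeekFields.of(locale).getFirstDayOfWeek();
    }

    public static void main(String[] args) {
        // US weeks start on Sunday, so day 1 of week 1 is a Sunday.
        System.out.println(firstDayOfWeek(Locale.US));              // SUNDAY
        System.out.println(parseWeekDate("2020-1-1", Locale.US));   // 2019-12-29

        // Option 1: en-GB weeks start on Monday, matching the legacy result.
        System.out.println(firstDayOfWeek(Locale.UK));              // MONDAY
        System.out.println(parseWeekDate("2020-1-1", Locale.UK));   // 2019-12-30

        // Option 2: the 'fw' Unicode extension. It is expected to take effect
        // only on JDK 10 and later; older JDKs ignore it.
        Locale usMondayStart = Locale.forLanguageTag("en-US-u-fw-mon");
        System.out.println(firstDayOfWeek(usMondayStart));
        System.out.println(parseWeekDate("2020-1-1", usMondayStart));
    }
}
```

The first two pairs of lines mirror the two results shown in the cases section: the Sunday-start locale yields 2019-12-29, and the Monday-start locale yields 2019-12-30.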



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
