Kent Yao created SPARK-31879:
--------------------------------

             Summary: First day of week changed for non-MONDAY_START Locales
                 Key: SPARK-31879
                 URL: https://issues.apache.org/jira/browse/SPARK-31879
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0, 3.1.0
            Reporter: Kent Yao

h1. cases

{code:sql}
spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
2019-12-29 00:00:00
spark-sql> set spark.sql.legacy.timeParserPolicy=legacy;
spark.sql.legacy.timeParserPolicy legacy
spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
2019-12-30 00:00:00
{code}

h1. reasons

These week-based fields need a Locale to express their semantics, because the first day of the week varies from country to country.

From the Javadoc of WeekFields:

{code:java}
/**
 * Gets the first day-of-week.
 * <p>
 * The first day-of-week varies by culture.
 * For example, the US uses Sunday, while France and the ISO-8601 standard use Monday.
 * This method returns the first day using the standard {@code DayOfWeek} enum.
 *
 * @return the first day-of-week, not null
 */
public DayOfWeek getFirstDayOfWeek() {
    return firstDayOfWeek;
}
{code}

But for SimpleDateFormat, the day-of-week is not localized:

```
u    Day number of week (1 = Monday, ..., 7 = Sunday)    Number    1
```

Currently, the default locale we use is US, whose week starts on Sunday, so the result moves one day backward (2019-12-29 instead of 2019-12-30). For other countries, please refer to [First Day of the Week in Different Countries|http://chartsbin.com/view/41671]

h1. solution options

1. Use new Locale("en", "GB") as the default locale.
2. For JDK 10 and onwards, we can set the locale Unicode extension 'fw' to 'mon', but this does not work for lower JDKs.
3. Forbid 'u', give users proper exceptions, and enable and document 'e/c'. Currently, 'u' is internally substituted by 'e', but they are not equivalent.

Options 1 and 2 solve this for the default locale, but not for the functions that support a custom locale.

cc [~cloud_fan] [~dongjoon] [~maropu]
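
For anyone who wants to reproduce the locale dependence outside Spark, here is a minimal java.time sketch. It assumes the 'u' to 'e' substitution described above, so it parses with the pattern 'YYYY-w-e' directly; the class and method names (WeekStartDemo, parse) are made up for illustration and are not Spark code. I would expect the US locale to give 2019-12-29 and en-GB to give 2019-12-30, matching the two results in the cases section. The third locale uses the 'fw' extension from option 2, which, as noted there, is only expected to take effect on JDK 10 and onwards.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical demo class for this issue; not part of Spark.
class WeekStartDemo {

    // 'e' is the localized day-of-week; the issue says 'u' is currently substituted by it.
    static final String PATTERN = "YYYY-w-e";

    // Parses week-based text using the week definition of the given locale.
    static LocalDate parse(String text, Locale locale) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern(PATTERN, locale));
    }

    public static void main(String[] args) {
        Locale[] locales = {
            Locale.US,                                  // week starts on Sunday
            Locale.UK,                                  // same as new Locale("en", "GB"), week starts on Monday
            Locale.forLanguageTag("en-US-u-fw-mon")     // option 2: 'fw' extension, honored on newer JDKs only
        };
        for (Locale locale : locales) {
            System.out.println(locale.toLanguageTag()
                + " firstDayOfWeek=" + WeekFields.of(locale).getFirstDayOfWeek()
                + " parsed=" + parse("2020-1-1", locale));
        }
    }
}
```

Because the formatter takes its week definition from the locale it is built with, this also shows why options 1 and 2 only help the default-locale path: a caller that passes its own locale still gets that locale's first day of week.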