This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 9f595c4  [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
9f595c4 is described below

commit 9f595c4ce34728f5d8f943eadea8d85a548b2d41
Author: Max Gekk <max.g...@gmail.com>
AuthorDate: Mon Aug 23 13:07:37 2021 +0300

    [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing

    ### What changes were proposed in this pull request?
    In the PR, I propose to update the SQL migration guide about the changes introduced by the PRs https://github.com/apache/spark/pull/33709 and https://github.com/apache/spark/pull/33769.

    <img width="1011" alt="Screenshot 2021-08-23 at 11 40 35" src="https://user-images.githubusercontent.com/1580697/130419710-640f20b3-6a38-4eb1-a6d6-2e069dc5665c.png">

    ### Why are the changes needed?
    To inform users about the upcoming changes in parsing datetime strings. This should help users migrate to the new release.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    By generating the docs and checking the result visually:
    ```
    $ SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build
    ```

    Closes #33809 from MaxGekk/datetime-cast-migr-guide.

    Authored-by: Max Gekk <max.g...@gmail.com>
    Signed-off-by: Max Gekk <max.g...@gmail.com>
---
 docs/sql-migration-guide.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7ad384f..47e7921 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -26,6 +26,26 @@ license: |

 - Since Spark 3.3, Spark turns a non-nullable schema into nullable for API `DataFrameReader.schema(schema: StructType).json(jsonDataset: Dataset[String])` and `DataFrameReader.schema(schema: StructType).csv(csvDataset: Dataset[String])` when the schema is specified by the user and contains non-nullable fields.

+ - Since Spark 3.3, when the date or timestamp pattern is not specified, Spark converts an input string to a date/timestamp using the `CAST` expression approach. The changes affect CSV/JSON datasources and parsing of partition values. In Spark 3.2 or earlier, when the date or timestamp pattern is not set, Spark uses the default patterns: `yyyy-MM-dd` for dates and `yyyy-MM-dd HH:mm:ss` for timestamps. After the changes, Spark still recognizes the default patterns together with the following patterns:
+
+   Date patterns:
+   * `[+-]yyyy*`
+   * `[+-]yyyy*-[m]m`
+   * `[+-]yyyy*-[m]m-[d]d`
+   * `[+-]yyyy*-[m]m-[d]d `
+   * `[+-]yyyy*-[m]m-[d]d *`
+   * `[+-]yyyy*-[m]m-[d]dT*`
+
+   Timestamp patterns:
+   * `[+-]yyyy*`
+   * `[+-]yyyy*-[m]m`
+   * `[+-]yyyy*-[m]m-[d]d`
+   * `[+-]yyyy*-[m]m-[d]d `
+   * `[+-]yyyy*-[m]m-[d]d [h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+   * `[+-]yyyy*-[m]m-[d]dT[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+   * `[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+   * `T[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+
 ## Upgrading from Spark SQL 3.1 to 3.2

 - Since Spark 3.2, ADD FILE/JAR/ARCHIVE commands require each path to be enclosed by `"` or `'` if the path contains whitespaces.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
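As a quick illustration of the migration-guide entry added by this commit, here is a minimal sketch of the described Spark 3.3 behavior: with no date/timestamp pattern set, CSV parsing falls back to `CAST`-style recognition, so inputs beyond the old defaults `yyyy-MM-dd` / `yyyy-MM-dd HH:mm:ss` are accepted. The local session setup, the object name `CastDatetimeParsingSketch`, and the sample input `2021-8-5T9:7:5` are illustrative assumptions, not part of the commit.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StructField, StructType, TimestampType}

object CastDatetimeParsingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cast-datetime-sketch")
      .getOrCreate()
    import spark.implicits._

    // A timestamp column with no explicit pattern configured: since Spark 3.3
    // the CSV datasource parses such columns via the CAST expression approach.
    val schema = StructType(Seq(StructField("ts", TimestampType)))

    // `2021-8-5T9:7:5` matches the documented form
    // `[+-]yyyy*-[m]m-[d]dT[h]h:[m]m:[s]s...`; the pre-3.3 default pattern
    // `yyyy-MM-dd HH:mm:ss` would not accept single-digit fields or the `T`.
    val csvDs = Seq("2021-8-5T9:7:5").toDS()
    spark.read.schema(schema).csv(csvDs).show(false)

    spark.stop()
  }
}
```

On Spark 3.2 and earlier the same input would be expected to come back as `null` under the CSV reader's default `PERMISSIVE` mode, since it does not match the default timestamp pattern.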
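Similarly, a minimal sketch of the schema-nullability entry visible in the diff context, under the same assumed local-session setup; the expected output is what the guide describes, not verified program output.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object NullableSchemaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("nullable-schema-sketch")
      .getOrCreate()
    import spark.implicits._

    // A user-supplied schema with a non-nullable field.
    val schema = StructType(Seq(StructField("id", LongType, nullable = false)))
    val jsonDs = Seq("""{"id": 1}""").toDS()

    val df = spark.read.schema(schema).json(jsonDs)
    // Per the migration-guide entry, since Spark 3.3 the field should be
    // reported as nullable = true even though the schema said nullable = false.
    df.schema.fields.foreach(f => println(s"${f.name}: nullable=${f.nullable}"))

    spark.stop()
  }
}
```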