[ https://issues.apache.org/jira/browse/SPARK-27790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304965#comment-17304965 ]
Max Gekk commented on SPARK-27790:
----------------------------------

> how will DateTime addition work with the new intervals? Something like this?

[~mrpowers] You are right. We should add such functions (constructors) for both new types. The arithmetic operations are already supported by https://github.com/apache/spark/pull/31832 and https://github.com/apache/spark/pull/31855. Regarding the existing function make_interval(), maybe we could add a rule (under a SQL flag) that splits it into make_year_month_interval() and make_date_time_interval().

> Does ANSI SQL allow operations on dates using the YEAR-MONTH interval type?

[~simeons] As far as I know, yes, it does. This is already supported by https://github.com/apache/spark/pull/31812

> Support ANSI SQL INTERVAL types
> -------------------------------
>
>                 Key: SPARK-27790
>                 URL: https://issues.apache.org/jira/browse/SPARK-27790
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Max Gekk
>            Assignee: Apache Spark
>            Priority: Major
>
> Spark has an INTERVAL data type, but it is “broken”:
> # It cannot be persisted
> # It is not comparable because it crosses the month day line. That is, there
> is no telling whether “1 Month 1 Day” is equal to “1 Month 1 Day” since not
> all months have the same number of days.
> I propose here to introduce the two flavours of INTERVAL as described in the
> ANSI SQL Standard and deprecate Spark's interval type.
> * ANSI describes two non-overlapping “classes”:
> ** YEAR-MONTH,
> ** DAY-SECOND ranges
> * Members within each class can be compared and sorted.
> * Supports datetime arithmetic
> * Can be persisted.
> The old and new flavors of INTERVAL can coexist until Spark INTERVAL is
> eventually retired. Also any semantic “breakage” can be controlled via legacy
> config settings.
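To make the comparability point above concrete, here is a minimal sketch in plain Python (standard library only, not Spark code). The helper names `add_months` and `mixed_vs_days` are hypothetical, and clamping the day to the end of the target month is an assumption about how calendar months are added. It shows that a mixed interval such as "1 month 1 day" compares differently against "31 days" depending on the start date, whereas values within each of the two ANSI classes reduce to a single number and order consistently.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month (assumed rule)."""
    total = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(start.day, last_day))


def mixed_vs_days(start: date) -> str:
    """Compare start + '1 month 1 day' with start + '31 days'."""
    mixed = add_months(start, 1) + timedelta(days=1)
    days_only = start + timedelta(days=31)
    if mixed < days_only:
        return "less"
    if mixed == days_only:
        return "equal"
    return "greater"


# The same two intervals order differently depending on the start date.
results = {
    d.isoformat(): mixed_vs_days(d)
    for d in (date(2021, 2, 1), date(2021, 4, 1), date(2021, 3, 1))
}
print(results)
# {'2021-02-01': 'less', '2021-04-01': 'equal', '2021-03-01': 'greater'}

# Within one class the comparison is unambiguous.
# YEAR-MONTH: reduce to a total number of months.
year_month_a = 1 * 12 + 2   # 1 year 2 months
year_month_b = 15           # 15 months
print(year_month_a < year_month_b)  # True

# DAY-SECOND: reduce to a fixed-length duration.
day_time_a = timedelta(days=1, hours=2)
day_time_b = timedelta(hours=27)
print(day_time_a < day_time_b)  # True
```

The sketch only illustrates the ordering argument; it says nothing about how Spark represents or compares the new types internally.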
> *Milestone 1* -- Spark Interval equivalency (The new interval types meet
> or exceed all functionality of the existing SQL Interval):
> * Add two new DataType implementations for interval year-month and
> day-second. Includes the JSON format and DDL string.
> * Infra support: check the caller sides of DateType/TimestampType
> * Support the two new interval types in Dataset/UDF.
> * Interval literals (with a legacy config to still allow mixed year-month
> day-seconds fields and return legacy interval values)
> * Interval arithmetic (interval * num, interval / num, interval +/- interval)
> * Datetime functions/operators: Datetime - Datetime (to days or day second),
> Datetime +/- interval
> * Cast to and from the two new interval types, cast string to interval, cast
> interval to string (pretty printing), with the SQL syntax to specify the types
> * Support sorting intervals.
> *Milestone 2* -- Persistence:
> * Ability to create tables of type interval
> * Ability to write to common file formats such as Parquet and JSON.
> * INSERT, SELECT, UPDATE, MERGE
> * Discovery
> *Milestone 3* -- Client support
> * JDBC support
> * Hive Thrift server
> *Milestone 4* -- PySpark and Spark R integration
> * Python UDF can take and return intervals
> * DataFrame support