[ https://issues.apache.org/jira/browse/SPARK-27790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304965#comment-17304965 ]

Max Gekk commented on SPARK-27790:
----------------------------------

> how will DateTime addition work with the new intervals?  Something like this?

[~mrpowers] You are right. We should add such functions (constructors) for both 
new types. The arithmetic operations are already supported by 
https://github.com/apache/spark/pull/31832 and 
https://github.com/apache/spark/pull/31855. Regarding the existing function 
make_interval(), maybe we could add a rule (under a SQL flag) which splits it 
into make_year_month_interval() and make_date_time_interval().

> Does ANSI SQL allow operations on dates using the YEAR-MONTH interval type?

[~simeons] As far as I know, yes, it does. This is already supported by 
https://github.com/apache/spark/pull/31812

> Support ANSI SQL INTERVAL types
> -------------------------------
>
>                 Key: SPARK-27790
>                 URL: https://issues.apache.org/jira/browse/SPARK-27790
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Max Gekk
>            Assignee: Apache Spark
>            Priority: Major
>
> Spark has an INTERVAL data type, but it is “broken”:
> # It cannot be persisted.
> # It is not comparable because it crosses the month-day line. That is, there 
> is no telling whether “1 Month 1 Day” equals “1 Month 1 Day”, since not all 
> months have the same number of days.
> I propose here to introduce the two flavours of INTERVAL described in the 
> ANSI SQL standard and to deprecate Spark's interval type.
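The non-comparability point can be seen with plain Python dates. This is only an illustration of the semantics, not Spark code; the month-rollover helper is deliberately naive (no day clamping, which the chosen dates do not need).

```python
from datetime import date

def add_one_month_one_day(d):
    # Naive "1 month 1 day" addition: roll the month forward, then add a day.
    # Sufficient for this illustration; not a general-purpose implementation.
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    return date(year, month, d.day + 1)

# The same interval "1 Month 1 Day" spans a different number of actual days
# depending on where it starts, so two such intervals cannot be reliably
# compared or sorted.
jan = (add_one_month_one_day(date(2021, 1, 1)) - date(2021, 1, 1)).days
feb = (add_one_month_one_day(date(2021, 2, 1)) - date(2021, 2, 1)).days
print(jan, feb)  # 32 vs. 29 days
```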
> * ANSI describes two non-overlapping “classes”:
> ** YEAR-MONTH,
> ** DAY-SECOND ranges.
> * Members within each class can be compared and sorted.
> * Both classes support datetime arithmetic.
> * Both can be persisted.
> The old and new flavours of INTERVAL can coexist until Spark INTERVAL is 
> eventually retired. Also, any semantic “breakage” can be controlled via 
> legacy config settings. 
> *Milestone 1* -- Spark Interval equivalency (the new interval types meet 
> or exceed all functionality of the existing SQL Interval):
> * Add two new DataType implementations for interval year-month and 
> day-second, including the JSON format and DDL string.
> * Infra support: check the caller sides of DateType/TimestampType
> * Support the two new interval types in Dataset/UDF.
> * Interval literals (with a legacy config to still allow mixed year-month 
> and day-second fields and return legacy interval values)
> * Interval arithmetic (interval * num, interval / num, interval +/- interval)
> * Datetime functions/operators: Datetime - Datetime (to days or day second), 
> Datetime +/- interval
> * Cast to and from the new two interval types, cast string to interval, cast 
> interval to string (pretty printing), with the SQL syntax to specify the types
> * Support sorting intervals.
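The arithmetic, datetime operations, and sorting listed for Milestone 1 can be sketched in plain Python, using datetime.timedelta as a stand-in for the proposed day-second interval type (it is not Spark's implementation, just an analogue with the same operation set):

```python
from datetime import datetime, timedelta

# timedelta here stands in for a day-second interval; illustration only.
iv = timedelta(days=1, hours=2)

print(iv * 3)            # interval * num
print(iv / 2)            # interval / num
print(iv + iv, iv - iv)  # interval +/- interval

# Datetime - Datetime yields a day-second interval;
# Datetime +/- interval yields a datetime.
t1 = datetime(2021, 3, 20, 12, 0)
t2 = datetime(2021, 3, 19, 6, 0)
diff = t1 - t2
print(diff, t2 + iv)

# Within one class, intervals are comparable, so they sort.
print(sorted([iv * 2, iv, iv * 3]))
```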
> *Milestone 2* -- Persistence:
> * Ability to create tables with columns of interval type
> * Ability to write to common file formats such as Parquet and JSON.
> * INSERT, SELECT, UPDATE, MERGE
> * Discovery
> *Milestone 3* --  Client support
> * JDBC support
> * Hive Thrift server
> *Milestone 4* -- PySpark and SparkR integration
> * Python UDF can take and return intervals
> * DataFrame support



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
