[jira] [Updated] (SPARK-45784) Introduce clustering mechanism to Spark

Terry Kim (Jira) Fri, 03 Nov 2023 16:04:10 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-45784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Terry Kim updated SPARK-45784:
------------------------------
    Description: This proposes to introduce a clustering mechanism such that 
different data sources (e.g., Delta, Iceberg) can implement 
format-specific clustering.  (was: This proposes to introduce a CLUSTER BY 
clause to the CREATE/REPLACE SQL syntax:
{code:java}
CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b){code}
Spark itself will not provide an implementation; it is up to the catalog 
implementation (e.g., Delta, Iceberg) to use the clustering information.

Note that specifying CLUSTER BY will throw an exception if the table being 
created is for a v1 source or the session catalog (e.g., a v2 source with the 
session catalog).)
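
For illustration only, the rule in the earlier description could be sketched as below. This is a toy model in Python, not Spark code: `TableSpec`, `ClusterByNotSupported`, and `validate_cluster_by` are hypothetical names, and the two boolean flags simply stand in for "v1 source" and "session catalog". The sketch only shows the intended behavior: clustering columns are passed through to the catalog untouched, and the unsupported cases raise.

```python
from dataclasses import dataclass, field


class ClusterByNotSupported(Exception):
    """Hypothetical error for a table that cannot honor CLUSTER BY."""


@dataclass
class TableSpec:
    # Hypothetical stand-in for what a CREATE/REPLACE TABLE statement carries.
    name: str
    columns: dict                      # column name -> type name
    cluster_by: list = field(default_factory=list)
    is_v1_source: bool = False         # assumption: flag for a v1 data source
    uses_session_catalog: bool = False  # assumption: flag for the session catalog


def validate_cluster_by(spec: TableSpec) -> list:
    """Return the clustering columns to hand to the catalog, or raise."""
    if not spec.cluster_by:
        # No CLUSTER BY clause: nothing to validate or pass along.
        return []
    if spec.is_v1_source or spec.uses_session_catalog:
        raise ClusterByNotSupported(
            f"CLUSTER BY is not supported for table {spec.name!r}"
        )
    # No clustering is performed here; the catalog decides how to use it.
    return list(spec.cluster_by)
```

With `CREATE TABLE tbl(a int, b string) CLUSTER BY (a, b)` modeled as `TableSpec("tbl", {"a": "int", "b": "string"}, cluster_by=["a", "b"])`, the function returns the two column names unchanged; setting either flag makes it raise instead.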

> Introduce clustering mechanism to Spark
> ---------------------------------------
>
>                 Key: SPARK-45784
>                 URL: https://issues.apache.org/jira/browse/SPARK-45784
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Terry Kim
>            Priority: Major
>
> This proposes to introduce a clustering mechanism such that different data 
> sources (e.g., Delta, Iceberg) can implement format-specific clustering.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
