Chao Sun created SPARK-37377:
--------------------------------

             Summary: Refactor V2 Partitioning interface and remove deprecated usage of Distribution
                 Key: SPARK-37377
                 URL: https://issues.apache.org/jira/browse/SPARK-37377
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Chao Sun


Currently {{Partitioning}} is defined as follows:
{code:java}
@Evolving
public interface Partitioning {
  int numPartitions();
  boolean satisfy(Distribution distribution);
}
{code}

There are two issues with the interface: 1) it uses the deprecated 
{{Distribution}} interface and should switch to 
{{org.apache.spark.sql.connector.distributions.Distribution}}; 2) there is 
currently no way to use it in a join, where we want to compare the 
partitionings reported by both sides and decide whether they are 
"compatible" (which would allow Spark to eliminate the shuffle).
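
To make the second issue concrete, here is a minimal sketch of the kind of 
check a join would need. Everything in it is hypothetical and only for 
illustration: {{ReportedPartitioning}}, {{SimplePartitioning}}, 
{{clusteringKeyTypes}} and {{compatible}} are made-up names, not existing or 
proposed Spark APIs, and the compatibility rule (same number of partitions 
and same clustering key types) is an assumed simplification.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types exist in Spark under these names.
public class PartitioningSketch {

  /** Hypothetical stand-in for the partitioning reported by one side of a join. */
  public interface ReportedPartitioning {
    int numPartitions();

    // Types of the columns the data is clustered by, in order.
    List<String> clusteringKeyTypes();
  }

  /** Simple immutable implementation used only for this example. */
  public static final class SimplePartitioning implements ReportedPartitioning {
    private final int numPartitions;
    private final List<String> keyTypes;

    public SimplePartitioning(int numPartitions, String... keyTypes) {
      this.numPartitions = numPartitions;
      this.keyTypes = Arrays.asList(keyTypes);
    }

    @Override
    public int numPartitions() {
      return numPartitions;
    }

    @Override
    public List<String> clusteringKeyTypes() {
      return keyTypes;
    }
  }

  /**
   * Assumed rule: two sides are "compatible" when they report the same number
   * of partitions and the same clustering key types. A real check would
   * likely need to consider more than this.
   */
  public static boolean compatible(ReportedPartitioning left, ReportedPartitioning right) {
    return left.numPartitions() == right.numPartitions()
        && left.clusteringKeyTypes().equals(right.clusteringKeyTypes());
  }

  public static void main(String[] args) {
    ReportedPartitioning orders = new SimplePartitioning(8, "bigint");
    ReportedPartitioning customers = new SimplePartitioning(8, "bigint");
    ReportedPartitioning events = new SimplePartitioning(16, "bigint");

    System.out.println(compatible(orders, customers)); // prints "true"
    System.out.println(compatible(orders, events));    // prints "false"
  }
}
```

The current {{satisfy(Distribution)}} method only answers whether one 
partitioning meets a given distribution, so a two-sided comparison like 
{{compatible}} above cannot be expressed through it.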


