Enrico Minack created SPARK-39644:
-------------------------------------

             Summary: Add RangePartitioning to DataSource V2
                 Key: SPARK-39644
                 URL: https://issues.apache.org/jira/browse/SPARK-39644
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Enrico Minack

DataSourceV2 allows data sources to report the existing partitioning of the data they read (org.apache.spark.sql.connector.read.partitioning). Currently, there are only two implementations: KeyGroupedPartitioning and UnknownPartitioning. Data sources should also be able to report globally ordered data so that downstream operations can exploit this ordering.

The following is required for this to work:
- Define RangePartitioning as a new implementation of Partitioning
- Add Catalyst rules that handle this partitioning
- Add a test source that reports ordering, to prove that subsequent operations that require ordered data do not sort the data again
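
To make the proposal concrete, here is a minimal, self-contained sketch of what a RangePartitioning could look like and how a planner might use it to skip a sort. Everything in it is an assumption for illustration: the Partitioning interface is a local stand-in for the real one in org.apache.spark.sql.connector.read.partitioning, the RangePartitioning shape (a list of column names, ascending order only) is hypothetical, and SortPlanner is an invented helper, not a Catalyst rule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-in for org.apache.spark.sql.connector.read.partitioning.Partitioning,
// redeclared here so the sketch compiles without Spark on the classpath.
interface Partitioning {
    int numPartitions();
}

// Hypothetical: the class proposed by this issue. Its real shape is undecided;
// this sketch assumes ascending order over a list of column names.
final class RangePartitioning implements Partitioning {
    private final List<String> ordering;
    private final int numPartitions;

    RangePartitioning(List<String> ordering, int numPartitions) {
        this.ordering = Collections.unmodifiableList(new ArrayList<>(ordering));
        this.numPartitions = numPartitions;
    }

    @Override
    public int numPartitions() {
        return numPartitions;
    }

    List<String> ordering() {
        return ordering;
    }

    // Data ordered by (a, b) is also ordered by (a), so a required ordering
    // is satisfied when it is a prefix of the reported ordering.
    boolean satisfies(List<String> required) {
        return required.size() <= ordering.size()
            && ordering.subList(0, required.size()).equals(required);
    }
}

// Hypothetical helper standing in for the planner-side decision.
final class SortPlanner {
    // Returns true when a sort is still needed for the required ordering.
    static boolean needsSort(Partitioning reported, List<String> required) {
        if (reported instanceof RangePartitioning) {
            return !((RangePartitioning) reported).satisfies(required);
        }
        // Any other partitioning says nothing about ordering.
        return true;
    }
}
```

With a source reporting new RangePartitioning(Arrays.asList("a", "b"), 4), a downstream operation that requires ordering by "a" would not need a sort in this sketch, while one that requires ordering by "b" still would.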