cloud-fan commented on code in PR #37211:
URL: https://github.com/apache/spark/pull/37211#discussion_r948663939
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala:
##########
@@ -119,13 +119,16 @@ case class DataSourceV2Relation(
* @param output the output attributes of this relation
* @param keyGroupedPartitioning if set, the partitioning expressions that are used to split the
* rows in the scan across different partitions
- * @param ordering if set, the ordering provided by the scan
+ * @param rangePartitioning if set, the range partitioning expressions that are used to split the
+ * rows in the scan across different partitions
+ * @param ordering if set, the in-partition ordering provided by the scan
*/
case class DataSourceV2ScanRelation(
relation: DataSourceV2Relation,
scan: Scan,
output: Seq[AttributeReference],
keyGroupedPartitioning: Option[Seq[Expression]] = None,
+ rangePartitioning: Option[Seq[SortOrder]] = None,
Review Comment:
We need to clearly define the semantics here, as there are now two sort
orders. More specifically, what if a scan reports both `RangePartitioning`
and an ordering? There are a few options:
1. Require the reported `RangePartitioning` and ordering to be compatible.
2. Let `RangePartitioning` define only the cross-partition ordering, while
`SupportsReportOrdering` covers data ordering within each partition.
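
To make the difference between the two options concrete, here is a minimal sketch in plain Scala with no Spark dependency. Everything in it is hypothetical: `SimpleSortOrder` is a simplified stand-in for Catalyst's `SortOrder`, partitions are modeled as `Seq[Seq[Int]]`, and "compatible" is assumed to mean that the range partitioning order is a prefix of the in-partition ordering, which is only one possible reading of option 1.

```scala
// Hypothetical, simplified stand-in for Catalyst's SortOrder (not the real class).
final case class SimpleSortOrder(column: String, ascending: Boolean = true)

object OrderingSemanticsSketch {

  // Option 1 (assumed reading): the reported in-partition ordering is
  // "compatible" if it starts with the range partitioning order.
  def isCompatible(
      rangePartitioning: Seq[SimpleSortOrder],
      ordering: Seq[SimpleSortOrder]): Boolean =
    ordering.startsWith(rangePartitioning)

  // Option 2, cross-partition property: every value in a partition is <=
  // every value in the next non-empty partition (ascending, single key).
  def isRangePartitioned(parts: Seq[Seq[Int]]): Boolean = {
    val nonEmpty = parts.filter(_.nonEmpty)
    nonEmpty.zip(nonEmpty.drop(1)).forall { case (a, b) => a.max <= b.min }
  }

  // Option 2, in-partition property: each partition is sorted on its own.
  def isSortedWithinPartitions(parts: Seq[Seq[Int]]): Boolean =
    parts.forall(p => p.zip(p.drop(1)).forall { case (x, y) => x <= y })

  // Concatenating the partitions yields sorted data only if both hold.
  def isGloballySorted(parts: Seq[Seq[Int]]): Boolean =
    isRangePartitioned(parts) && isSortedWithinPartitions(parts)
}
```

Under option 2, a scan could report partitions such as `Seq(Seq(3, 1, 2), Seq(5, 4))`: range-partitioned across partitions, but with no ordering guarantee inside each one. Under option 1, reporting both would only be valid when something like `isCompatible` holds.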
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]