wiedld commented on code in PR #15409:
URL: https://github.com/apache/datafusion/pull/15409#discussion_r2076020887


##########
datafusion/datasource/src/memory.rs:
##########
@@ -723,6 +761,222 @@ impl MemorySourceConfig {
     pub fn original_schema(&self) -> SchemaRef {
         Arc::clone(&self.schema)
     }
+
+    /// Repartition while preserving order.
+    ///
+    /// Returns `Ok(None)` if cannot fulfill the requested repartitioning, such
+    /// as having too few batches to fulfill the `target_partitions` or if 
unable
+    /// to preserve output ordering.
+    fn repartition_preserving_order(
+        &self,
+        target_partitions: usize,
+        output_ordering: LexOrdering,
+    ) -> Result<Option<Vec<Vec<RecordBatch>>>> {
+        if !self.eq_properties().ordering_satisfy(&output_ordering) {
+            Ok(None)
+        } else if self.partitions.len() == 1 {
+            self.repartition_evenly_by_size(target_partitions)
+        } else {

Review Comment:
   Thank you for the code review.
   
   After adding an additional test case targeting the `self.partitions.len() == 
1`, and ended up removing this conditional due to this: 
https://github.com/apache/datafusion/pull/15409/commits/57147add2bb1d540334b08639504b80930c1c982
   
   Additionally, I added code comments about how the max heap is being used: 
https://github.com/apache/datafusion/pull/15409/commits/12e6dfb1dec6038b47a63019ee6965ed626bc68e
   
   I believe the `RePartition` struct already documents what prop is the max 
heap cmp, vs ordering.
   
   Please let me know if you spot anything else. Thank you.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to