Dandandan commented on a change in pull request #9548:
URL: https://github.com/apache/arrow/pull/9548#discussion_r581664638
##########
File path: rust/datafusion/src/physical_plan/repartition.rs
##########
@@ -305,6 +347,33 @@ mod tests {
Ok(())
}
+ #[tokio::test(flavor = "multi_thread")]
+ async fn many_to_many_hash_partition() -> Result<()> {
+ // define input partitions
+ let schema = test_schema();
+ let partition = create_vec_batches(&schema, 50);
+ let partitions = vec![partition.clone(), partition.clone(), partition.clone()];
+
+ let output_partitions = repartition(
+ &schema,
+ partitions,
+ Partitioning::Hash(
+ vec![Arc::new(crate::physical_plan::expressions::Column::new(
+ &"c0",
+ ))],
+ 8,
+ ),
+ )
+ .await?;
+
+ let total_rows: usize = output_partitions.iter().map(|x| x.len()).sum();
+
+ assert_eq!(8, output_partitions.len());
Review comment:
Makes sense, but I'm not sure how to do that currently, because the result
depends on the random state: in a very rare case, all rows could hash to the
same value and end up in the same partition.
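
To illustrate the dependence on random state, here is a minimal standalone sketch using only the Rust standard library. It is not DataFusion's repartition code; `partition_counts` and `FixedState` are hypothetical names introduced for this example. It assumes a hasher built from a fixed state is acceptable in tests, in which case the row-to-partition assignment becomes reproducible within a given build.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical helper: count how many values land in each of
/// `num_partitions` buckets when hashed with hashers built from `state`.
fn partition_counts<S: BuildHasher>(
    values: &[u32],
    num_partitions: usize,
    state: &S,
) -> Vec<usize> {
    assert!(num_partitions > 0, "need at least one partition");
    let mut counts = vec![0usize; num_partitions];
    for v in values {
        // Build a fresh hasher per value so each hash is independent.
        let mut hasher = state.build_hasher();
        v.hash(&mut hasher);
        let idx = (hasher.finish() % num_partitions as u64) as usize;
        counts[idx] += 1;
    }
    counts
}

/// Hypothetical fixed-state builder. `DefaultHasher::new()` always starts
/// from the same keys, so results repeat within one build of the program.
/// The algorithm itself is unspecified and may differ across Rust releases.
#[derive(Default, Clone)]
struct FixedState;

impl BuildHasher for FixedState {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        DefaultHasher::new()
    }
}
```

With `std::collections::hash_map::RandomState::new()` passed as `state`, only the number of partitions and the total row count are safe to assert, since the per-partition counts change from run to run. With `FixedState`, two calls on the same input return identical counts, so a test could also compare per-partition counts between runs.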