alamb commented on code in PR #8658:
URL: https://github.com/apache/arrow-datafusion/pull/8658#discussion_r1437865194


##########
datafusion/physical-plan/src/joins/hash_join.rs:
##########
@@ -156,8 +155,48 @@ impl JoinLeftData {
 ///
 /// Execution proceeds in 2 stages:
 ///
-/// 1. the **build phase** where a hash table is created from the tuples of the
-/// build side.
+/// 1. the **build phase** creates a hash table from the tuples of the build 
side,

Review Comment:
   thank you for this



##########
datafusion/physical-plan/src/joins/utils.rs:
##########
@@ -151,6 +151,82 @@ pub trait JoinHashMapType {
     fn get_map(&self) -> &RawTable<(u64, u64)>;
     /// Returns a reference to the next.
     fn get_list(&self) -> &Self::NextType;
+

Review Comment:
   `JoinHashMap` and this trait are getting big enough maybe to warrant moving 
to its own module (in a separate PR, perhaps)



##########
datafusion/physical-plan/src/joins/hash_join.rs:
##########
@@ -1635,12 +1652,13 @@ mod tests {
             "| a1 | b1 | c1 | a2 | b2 | c2 |",
             "+----+----+----+----+----+----+",
             "| 1  | 4  | 7  | 10 | 4  | 70 |",
-            "| 2  | 5  | 8  | 20 | 5  | 80 |",
             "| 3  | 5  | 9  | 20 | 5  | 80 |",
+            "| 2  | 5  | 8  | 20 | 5  | 80 |",

Review Comment:
   this also is neat to see it keep the same order as the input



##########
datafusion/physical-plan/src/joins/hash_join.rs:
##########
@@ -1558,7 +1573,9 @@ mod tests {
             "| 3  | 5  | 9  | 20 | 5  | 80 |",
             "+----+----+----+----+----+----+",
         ];
-        assert_batches_sorted_eq!(expected, &batches);
+
+        // Inner join output is expected to preserve both inputs order

Review Comment:
   👍 



##########
datafusion/physical-plan/src/joins/hash_join.rs:
##########
@@ -1611,9 +1628,9 @@ mod tests {
     async fn join_inner_one_no_shared_column_names() -> Result<()> {
         let task_ctx = Arc::new(TaskContext::default());
         let left = build_table(
-            ("a1", &vec![1, 2, 3]),
+            ("a1", &vec![1, 3, 2]),

Review Comment:
   why was this test changed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to