berkaysynnada commented on code in PR #9830:
URL: https://github.com/apache/arrow-datafusion/pull/9830#discussion_r1546081287
##########
datafusion/physical-plan/src/joins/cross_join.rs:
##########
@@ -311,20 +311,27 @@ fn stats_cartesian_product(
}
}
-/// A stream that issues [RecordBatch]es as they arrive from the right of the join.
+/// A stream that issues [RecordBatch]es as they arrive from the right of the join.
+/// Right column orders are preserved.
struct CrossJoinStream {
/// Input schema
schema: Arc<Schema>,
- /// future for data from left side
+ /// Future for data from left side
left_fut: OnceFut<JoinLeftData>,
- /// right
+ /// Right stream
right: SendableRecordBatchStream,
- /// Current value on the left
- left_index: usize,
- /// Current batch being processed from the right side
- right_batch: Arc<parking_lot::Mutex<Option<RecordBatch>>>,
- /// join execution metrics
+ /// Join execution metrics
join_metrics: BuildProbeJoinMetrics,
+ /// State information
+ state: CrossJoinStreamState,
+ /// Left data
+ left_data: Vec<RecordBatch>,
+ /// Current right batch
+ right_batch: RecordBatch,
+ /// Indexes the next processed build side batch
+ left_batch_index: usize,
Review Comment:
`build_batch` also uses them. I tried it just now, but the current version
seems simpler than passing them to the function as `&mut` parameters or
unwrapping the state to extract them.
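
To illustrate the trade-off, here is a minimal sketch that keeps the fields on the struct so that `build_batch` can reach them through `&mut self`. The names `SketchStream`, `SketchState`, `Batch`, and `step` are hypothetical stand-ins and not the actual DataFusion types, and the row pairing is only a placeholder for the real join logic:

```rust
// Hypothetical stand-in for a record batch; NOT the Arrow/DataFusion type.
type Batch = Vec<i32>;

// The state carries no data, so nothing has to be unwrapped from it.
#[derive(Debug, PartialEq)]
enum SketchState {
    BuildBatches,
    Done,
}

struct SketchStream {
    state: SketchState,
    left_data: Vec<Batch>,
    right_batch: Batch,
    left_batch_index: usize,
}

impl SketchStream {
    fn new(left_data: Vec<Batch>, right_batch: Batch) -> Self {
        Self {
            state: SketchState::BuildBatches,
            left_data,
            right_batch,
            left_batch_index: 0,
        }
    }

    // Reads `left_data`, `right_batch`, and `left_batch_index` directly from
    // `self`, so none of them need to be threaded through as `&mut` parameters.
    fn build_batch(&mut self) -> Option<Vec<(i32, i32)>> {
        let left = self.left_data.get(self.left_batch_index)?;
        // Pair every left row with every right row, keeping right order.
        let out: Vec<(i32, i32)> = left
            .iter()
            .flat_map(|l| self.right_batch.iter().map(move |r| (*l, *r)))
            .collect();
        self.left_batch_index += 1;
        Some(out)
    }

    // Drives the state machine; the match does not bind any payload.
    fn step(&mut self) -> Option<Vec<(i32, i32)>> {
        match self.state {
            SketchState::BuildBatches => match self.build_batch() {
                Some(batch) => Some(batch),
                None => {
                    self.state = SketchState::Done;
                    None
                }
            },
            SketchState::Done => None,
        }
    }
}
```

If the fields lived inside a `SketchState` variant instead, `step` would have to destructure the variant (or pass its contents on as `&mut` arguments) before every call to `build_batch`, which is the extra plumbing the current version avoids.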