Copilot commented on code in PR #1701:
URL: https://github.com/apache/auron/pull/1701#discussion_r2596904575
##########
native-engine/datafusion-ext-plans/src/joins/test.rs:
##########
@@ -264,6 +264,92 @@ mod tests {
Ok((columns, batches))
}
+ async fn join_collect_with_batch_size(
+ test_type: TestType,
+ left: Arc<dyn ExecutionPlan>,
+ right: Arc<dyn ExecutionPlan>,
+ on: JoinOn,
+ join_type: JoinType,
+ batch_size: usize,
+ ) -> Result<(Vec<String>, Vec<RecordBatch>)> {
+ MemManager::init(1000000);
+ let session_config = SessionConfig::new().with_batch_size(batch_size);
+ let session_ctx = SessionContext::new_with_config(session_config);
+ let session_ctx = SessionContext::new();
Review Comment:
Dead code: The configured `session_ctx` on line 277 is immediately
overwritten on line 278 with a new context that doesn't use the
`session_config`. This means the `batch_size` parameter is never actually
applied to the test, defeating the purpose of this test function.
Remove line 278 to use the correctly configured session context.
```suggestion
```
##########
native-engine/datafusion-ext-plans/src/joins/smj/full_join.rs:
##########
@@ -56,6 +56,10 @@ impl<const L_OUTER: bool, const R_OUTER: bool>
FullJoiner<L_OUTER, R_OUTER> {
self.lindices.len() >= self.join_params.batch_size
}
+ fn has_enough_room(&self, new_size: usize) -> bool {
+ self.lindices.len() + new_size < self.join_params.batch_size
Review Comment:
Off-by-one error: The condition uses `<` but should use `<=` to ensure that
adding `new_size` elements won't exceed `batch_size`. With the current logic,
if `lindices.len() + new_size == batch_size`, the function returns `true` (has
room), but adding all elements would make the buffer exactly equal to
`batch_size`, which violates the intended constraint of staying under the limit.
Change the condition to: `self.lindices.len() + new_size <=
self.join_params.batch_size`
```suggestion
self.lindices.len() + new_size <= self.join_params.batch_size
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]