2010YOUY01 commented on code in PR #19893:
URL: https://github.com/apache/datafusion/pull/19893#discussion_r2787434025
##########
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##########
@@ -687,26 +797,14 @@ impl HashJoinExec {
/// Return new instance of [HashJoinExec] with the given projection.
pub fn with_projection(&self, projection: Option<Vec<usize>>) -> Result<Self> {
+ let projection = projection.map(Into::into);
// check if the projection is valid
- can_project(&self.schema(), projection.as_ref())?;
- let projection = match projection {
- Some(projection) => match &self.projection {
- Some(p) => Some(projection.iter().map(|i| p[*i]).collect()),
- None => Some(projection),
- },
- None => None,
- };
- Self::try_new(
- Arc::clone(&self.left),
- Arc::clone(&self.right),
- self.on.clone(),
- self.filter.clone(),
- &self.join_type,
- projection,
- self.mode,
- self.null_equality,
- self.null_aware,
- )
+ can_project(&self.schema(), projection.as_deref())?;
+ let projection =
+ combine_projections(projection.as_ref(), self.projection.as_ref())?;
+ HashJoinExecBuilder::from(self)
Review Comment:
I find this builder pattern a bit risky. Currently,
`HashJoinExecBuilder::from(self)` clones some fields from `self` while
implicitly resetting others (for example, the dynamic filter).
This behavior may be surprising for certain transformations. It might be
safer to make such partial resets explicit, so callers are clearly aware of
which fields are preserved and which are dropped.
Would it be possible to remove these builders and require all cloning to be
explicit instead? Since immutable fields are already wrapped in `Arc`, planning
performance should not be affected. The main downside I see is increased
verbosity.
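
To make the concern concrete, here is a minimal sketch of the two styles. All names in it (`JoinNode`, `JoinNodeBuilder`, `with_projection_explicit`) are hypothetical stand-ins, not DataFusion types, and the real `HashJoinExec` has many more fields. It only illustrates where the reset of the dynamic filter becomes visible to the reader.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a plan node; NOT the real HashJoinExec.
#[derive(Debug, Clone)]
struct JoinNode {
    left: Arc<str>,                   // immutable input, cheap to share via Arc
    projection: Option<Vec<usize>>,   // output column indices
    dynamic_filter: Option<Arc<str>>, // runtime state a rewrite may need to drop
}

// Builder style: `from` silently decides which fields survive.
struct JoinNodeBuilder {
    left: Arc<str>,
    projection: Option<Vec<usize>>,
    dynamic_filter: Option<Arc<str>>,
}

impl From<&JoinNode> for JoinNodeBuilder {
    fn from(node: &JoinNode) -> Self {
        Self {
            left: Arc::clone(&node.left),
            projection: node.projection.clone(),
            // Implicit reset: nothing at the call site hints at this.
            dynamic_filter: None,
        }
    }
}

impl JoinNodeBuilder {
    fn with_projection(mut self, projection: Option<Vec<usize>>) -> Self {
        self.projection = projection;
        self
    }

    fn build(self) -> JoinNode {
        JoinNode {
            left: self.left,
            projection: self.projection,
            dynamic_filter: self.dynamic_filter,
        }
    }
}

// Explicit style: every field is listed where the new node is created,
// so a reader sees what is preserved and what is dropped.
fn with_projection_explicit(node: &JoinNode, projection: Option<Vec<usize>>) -> JoinNode {
    JoinNode {
        left: Arc::clone(&node.left), // preserved (shared, not deep-copied)
        projection,                   // replaced
        dynamic_filter: None,         // dropped on purpose
    }
}
```

Both paths produce the same node here. The difference is that with the explicit version the dropped field shows up in the diff of every transformation that drops it, at the cost of the extra verbosity mentioned above.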
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]