alamb commented on code in PR #5754:
URL: https://github.com/apache/arrow-datafusion/pull/5754#discussion_r1155944850
##########
datafusion/core/src/execution/context.rs:
##########
@@ -1293,9 +1293,6 @@ impl SessionState {
// repartitioning and local sorting steps to meet distribution and ordering requirements.
// Therefore, it should run before EnforceDistribution and EnforceSorting.
Arc::new(JoinSelection::new()),
- // Enforce sort before PipelineFixer
Review Comment:
👍
##########
datafusion/common/src/config.rs:
##########
@@ -280,6 +280,10 @@ config_namespace! {
/// using the provided `target_partitions` level
pub repartition_joins: bool, default = true
+ /// Should DataFusion allow symmetric hash joins for unbounded data sources even when
+ /// its inputs do not have any ordering or filtering
+ pub allow_symmetric_joins_without_pruning: bool, default = true
Review Comment:
I don't understand how a symmetric hash join could generate correct results
when the inputs don't have any ordering 🤔 Maybe we can add some additional
comments explaining the circumstances under which one would enable or
disable this option.