zhztheplayer commented on PR #5750: URL: https://github.com/apache/incubator-gluten/pull/5750#issuecomment-2128498237
> Both of them can be (and should be?) moved to ColumnOverrides.

If the custom strategy can be removed by moving the code to ColumnarOverrides (without more workarounds), I would be inclined to do that, since it:

1. Simplifies the code
2. Creates a "more vanilla" plan when the join operators fall back

> why is the design forced to use ShuffledHashJoin instead of judging based on the threshold?

Mainly because hash join is faster and better maintained than sort-merge join in Velox, if I remember correctly.

> If the left and right tables are both very large, will it not cause velox memory problems?

This is very possible. Velox's grace hash join has a maximum spill level (4 by default), so OOM will happen for huge inputs. We may eventually need to provide production-ready SMJ support in Gluten.
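
To illustrate why a capped spill level bounds the input a grace hash join can handle, here is a rough back-of-the-envelope sketch. It is not Velox code: the function names and the `fanout` parameter (partitions created per spill level) are hypothetical, and it assumes perfectly uniform hashing with no key skew, which is optimistic.

```python
def levels_needed(build_bytes: int, memory_bytes: int, fanout: int) -> int:
    """Count the recursive partitioning passes needed until one partition fits in memory.

    Assumes each pass splits the data evenly into `fanout` partitions
    (hypothetical parameter; real engines see skew).
    """
    levels = 0
    size = build_bytes
    while size > memory_bytes:
        size = -(-size // fanout)  # ceiling division: size of one partition after this pass
        levels += 1
    return levels


def fits_within_spill_limit(build_bytes: int, memory_bytes: int,
                            fanout: int, max_spill_level: int) -> bool:
    """True if the build side can be reduced to memory-sized partitions
    without exceeding the maximum spill level."""
    return levels_needed(build_bytes, memory_bytes, fanout) <= max_spill_level


GIB = 1 << 30
TIB = 1 << 40

# With 1 GiB of join memory, a fanout of 8, and a limit of 4 levels, the
# largest build side that still fits is 1 GiB * 8**4 = 4 TiB.
print(fits_within_spill_limit(10 * GIB, GIB, fanout=8, max_spill_level=4))
print(fits_within_spill_limit(5 * TIB, GIB, fanout=8, max_spill_level=4))
```

Under these assumptions, anything beyond `memory_bytes * fanout ** max_spill_level` cannot be partitioned down far enough, which is where the OOM risk for very large inputs comes from. Skewed keys would lower that bound further.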
