rednaxelafx commented on issue #25642: [SPARK-28916][SQL] Split subexpression elimination functions code
URL: https://github.com/apache/spark/pull/25642#issuecomment-527763452

The PR as it is now is one level better than the status quo. That's probably good enough. But I was curious whether it would make more sense to perform tree splitting instead of fixed-level splitting.

Basically, `CodegenContext.splitExpressions` only performs a fixed one-level split, so it splits

```
orig_func() {
  expr1
  expr2
  expr3
  expr4
  expr5
  expr6
}
```

into something like the following, assuming the split threshold is an imaginary `2` expressions:

```
func1() { expr1; expr2 }
func2() { expr3; expr4 }
func3() { expr5; expr6 }
orig_func() {
  func1()
  func2()
  func3()
}
```

Now, given that we assume the split threshold is `2`, the top-level `orig_func` still contains three calls after the split, so it is still above the threshold, which is not good.

Instead, it'd be really nice if the `splitExpressions` utility method could perform tree splitting internally, without a fixed 1- or 2-level depth limit. That would make it more likely that we can cap the codegen method size not only below 64KB but also below lower thresholds such as 8KB.
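
To make the tree-splitting idea concrete, here is a minimal sketch in plain Java. It is not Spark's `CodegenContext` code: the class `TreeSplitSketch`, its `split` method, and the generated `funcN` names are all hypothetical. It also counts statements rather than code size, and it ignores argument passing and return values, which a real implementation would have to handle. It only shows the core loop: keep grouping statements into helper functions until the top-level body fits the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Spark.
class TreeSplitSketch {
    /** Helper functions generated so far, in creation order. */
    final List<String> functions = new ArrayList<>();
    private final int threshold;
    private int counter = 0;

    TreeSplitSketch(int threshold) {
        // With a threshold below 2, grouping would never shrink the body.
        if (threshold < 2) {
            throw new IllegalArgumentException("threshold must be at least 2");
        }
        this.threshold = threshold;
    }

    /**
     * Repeatedly groups statements into helper functions until at most
     * `threshold` statements remain, and returns those remaining statements
     * as the body of the top-level function.
     */
    List<String> split(List<String> statements) {
        List<String> current = statements;
        while (current.size() > threshold) {
            List<String> calls = new ArrayList<>();
            for (int i = 0; i < current.size(); i += threshold) {
                int end = Math.min(i + threshold, current.size());
                List<String> chunk = current.subList(i, end);
                if (chunk.size() == 1) {
                    // A lone leftover statement needs no wrapper function.
                    calls.add(chunk.get(0));
                } else {
                    String name = "func" + (++counter);
                    functions.add(name + "() { " + String.join("; ", chunk) + " }");
                    calls.add(name + "()");
                }
            }
            // The calls become the statements to split at the next level up.
            current = calls;
        }
        return current;
    }

    public static void main(String[] args) {
        TreeSplitSketch sketch = new TreeSplitSketch(2);
        List<String> body = sketch.split(
            List.of("expr1", "expr2", "expr3", "expr4", "expr5", "expr6"));
        sketch.functions.forEach(System.out::println);
        System.out.println("orig_func() { " + String.join("; ", body) + " }");
    }
}
```

With a threshold of `2` and the six expressions from the example, the first pass produces `func1` to `func3` as above, and a second pass wraps `func1()` and `func2()` into `func4`, so the top-level body shrinks to `func4(); func3()` and no function exceeds the threshold.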
