rednaxelafx commented on issue #25642: [SPARK-28916][SQL] Split subexpression 
elimination functions code
URL: https://github.com/apache/spark/pull/25642#issuecomment-527763452
 
 
   The PR as it is now is one level better than the status quo. That's probably 
good enough. But I was curious whether it would make more sense to perform 
tree splitting instead of fixed-level splitting.
   
   Basically, `CodegenContext.splitExpressions` only performs a fixed one-level 
splitting, so it splits
   ```
   orig_func() {
     expr1
     expr2
     expr3
     expr4
     expr5
     expr6
   }
   ```
   into something like the following, assuming the split threshold is an 
imaginary `2` expressions:
   ```
   func1() { expr1; expr2 }
   func2() { expr3; expr4 }
   func3() { expr5; expr6 }
   
   orig_func() {
     func1()
     func2()
     func3()
   }
   ```
   Now, given that we assume the split threshold is `2`, the top-level 
`orig_func` still contains three calls after the split, so it is still above 
the split threshold, which is not good.
   
   Instead, it'd be really nice if the `splitExpressions` utility method could 
perform tree splitting within itself, recursing as deep as needed rather than 
stopping at a fixed 1- or 2-level split depth.
   
   Doing so would make it more likely that we can cap the codegen method size 
not only below 64KB but also below lower thresholds such as 8KB.
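
To make the idea concrete, here is a minimal sketch of tree splitting next to the one-level splitting described above. It is not Spark's `CodegenContext.splitExpressions`: the names `split_one_level` and `split_tree` are hypothetical, the threshold counts statements rather than bytes of generated code, and argument passing and return values, which the real utility has to handle, are ignored.

```python
from typing import List, Tuple

# A generated function is modeled as (name, list of body statements).
Function = Tuple[str, List[str]]


def split_one_level(stmts: List[str], threshold: int,
                    root: str = "orig_func") -> List[Function]:
    """Fixed one-level split: chunk the statements once, then call every chunk."""
    helpers = []
    calls = []
    for i in range(0, len(stmts), threshold):
        name = f"func{len(helpers) + 1}"
        helpers.append((name, stmts[i:i + threshold]))
        calls.append(f"{name}()")
    # The root may still exceed the threshold: it holds one call per chunk.
    return helpers + [(root, calls)]


def split_tree(stmts: List[str], threshold: int,
               root: str = "orig_func") -> List[Function]:
    """Tree split: keep chunking the list of calls until the root fits too."""
    if threshold < 2:
        # With a threshold of 1 a level never shrinks, so the loop would not end.
        raise ValueError("threshold must be at least 2")
    helpers = []
    level = list(stmts)
    while len(level) > threshold:
        next_level = []
        for i in range(0, len(level), threshold):
            name = f"func{len(helpers) + 1}"
            helpers.append((name, level[i:i + threshold]))
            next_level.append(f"{name}()")
        level = next_level  # the calls become the statements of the next level
    return helpers + [(root, level)]


if __name__ == "__main__":
    exprs = [f"expr{i}" for i in range(1, 7)]
    for name, body in split_tree(exprs, threshold=2):
        print(f"{name}() {{ {'; '.join(body)} }}")
```

With six expressions and a threshold of `2`, the one-level version leaves three calls in `orig_func`, while the tree version adds one intermediate level of helper functions so that every body, including the root, stays within the threshold.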
