GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/21140
[SPARK-22600][SQL][WIP] Fix 64kb limit for deeply nested expressions under wholestage codegen

## What changes were proposed in this pull request?

SPARK-22543 fixes the 64kb compile error for deeply nested expressions in non-wholestage codegen. This PR extends that fix to support wholestage codegen.

When a deeply nested expression is split out into a function, this patch extracts the parameters that the function needs.

TODO: In the future, this should be extended to `splitExpressions` too, so that it automatically extracts the current inputs and puts them into the parameter list.

WIP: This PR is still work in progress. It ports the previous changes from #19813 to the latest codebase, and it will implement the proposal at https://github.com/apache/spark/pull/19813#issuecomment-354045400 to overcome the limitation of the previous PR.

## How was this patch tested?

Covered by newly added tests and existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-22600-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21140.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21140

----
commit 34abc2284be485c12720437e969cf41394dfc2b5
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-24T02:16:47Z

    Support wholestage codegen for reducing expression codes to prevent 64k limit.

commit e0d111e643412c9ca5a53471431643485852ee54
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-25T07:29:24Z

    Merge remote-tracking branch 'upstream/master' into reduce-expr-code-for-wholestage

commit 65d07d525344e1d00457d2f538b2ef0b1c38a8e8
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-25T08:09:22Z

    Assert the added test is under wholestage codegen.

commit 9f848be45dcc294d6f27f2c6eaeed1907d36f004
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-27T07:37:02Z

    Put input rows and evaluated columns referred by deferred expressions into parameter list.

commit 57b1add4df4648862c76165f8ae10cc487af1221
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-27T08:53:42Z

    Revert unnecessary changes.

commit d051f9eef4d03f9027571419857f690c866dbd98
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T02:35:16Z

    Fix subexpression isNull for non nullable case. Fix columnar batch scan's rowIdx.

commit 6368702e66948e26c41300da7136dffc5b963cb6
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T03:12:22Z

    Let rowidx as global variable instead of early evaluation of column output.

commit 8c7f7496e610fdf4b512c57efd108ccf0238b126
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T14:55:46Z

    Fix the problematic case.

commit 7f005158b7b10fb2dc4db3ed15181e68ae33348f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-29T15:52:55Z

    Fix duplicate parameters.

commit 777eb7a0c4db6695ee993be7b5d3b2d40c161591
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T02:00:24Z

    Address comments.

commit 7230997a54babaf62846ab538bb6756b3938d832
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T04:11:18Z

    Polish the patch.

commit fd87e9ba324e0b45685e7873884a4fa7a6feaf17
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T07:48:36Z

    Add test for new APIs.

commit 57a9fb77d7628e8a5815b8571ca9c99490419252
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T08:03:09Z

    Generate function parameters if needed.

commit 0d358d635494199582aa6e38fdbeec0f6446c029
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-01T03:37:18Z

    Address comments.

commit aa3db2edca66ab04ecb8fbd54750cbd46544eb1d
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-01T04:58:42Z

    Address comments.

commit 429afbabef6f718870ca3c6caf0712a1e459681f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-04T09:02:09Z

    Rename variable.

commit 48add652f2df45ce6506f9464c10a6425bd92214
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-05T14:53:11Z

    Address comments.

commit 9443011978c32c611e950a6193f05aa666437f50
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-08T03:41:15Z

    Address comments.

commit 2f4014fe7de0ae634231a5aae36e7272defa3d9e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-11T14:53:06Z

    Address comments again.

commit 655917cadf86ab17b8a730f282db544cb348d63f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T00:21:14Z

    Remove redundant optimization.

commit c083a7955cd6fb54e0448176d9684496fae48e6f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T00:41:37Z

    Use utility method.

commit 1251dfa305f4f1f8e34d7deb235bfa500d057fb4
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T07:56:49Z

    Address comments.

commit c4f15f79f42350ae62ef7452a880cd4ada9ab275
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T08:09:22Z

    Move isLiteral and isEvaluated into ExpressionCodegen.

commit f35974e1dfb47387dc952d30a55eee0354bdea63
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T14:07:22Z

    Remove useless isLiteral and isEvaluted. Add one more test.

commit e4130436b560d7aa2201d6987b8efc9758161a8e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-04-24T09:56:29Z

    Merge remote-tracking branch 'upstream/master' into reduce-expr-code-for-wholestage

----
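
As a rough illustration of the parameter-extraction idea in the description above (split a nested expression's generated code into its own function and pass in only the inputs that code refers to), here is a minimal Python sketch that emits Java-like source strings. It is not Spark's implementation. `ExprCode`, `referenced_inputs`, `split_into_function`, and the variable names in the example are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class ExprCode:
    """Hypothetical holder for the generated Java snippet of one expression."""
    code: str       # Java statements that compute the expression
    value: str      # name of the Java variable that holds the result
    java_type: str  # Java type of the result


def referenced_inputs(code, available):
    """Return the (type, name) pairs from `available` whose name occurs in `code`.

    Order follows `available`, and each parameter is listed only once.
    """
    params = []
    for java_type, name in available:
        used = re.search(r"\b" + re.escape(name) + r"\b", code) is not None
        if used and (java_type, name) not in params:
            params.append((java_type, name))
    return params


def split_into_function(expr, available, func_name):
    """Wrap `expr.code` in a private Java method and build the matching call.

    Only the inputs the snippet actually references become parameters.
    """
    params = referenced_inputs(expr.code, available)
    decl = ", ".join(f"{t} {n}" for t, n in params)
    args = ", ".join(n for _, n in params)
    function = (
        f"private {expr.java_type} {func_name}({decl}) {{\n"
        f"  {expr.code}\n"
        f"  return {expr.value};\n"
        f"}}"
    )
    call = f"{expr.java_type} {expr.value} = {func_name}({args});"
    return function, call


# Assumed example inputs: one input row and three already-evaluated columns.
available = [
    ("InternalRow", "inputRow"),
    ("int", "col_a"),
    ("int", "col_b"),
    ("int", "col_c"),
]
expr = ExprCode(code="int value_1 = col_a + col_b * 2;",
                value="value_1", java_type="int")
function, call = split_into_function(expr, available, "nestedExpr_0")
print(function)
print(call)
```

In this sketch the generated method receives only `col_a` and `col_b`, because `inputRow` and `col_c` do not appear in the snippet. A real code generator would track variable references structurally rather than by text matching, and would also have to respect the JVM's limits on method parameters.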