GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/21140

    [SPARK-22600][SQL][WIP] Fix 64kb limit for deeply nested expressions under 
wholestage codegen

    ## What changes were proposed in this pull request?
    
    SPARK-22543 fixes the 64kb compile error for deeply nested expressions 
under non-wholestage codegen. This PR extends that fix to wholestage codegen.
    
    This patch extracts necessary parameters for a deeply nested expression 
when it is split into a function.
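
The parameter extraction described above can be sketched roughly as follows. This is a hypothetical, simplified illustration and not the code in this PR: `SplitSketch`, `Expr`, `Input`, `Add`, and `splitIntoFunction` are made-up names, and every input is assumed to be an `int`. It only shows the idea that a split-out function receives exactly the inputs the expression reads, each listed once.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch; none of these names come from Spark. */
public class SplitSketch {

    /** An expression that can emit Java source and report the inputs it reads. */
    interface Expr {
        String code();
        void collectInputs(Set<String> out);
    }

    /** A reference to an already-evaluated input variable, e.g. a column value. */
    static final class Input implements Expr {
        final String name;
        Input(String name) { this.name = name; }
        public String code() { return name; }
        public void collectInputs(Set<String> out) { out.add(name); }
    }

    /** Integer addition of two sub-expressions. */
    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public String code() { return "(" + left.code() + " + " + right.code() + ")"; }
        public void collectInputs(Set<String> out) {
            left.collectInputs(out);
            right.collectInputs(out);
        }
    }

    /** Emits a helper method whose parameters are the inputs the expression reads. */
    static String splitIntoFunction(String funcName, Expr expr) {
        // LinkedHashSet keeps first-seen order and drops duplicate inputs.
        Set<String> inputs = new LinkedHashSet<>();
        expr.collectInputs(inputs);

        StringBuilder params = new StringBuilder();
        for (String in : inputs) {
            if (params.length() > 0) {
                params.append(", ");
            }
            // Assumption for this sketch: every input has type int.
            params.append("int ").append(in);
        }
        return "private int " + funcName + "(" + params + ") { return " + expr.code() + "; }";
    }
}
```

For an expression that reads `a` twice and `b` once, the emitted helper takes `a` and `b` as parameters, with `a` listed only once.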
    
    TODO: In the future, this should also be extended to `splitExpressions`, 
so that it automatically extracts the current inputs and puts them into the 
parameter list.
    
    WIP: This PR is still a work in progress. It brings the earlier changes 
from #19813 up to date with the latest codebase, and it will implement the 
proposal in 
https://github.com/apache/spark/pull/19813#issuecomment-354045400 to overcome 
the limitation of the previous PR.
    
    ## How was this patch tested?
    
    Added new tests; the change is also covered by existing tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-22600-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21140.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21140
    
----
commit 34abc2284be485c12720437e969cf41394dfc2b5
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-24T02:16:47Z

    Support wholestage codegen for reducing expression codes to prevent 64k 
limit.

commit e0d111e643412c9ca5a53471431643485852ee54
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-25T07:29:24Z

    Merge remote-tracking branch 'upstream/master' into 
reduce-expr-code-for-wholestage

commit 65d07d525344e1d00457d2f538b2ef0b1c38a8e8
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-25T08:09:22Z

    Assert the added test is under wholestage codegen.

commit 9f848be45dcc294d6f27f2c6eaeed1907d36f004
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-27T07:37:02Z

    Put input rows and evaluated columns referred by deferred expressions into 
parameter list.

commit 57b1add4df4648862c76165f8ae10cc487af1221
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-27T08:53:42Z

    Revert unnecessary changes.

commit d051f9eef4d03f9027571419857f690c866dbd98
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T02:35:16Z

    Fix subexpression isNull for non nullable case. Fix columnar batch scan's 
rowIdx.

commit 6368702e66948e26c41300da7136dffc5b963cb6
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T03:12:22Z

    Let rowidx as global variable instead of early evaluation of column output.

commit 8c7f7496e610fdf4b512c57efd108ccf0238b126
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-28T14:55:46Z

    Fix the problematic case.

commit 7f005158b7b10fb2dc4db3ed15181e68ae33348f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-29T15:52:55Z

    Fix duplicate parameters.

commit 777eb7a0c4db6695ee993be7b5d3b2d40c161591
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T02:00:24Z

    Address comments.

commit 7230997a54babaf62846ab538bb6756b3938d832
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T04:11:18Z

    Polish the patch.

commit fd87e9ba324e0b45685e7873884a4fa7a6feaf17
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T07:48:36Z

    Add test for new APIs.

commit 57a9fb77d7628e8a5815b8571ca9c99490419252
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-11-30T08:03:09Z

    Generate function parameters if needed.

commit 0d358d635494199582aa6e38fdbeec0f6446c029
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-01T03:37:18Z

    Address comments.

commit aa3db2edca66ab04ecb8fbd54750cbd46544eb1d
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-01T04:58:42Z

    Address comments.

commit 429afbabef6f718870ca3c6caf0712a1e459681f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-04T09:02:09Z

    Rename variable.

commit 48add652f2df45ce6506f9464c10a6425bd92214
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-05T14:53:11Z

    Address comments.

commit 9443011978c32c611e950a6193f05aa666437f50
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-08T03:41:15Z

    Address comments.

commit 2f4014fe7de0ae634231a5aae36e7272defa3d9e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-11T14:53:06Z

    Address comments again.

commit 655917cadf86ab17b8a730f282db544cb348d63f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T00:21:14Z

    Remove redundant optimization.

commit c083a7955cd6fb54e0448176d9684496fae48e6f
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T00:41:37Z

    Use utility method.

commit 1251dfa305f4f1f8e34d7deb235bfa500d057fb4
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T07:56:49Z

    Address comments.

commit c4f15f79f42350ae62ef7452a880cd4ada9ab275
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T08:09:22Z

    Move isLiteral and isEvaluated into ExpressionCodegen.

commit f35974e1dfb47387dc952d30a55eee0354bdea63
Author: Liang-Chi Hsieh <viirya@...>
Date:   2017-12-12T14:07:22Z

    Remove useless isLiteral and isEvaluted. Add one more test.

commit e4130436b560d7aa2201d6987b8efc9758161a8e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-04-24T09:56:29Z

    Merge remote-tracking branch 'upstream/master' into 
reduce-expr-code-for-wholestage

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
