GitHub user DaveDeCaprio opened a pull request:

    https://github.com/apache/spark/pull/23169

    [SPARK-26103][SQL] Limit the length of debug strings for query plans

    ## What changes were proposed in this pull request?
    
    The PR puts a limit on the size of the debug string generated for a tree 
node.  This helps to fix out-of-memory errors when large plans have huge debug 
strings.  In addition to SPARK-26103, this should also address SPARK-23904 and 
SPARK-25380.  An alternative solution was proposed in #23076, but that solution 
doesn't address all the cases that can cause a large query.  The limit applies 
only to calls to treeString that don't pass a Writer, which makes it play nicely 
with #22429, #23018 and #23039.  Full plans can be written to files, but truncated 
plans will be used when strings are held in memory, such as for the UI.
    
    - A new configuration parameter, spark.sql.debug.maxPlanLength, was 
added to control the maximum length of the plans.
    - When a plan is truncated, "..." is printed to indicate that it isn't the 
full plan.
    - A warning is printed the first time a truncated plan is displayed. 
The warning explains what happened and how to adjust the limit.
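
The behavior described above can be illustrated with a minimal sketch. This is not the code from the patch: only the name SizeLimitedWriter and the configuration key come from this description, while the constructor signature, the `isTruncated` accessor, and the `PlanStrings.limited` helper are assumptions made for illustration.

```scala
import java.io.{StringWriter, Writer}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical sketch: forwards at most `maxChars` characters to `underlying`
// and drops everything after that, remembering that something was dropped.
class SizeLimitedWriter(underlying: Writer, maxChars: Long) extends Writer {
  private var written = 0L
  private var truncated = false

  def isTruncated: Boolean = truncated

  override def write(cbuf: Array[Char], off: Int, len: Int): Unit = {
    val remaining = math.max(0L, maxChars - written)
    val toWrite = math.min(len.toLong, remaining).toInt
    if (toWrite > 0) {
      underlying.write(cbuf, off, toWrite)
      written += toWrite
    }
    if (toWrite < len) truncated = true
  }

  override def flush(): Unit = underlying.flush()
  override def close(): Unit = underlying.close()
}

// Hypothetical helper, not part of Spark's API.
object PlanStrings {
  private val warned = new AtomicBoolean(false)

  // Runs `render` against a size-limited writer and appends "..." when the
  // output was cut off. A warning is emitted only on the first truncation.
  def limited(maxChars: Long)(render: Writer => Unit): String = {
    val out = new StringWriter()
    val writer = new SizeLimitedWriter(out, maxChars)
    render(writer)
    writer.flush()
    if (writer.isTruncated) {
      if (warned.compareAndSet(false, true)) {
        System.err.println(
          s"Plan string truncated to $maxChars characters; " +
            "raise spark.sql.debug.maxPlanLength to see more.")
      }
      out.toString + "..."
    } else {
      out.toString
    }
  }
}
```

With this sketch, `PlanStrings.limited(5)(_.write("abcdefghij"))` yields a five-character prefix followed by "...", while a string shorter than the limit is returned unchanged. Callers that pass their own Writer (for example, to write a full plan to a file) would bypass the limit entirely, which matches the split described above.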
    
    ## How was this patch tested?
    
    Unit tests were created for the new SizeLimitedWriter.  A unit test for 
TreeNode was also created to check that a long plan is correctly truncated.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DaveDeCaprio/spark text-plan-size

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23169.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23169
    
----
commit 22bd4bddcf4f80d521a27643840f4a3536dac0f3
Author: David DeCaprio <daved@...>
Date:   2018-11-27T19:18:11Z

    Merge pull request #1 from apache/master
    
    merge in spark

commit b7f964d119b5d0ea40896bb86b0110688d8330a8
Author: Dave DeCaprio <daved@...>
Date:   2018-11-28T17:46:14Z

    Added a configurable limit on the maximum length of a plan debug string.

----