GitHub user Achuth17 opened a pull request:

    https://github.com/apache/spark/pull/21608

    [SPARK-24626] [SQL] Improve Analyze Table command

    ## What changes were proposed in this pull request?
    
    Currently, the ANALYZE TABLE command calculates table size sequentially,
    one partition at a time. We can parallelize the size calculation across
    partitions.
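
The idea can be illustrated with a minimal sketch. This is not the code in this pull request and not Spark's API: it uses the local filesystem as a stand-in for the table's storage, and the names `partition_size`, `table_size`, and `max_workers` are hypothetical. Each partition directory is sized in its own task, and the per-partition results are summed.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partition_size(path):
    """Return the total size in bytes of all files under one partition directory."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def table_size(partition_paths, max_workers=8):
    """Sum the partition sizes, sizing the partitions concurrently.

    A thread pool is assumed to be a reasonable fit because listing and
    stat-ing files is I/O-bound; max_workers is an illustrative default.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs partition_size on each path across the worker threads.
        return sum(pool.map(partition_size, partition_paths))
```

With many partitions, the elapsed time is then bounded by the slowest batch of listings rather than by the sum of all listings, provided the storage layer tolerates concurrent requests.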
    
    ## How was this patch tested?
    
    Manual test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Achuth17/spark improveAnalyze

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21608
    
----
commit 700790132bc5b672e07d1dbb3472c50bae519939
Author: arajagopal17 <arajagopal@...>
Date:   2018-06-21T08:21:04Z

    Init changes

commit 82d5ef3f76477bd03fe50a7c3a4c8b6bc6182e13
Author: arajagopal17 <arajagopal@...>
Date:   2018-06-21T22:05:58Z

    Removing logs.

----

