GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/20521

    [SPARK-22977][SQL] fix web UI SQL tab for CTAS

    ## What changes were proposed in this pull request?
    
    This PR fixes a regression in Spark 2.3.
    
    In Spark 2.2, we had fragile UI support for SQL data writing commands. 
We only tracked the input query plan of `FileFormatWriter` and displayed its 
metrics. This is not ideal because we don't know what triggered the write (it 
can be table insertion, CTAS, etc.), but it's still useful to see the metrics 
of the input query.
    
    In Spark 2.3, we introduced a new mechanism, `DataWritingCommand`, to fix 
the UI issue entirely. Now these writing commands have real children, and we 
don't need to hack into `FileFormatWriter` for the UI. This also helps with 
`explain`: it can now show the physical plan of the input query, while 
in 2.2 the physical writing plan is simply `ExecutedCommandExec`, which has no 
child.
    
    However, there is a regression in CTAS. CTAS commands don't extend 
`DataWritingCommand`, and we no longer have the UI hack in `FileFormatWriter`, 
so the UI for CTAS is just an empty node. See 
https://issues.apache.org/jira/browse/SPARK-22977 for more information about 
this UI issue.
    
    To fix it, we should apply the `DataWritingCommand` mechanism to CTAS 
commands.
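    
    The difference between a leaf command and a command with a real child 
can be sketched with a toy plan tree. This is only an illustration, not Spark 
code: `PlanNode`, `tree_string`, and the node labels are hypothetical names 
made up for this sketch, and the real UI and `explain` walk Spark's own plan 
classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for a physical plan node (hypothetical, not a Spark class)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)

    def tree_string(self, depth: int = 0) -> str:
        # Render this node, then every child one indentation level deeper.
        # A tree walker like this can only show what is reachable via children.
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.append(child.tree_string(depth + 1))
        return "\n".join(lines)


# Toy input query of the write (labels are illustrative only).
query = PlanNode("Filter", [PlanNode("FileScan")])

# Leaf-style command: the input query is not a child, so the walker sees one node.
leaf_ctas = PlanNode("ExecutedCommandExec [CTAS]")

# Command with a real child: the input query becomes part of the tree.
ctas_with_child = PlanNode("CTAS as a data writing command", [query])

print(leaf_ctas.tree_string())
print(ctas_with_child.tree_string())
```

    In this sketch the leaf command renders as a single line, which mirrors 
the empty node seen for CTAS, while the command with a child renders the 
input query beneath it.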
    
    TODO: In the future, we should refactor this part and create some 
physical-layer code pieces for data writing, and reuse them in different 
writing commands. We should have different logical nodes for different 
operators, even if some of them share the same logic, e.g. CTAS, CREATE TABLE, 
INSERT TABLE. Internally we can share the same physical logic.
    
    ## How was this patch tested?
    
    Manually tested.
    For a data source table:
    <img width="644" alt="1" 
src="https://user-images.githubusercontent.com/3182036/35874155-bdffab28-0ba6-11e8-94a8-e32e106ba069.png">
    For a Hive table:
    <img width="666" alt="2" 
src="https://user-images.githubusercontent.com/3182036/35874161-c437e2a8-0ba6-11e8-98ed-7930f01432c5.png">
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark UI

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20521
    
----
commit b90c8f3297d11471c6393b91f7ed5c8e52735f7f
Author: Wenchen Fan <wenchen@...>
Date:   2018-02-06T16:59:58Z

    fix web UI SQL tab for CTAS

----


---

