dilipbiswal commented on a change in pull request #24385: [SPARK-27480][SQL]
Improve `EXPLAIN DESC QUERY` to show the input SQL statement
URL: https://github.com/apache/spark/pull/24385#discussion_r277099053
##########
File path: sql/core/src/test/resources/sql-tests/results/describe-query.sql.out
##########
@@ -154,16 +154,46 @@ DESCRIBE
-- !query 15
-DROP TABLE desc_temp1
+EXPLAIN DESC QUERY SELECT * FROM desc_temp2 WHERE key > 0
-- !query 15 schema
-struct<>
+struct<plan:string>
-- !query 15 output
-
+== Physical Plan ==
+Execute DescribeQueryCommand
+ +- DescribeQueryCommand SELECT * FROM desc_temp2 WHERE key > 0
-- !query 16
-DROP TABLE desc_temp2
+EXPLAIN EXTENDED DESC WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
-- !query 16 schema
-struct<>
+struct<plan:string>
-- !query 16 output
+== Parsed Logical Plan ==
+DescribeQueryCommand WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
+
+== Analyzed Logical Plan ==
+col_name: string, data_type: string, comment: string
+DescribeQueryCommand WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
+
+== Optimized Logical Plan ==
+DescribeQueryCommand WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
+
+== Physical Plan ==
+Execute DescribeQueryCommand
+ +- DescribeQueryCommand WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
Review comment:
@dongjoon-hyun I thought about this some more. I think it is ambiguous
whether the "surrounding" comments are part of the EXPLAIN statement or of
the query. To check this, I tried the following:
```sql
spark-sql> create view vtest as /* leading */ select /* embeded */ 1 FROM VALUES(1) /* trailing */;
spark-sql> show table extended like 'vtest';
default vtest false Database: default
Table: vtest
Owner: dbiswal
Created Time: Fri Apr 19 14:36:32 PDT 2019
Last Access: Wed Dec 31 16:00:00 PST 1969
Created By: Spark 3.0.0-SNAPSHOT
Type: VIEW
View Text: select /* embeded */ 1 FROM VALUES(1)
View Original Text: select /* embeded */ 1 FROM VALUES(1)
View Default Database: default
View Query Output Columns: [1]
Table Properties: [transient_lastDdlTime=1555709792, view.query.out.col.0=1,
view.query.out.numCols=1, view.default.database=default]
Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.SequenceFileInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Storage Properties: [serialization.format=1]
Schema: root
|-- 1: integer (nullable = false)
```
As we can see, the stored view text ("View Text" and "View Original Text")
does not contain the surrounding comments, only the embedded one. Given this,
I think we would be okay dropping the surrounding comments and preserving the
embedded comments. What do you think?
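
To make the proposed behavior concrete, here is a minimal Python sketch of "drop surrounding comments, keep embedded ones". It is not how Spark does it; `strip_surrounding_comments` is a hypothetical helper written only for illustration, and it assumes non-nested `/* ... */` block comments.

```python
def strip_surrounding_comments(sql: str) -> str:
    """Hypothetical helper (not a Spark API): remove block comments that sit
    entirely before or after the query text, leaving embedded ones intact.
    Assumes block comments are not nested."""
    text = sql.strip()

    # Peel off leading /* ... */ comments one at a time.
    while text.startswith("/*"):
        end = text.find("*/", 2)  # search past the opening "/*"
        if end == -1:
            break  # unterminated comment: leave the text alone
        text = text[end + 2:].lstrip()

    # Peel off trailing /* ... */ comments one at a time.
    while text.endswith("*/"):
        start = text.rfind("/*")
        if start == -1:
            break  # stray "*/" without an opener: leave the text alone
        text = text[:start].rstrip()

    return text


query = "/* leading */ select /* embeded */ 1 FROM VALUES(1) /* trailing */"
print(strip_surrounding_comments(query))
```

For the view definition used above, this yields the same text that `show table extended` reports as the view text, with the embedded comment kept.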