[
https://issues.apache.org/jira/browse/SPARK-20487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiao Li resolved SPARK-20487.
-----------------------------
Resolution: Fixed
Assignee: Tejas Patil
Fix Version/s: 2.2.0
> `HiveTableScan` node is quite verbose in explained plan
> -------------------------------------------------------
>
> Key: SPARK-20487
> URL: https://issues.apache.org/jira/browse/SPARK-20487
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Tejas Patil
> Assignee: Tejas Patil
> Priority: Trivial
> Fix For: 2.2.0
>
>
> For Hive tables, `explain()` prints the full `CatalogTable` metadata for every
> relation in each plan stage. This makes the plan hard to read, especially for
> large SQL queries that reference many tables. For example:
> {noformat}
> scala> hc.sql(" SELECT * FROM my_table WHERE name = 'foo' ").explain(true)
> == Parsed Logical Plan ==
> 'Project [*]
> +- 'Filter ('name = foo)
>    +- 'UnresolvedRelation `my_table`
> == Analyzed Logical Plan ==
> user_id: bigint, name: string, ds: string
> Project [user_id#13L, name#14, ds#15]
> +- Filter (name#14 = foo)
>    +- SubqueryAlias my_table
>       +- CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Optimized Logical Plan ==
> Filter (isnotnull(name#14) && (name#14 = foo))
> +- CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> == Physical Plan ==
> *Filter (isnotnull(name#14) && (name#14 = foo))
> +- HiveTableScan [user_id#13L, name#14, ds#15], CatalogRelation CatalogTable(
> Database: default
> Table: my_table
> Owner: tejasp
> Created: Fri Apr 14 17:05:50 PDT 2017
> Last Access: Wed Dec 31 16:00:00 PST 1969
> Type: MANAGED
> Provider: hive
> Properties: [serialization.format=1]
> Statistics: 9223372036854775807 bytes
> Location: file:/tmp/warehouse/my_table
> Serde Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat: org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Partition Provider: Catalog
> Partition Columns: [`ds`]
> Schema: root
> -- user_id: long (nullable = true)
> -- name: string (nullable = true)
> -- ds: string (nullable = true)
> ), [user_id#13L, name#14], [ds#15]
> {noformat}
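
To illustrate the kind of change the description asks for, here is a minimal sketch of a plan node that prints a one-line table identifier instead of the full metadata dump. It is not the actual Spark patch: `TableMeta`, `ScanNode`, `verboseString`, `simpleString` and `describe` are hypothetical names invented for this example, and the field values are copied from the plan output above.

```scala
// Hypothetical stand-in for table metadata; this is NOT Spark's CatalogTable.
final case class TableMeta(
    database: String,
    table: String,
    owner: String,
    location: String,
    partitionColumns: Seq[String]) {

  // Full multi-line dump, similar in spirit to what the plans above print.
  def verboseString: String =
    Seq(
      s"Database: $database",
      s"Table: $table",
      s"Owner: $owner",
      s"Location: $location",
      s"Partition Columns: ${partitionColumns.mkString("[", ", ", "]")}"
    ).mkString("TableMeta(\n  ", "\n  ", "\n)")

  // Compact one-line form: only the qualified table name.
  def simpleString: String = s"`$database`.`$table`"
}

// Hypothetical scan node that renders itself with the compact form only.
final case class ScanNode(output: Seq[String], meta: TableMeta) {
  def describe: String =
    s"HiveTableScan ${output.mkString("[", ", ", "]")}, ${meta.simpleString}"
}

val meta = TableMeta(
  database = "default",
  table = "my_table",
  owner = "tejasp",
  location = "file:/tmp/warehouse/my_table",
  partitionColumns = Seq("ds"))

val node = ScanNode(Seq("user_id#13L", "name#14", "ds#15"), meta)

// The node description stays on a single line regardless of how much
// metadata the table carries.
println(node.describe)
```

The point of the sketch is the split between a verbose rendering, kept available for debugging, and a compact rendering used when the node appears inside a plan tree.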