[ https://issues.apache.org/jira/browse/SPARK-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-14070.
--------------------------------------
    Resolution: Fixed
    Fix Version/s: 2.0.0

Issue resolved by pull request 11891
[https://github.com/apache/spark/pull/11891]

> Use ORC data source for SQL queries on ORC tables
> -------------------------------------------------
>
>                 Key: SPARK-14070
>                 URL: https://issues.apache.org/jira/browse/SPARK-14070
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Tejas Patil
>            Assignee: Tejas Patil
>            Priority: Minor
>             Fix For: 2.0.0
>
> Currently, when querying ORC tables in Hive, the plan generated by Spark
> shows that it uses the `HiveTableScan` operator, which is generic to
> all file formats. We could instead use the ORC data source here, so that
> we get ORC-specific optimizations like predicate pushdown.
> Current behaviour:
> ```
> scala> hqlContext.sql("SELECT * FROM orc_table").explain(true)
> == Parsed Logical Plan ==
> 'Project [unresolvedalias(*, None)]
> +- 'UnresolvedRelation `orc_table`, None
> == Analyzed Logical Plan ==
> key: string, value: string
> Project [key#171,value#172]
> +- MetastoreRelation default, orc_table, None
> == Optimized Logical Plan ==
> MetastoreRelation default, orc_table, None
> == Physical Plan ==
> HiveTableScan [key#171,value#172], MetastoreRelation default, orc_table, None
> ```

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
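For context, a minimal sketch of how the converted path can be exercised from a Spark 2.0 application, assuming a Hive-enabled `SparkSession` and the `orc_table` from the example above. The `spark.sql.hive.convertMetastoreOrc` setting is the flag that governs whether metastore ORC tables are read through the native ORC data source; the table name and filter value here are illustrative only:

```scala
// Sketch, not the exact patch from PR 11891. Assumes Spark 2.0 with
// Hive support compiled in, and an existing Hive table `orc_table`
// stored as ORC (as in the plan shown above).
import org.apache.spark.sql.SparkSession

object OrcDataSourceExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orc-datasource-example")
      // Read metastore ORC tables via the native ORC data source,
      // enabling ORC-specific optimizations such as predicate pushdown.
      .config("spark.sql.hive.convertMetastoreOrc", "true")
      .enableHiveSupport()
      .getOrCreate()

    // With the conversion enabled, the physical plan should show a scan
    // over the ORC relation (with pushed filters for the predicate)
    // rather than the generic HiveTableScan operator.
    spark.sql("SELECT * FROM orc_table WHERE key = '238'").explain(true)

    spark.stop()
  }
}
```

Note that predicate pushdown only pays off when the `WHERE` clause can be translated into ORC filters; a bare `SELECT *` with no predicate, as in the original report, still scans every stripe.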