Astha Arya created SPARK-22189:
----------------------------------
Summary: Number of jobs created while querying partitioned table
in hive using spark
Key: SPARK-22189
URL: https://issues.apache.org/jira/browse/SPARK-22189
Project: Spark
Issue Type: Question
Components: SQL
Affects Versions: 1.6.0
Reporter: Astha Arya
I am using Spark SQL
Spark version - 1.6.0
Hive 1.1.0-cdh5.9.0
When I run hiveContext.sql, Spark creates two additional jobs in my case, i.e. 3 jobs in
total, to query a partitioned Hive table. When I run the same query in Hive
using Spark as the execution engine, it launches only one job.
Also, the driver logs show that Spark lists all the partitions of the table, which
most likely should not happen, because it slows down my execution.
Is this a bug? Is there any way to reduce the number of jobs, and also to avoid
listing all the partitions each time I query the same table?
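For reference, the scenario described above is roughly the following (the table
name, partition column, and query are hypothetical placeholders, not taken from
the report; this sketch assumes the Spark 1.6 HiveContext API and requires a
running Spark/Hive deployment):

```scala
// Sketch of the reported scenario (Spark 1.6 API); names are hypothetical.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object PartitionedTableQuery {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PartitionedTableQuery"))
    val hiveContext = new HiveContext(sc)

    // Query a Hive table partitioned by, e.g., `dt`. Per the report, this
    // launches 3 Spark jobs in total, whereas the same query submitted
    // through Hive-on-Spark runs as a single job.
    val df = hiveContext.sql(
      "SELECT * FROM my_partitioned_table WHERE dt = '2017-10-01'")
    df.show()
  }
}
```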
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]