Matthew Taylor created SPARK-5923:
-------------------------------------

             Summary: Very slow query when using Oracle Hive metastore and 
table has lots of partitions
                 Key: SPARK-5923
                 URL: https://issues.apache.org/jira/browse/SPARK-5923
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Matthew Taylor


This has two aspects:
* The direct SQL support for Oracle is broken in Hive 0.13.1. It fails when a 
table has more than 1000 partitions, due to the Oracle limit on the number of 
items in an IN clause. This causes a fallback to the ORM path, which is very 
slow (20 minutes to even start the query).
* Hive itself does not suffer from this problem, because it passes filter terms 
down to the metadata query to restrict the partitions returned. SparkSQL always 
asks for all partitions, even if they are not all needed. Even when we patched 
Hive, it was still taking 2 minutes.
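
To illustrate the first aspect: Oracle rejects an IN list with more than 1000 
expressions (ORA-01795), so a query that lists every partition id in one IN 
clause fails once a table passes that count. A common workaround is to split 
the list into OR-ed groups of at most 1000 items each. The sketch below shows 
that idea only; it is not the patch we applied to Hive, and the function name, 
the PART_ID column and the "?" bind placeholder are assumptions chosen for the 
example.

```python
ORACLE_IN_LIMIT = 1000  # Oracle allows at most 1000 expressions per IN list


def build_in_predicate(column, values, limit=ORACLE_IN_LIMIT):
    """Return (sql, params) for `column IN (...)`, split into OR-ed groups.

    Hypothetical helper: each group holds at most `limit` bind placeholders,
    so no single IN list exceeds the Oracle limit.
    """
    values = list(values)
    if not values:
        # An empty IN list is invalid SQL; use a predicate that is always false.
        return "1 = 0", []
    groups = []
    for start in range(0, len(values), limit):
        size = min(limit, len(values) - start)
        placeholders = ", ".join("?" for _ in range(size))
        groups.append(f"{column} IN ({placeholders})")
    # Parenthesize so the OR-ed groups combine safely with other conditions.
    return "(" + " OR ".join(groups) + ")", values


# 2500 partition ids become three IN groups (1000 + 1000 + 500).
sql, params = build_in_predicate("PART_ID", range(2500))
```

Chunking only avoids the hard failure. It does not address the second aspect, 
where all partitions are requested even when a filter could exclude most of 
them.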




