Matthew Taylor created SPARK-5923:
-------------------------------------

             Summary: Very slow query when using Oracle hive metastore and table has lots of partitions
                 Key: SPARK-5923
                 URL: https://issues.apache.org/jira/browse/SPARK-5923
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Matthew Taylor

This has two aspects:

* The direct SQL support for Oracle is broken in Hive 0.13.1. It fails when the number of partitions exceeds 1000, due to the Oracle limit on the number of items in an IN clause. This causes a fallback to the ORM, which is very slow (20 minutes to even start the query).
* Hive itself does not suffer from this problem, because it passes filter terms down to the metadata query to restrict the partitions returned. SparkSQL always asks for all partitions, even if they are not all needed. Even when we patched Hive, the query was still taking 2 minutes.
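
To illustrate the first aspect: Oracle rejects an IN list with more than 1000 expressions (ORA-01795). A common workaround is to split the list into several IN groups joined by OR. The Python sketch below shows that idea only; it is not the patch applied to Hive, and the function names, the `PART_ID` column used in the usage line, and the `?` placeholder style are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Oracle allows at most 1000 expressions in one IN list (ORA-01795).
ORACLE_MAX_IN_LIST = 1000


def chunked(items: Sequence[int], size: int) -> List[Sequence[int]]:
    """Split items into consecutive slices of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_in_predicate(
    column: str,
    ids: Sequence[int],
    max_in: int = ORACLE_MAX_IN_LIST,
) -> Tuple[str, List[int]]:
    """Return (sql_fragment, bind_values) where no IN list exceeds max_in.

    Hypothetical helper: the fragment is meant to be placed in a WHERE
    clause and executed with the returned bind values.
    """
    if not ids:
        # An empty id set should match no rows.
        return "1 = 0", []
    groups = []
    for chunk in chunked(ids, max_in):
        placeholders = ", ".join("?" for _ in chunk)
        groups.append(f"{column} IN ({placeholders})")
    # Parenthesise so the OR chain composes safely with other AND terms.
    return "(" + " OR ".join(groups) + ")", list(ids)


if __name__ == "__main__":
    sql, binds = build_in_predicate("PART_ID", list(range(2500)))
    print(sql.count(" IN ("), "IN groups,", len(binds), "bind values")
```

Chunking only avoids the IN-list error. It does not address the second aspect, where all partitions are requested even though the query needs only some of them.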