RE: Running Spark-sql on Hive metastore

2016-01-31 Thread Mich Talebzadeh
From: Xuefu Zhang [mailto:xzh...@cloudera.com] Sent: 01 February 2016 03:05 To: user@hive.ap

Re: Running Spark-sql on Hive metastore

2016-01-31 Thread Xuefu Zhang
For Hive on Spark, there is a startup cost. The second run should be faster. More importantly, it looks like you have 18 map tasks but your cluster only runs two of them at a time. Thus, your cluster effectively has only two-way parallelism. If you configure your cluster to give more capaci
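
To put rough numbers on the parallelism point, here is a minimal sketch of the arithmetic. The 18 tasks and two concurrent slots come from the message; the function name `task_waves` and the six-slot comparison are made up for illustration, and the model assumes all tasks take about the same time, which real map tasks rarely do.

```python
import math


def task_waves(num_tasks: int, concurrent_slots: int) -> int:
    """Return how many sequential waves are needed to run num_tasks
    when at most concurrent_slots tasks can run at the same time."""
    if concurrent_slots < 1:
        raise ValueError("concurrent_slots must be at least 1")
    return math.ceil(num_tasks / concurrent_slots)


# Figures from the message: 18 map tasks, two running at a time.
waves_now = task_waves(18, 2)

# Hypothetical comparison: the same 18 tasks with six concurrent slots.
waves_with_more_slots = task_waves(18, 6)

print(waves_now, waves_with_more_slots)
```

With two slots the stage needs nine back-to-back waves. Under the same assumption, six slots would cut that to three, which is why the available capacity matters more here than the one-time startup cost.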

Running Spark-sql on Hive metastore

2016-01-31 Thread Mich Talebzadeh
Hi,

* Spark 1.5.2 on Hive 1.2.1
* Hive 1.2.1 on Spark 1.3.1
* Oracle Release 11.2.0.1.0
* Hadoop 2.6

I am running spark-sql using the Hive metastore and I am pleasantly surprised by the speed with which Spark performs certain queries on Hive tables. I import