Hi Nicolas, what if the table has partitions and sub-partitions, and you do not want to read the entire data set?
Regards,
Gourav

On Sun, Oct 15, 2017 at 12:55 PM, Nicolas Paris <nipari...@gmail.com> wrote:

> On Oct 3, 2017 at 20:08, Nicolas Paris wrote:
> > I wonder about the differences between accessing Hive tables in two different ways:
> > - with JDBC access
> > - with sparkContext
>
> Well, there is also a third way to access the Hive data from Spark:
> - with direct file access (here ORC format)
>
> For example:
>
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> sqlContext.setConf("spark.sql.orc.filterPushdown", "true")
> val people = sqlContext.read.format("orc").load("hdfs://cluster//orc_people")
> people.createOrReplaceTempView("people")
> sqlContext.sql("SELECT count(1) FROM people WHERE ...").show()
>
> This method looks much faster than both:
> - with JDBC access
> - with sparkContext
>
> Any experience with that?
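
On the partition question, here is a rough sketch of how the direct file access approach quoted above might be limited to part of a partitioned table. It is not taken from the thread and rests on several assumptions: the ORC files sit in Hive-style `key=value` directories under one root (tables with custom partition locations will not be discovered this way), the partition columns `year` and `month` and the temporary path are made up for illustration, and the Spark version reads ORC without Hive support (2.3 or later; on older versions the session would need `enableHiveSupport()`).

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object PartitionedOrcSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-orc-sketch")
      .master("local[*]")
      .config("spark.sql.orc.filterPushdown", "true")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical table root; it stands in for the real HDFS location.
    val root = Files.createTempDirectory("orc_people_demo").resolve("people").toString

    // Write a tiny data set partitioned by year, then month (the sub-partition).
    // This produces directories such as <root>/year=2017/month=10/.
    Seq(("alice", 2017, 9), ("bob", 2017, 10), ("carol", 2016, 10))
      .toDF("name", "year", "month")
      .write.partitionBy("year", "month").orc(root)

    // Option 1: load the table root. Spark discovers year and month from the
    // directory names, and a filter on them lets it skip non-matching directories.
    val people = spark.read.orc(root)
    people.createOrReplaceTempView("people")
    val pruned = spark.sql("SELECT name FROM people WHERE year = 2017 AND month = 10")
    pruned.explain() // the physical plan lists the partition filters that were applied
    pruned.show()

    // Option 2: load a single sub-partition directory. Setting basePath keeps
    // year and month available as columns even though only one directory is read.
    val oneSlice = spark.read.option("basePath", root).orc(s"$root/year=2017/month=10")
    oneSlice.show()

    spark.stop()
  }
}
```

With either option only the matching directories should be scanned, so the whole table is not read. The first keeps the query looking like the one in the quoted message; the second may be handy when the exact sub-partition is known up front.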