To add to what Vikash said above, a bit more on the internals:

1. There are two components which work together to achieve Hive + Spark integration:
   a. HiveContext, which extends SQLContext and adds Hive-specific logic, e.g. loading the jars needed to talk to the underlying metastore db and loading the configs in hive-site.xml.
   b. HiveThriftServer2, which wraps the native HiveServer2 and adds logic for creating sessions and handling operations.
2. Once the thrift server is up, authentication and session management are all delegated to the Hive classes. Once the query is parsed, a logical plan is created and used to build a DataFrame.
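The two-component split above can be sketched in plain Python. This is a purely conceptual illustration: the class names mirror the real Spark/Hive components, but the bodies are stand-ins, not the actual Spark APIs.

```python
# Conceptual sketch of the HiveContext / HiveThriftServer2 split.
# HiveContext extends SQLContext, layering on metastore configuration;
# the thrift server wraps a context and manages client sessions.

class SQLContext:
    """Base query entry point: parses SQL into a logical plan."""
    def __init__(self):
        self.conf = {}

    def sql(self, query):
        # In real Spark, parsing produces a logical plan that becomes
        # a DataFrame; here we just record a placeholder plan.
        return {"logical_plan": f"Parse({query})"}

class HiveContext(SQLContext):
    """Adds Hive specifics: metastore connectivity and hive-site.xml configs."""
    def __init__(self, hive_site):
        super().__init__()
        # Stand-in for loading hive-site.xml and metastore client jars.
        self.conf.update(hive_site)

class HiveThriftServer2:
    """Wraps a context; session handling is delegated, as in point 2 above."""
    def __init__(self, context):
        self.context = context
        self.sessions = {}

    def open_session(self, user):
        self.sessions[user] = []
        return user

    def execute(self, session, query):
        plan = self.context.sql(query)
        self.sessions[session].append(plan)
        return plan

hc = HiveContext({"hive.metastore.uris": "thrift://metastore:9083"})
server = HiveThriftServer2(hc)
s = server.open_session("alice")
print(server.execute(s, "SELECT 1")["logical_plan"])  # Parse(SELECT 1)
```

The point of the sketch is the inheritance shape: HiveContext reuses all of SQLContext's planning path and only adds metastore wiring on top.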
So no MapReduce: Spark uses just the pieces it needs from Hive and runs the query on its own execution engine.

--
Regards,
Lalit

On Wed, Jun 8, 2016 at 9:59 PM, Vikash Pareek <vikash.par...@infoobjects.com> wrote:
> Himanshu,
>
> Spark doesn't use the Hive execution engine (MapReduce) to execute the query.
> Spark only reads the metadata from the Hive metastore db and executes the
> query within the Spark execution engine. This metadata is used by Spark's own
> SQL execution engine (which includes components such as Catalyst and Tungsten
> to optimize queries) to execute the query and generate results faster than
> Hive (MapReduce).
>
> Using HiveContext means connecting to the Hive metastore db. Thus, HiveContext
> can access the Hive metadata, which includes the location of the data,
> serialization and deserialization, compression codecs, columns, datatypes,
> etc. Spark therefore has enough information about the Hive tables and their
> data to understand the target data and execute the query over its own
> execution engine.
>
> Overall, Spark replaces the MapReduce model completely with its
> in-memory (RDD) computation engine.
>
> - Vikash Pareek
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/When-queried-through-hiveContext-does-hive-executes-these-queries-using-its-execution-engine-default-tp27114p27117.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
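The metadata-versus-execution split described in the quoted message can also be sketched conceptually. Everything here is illustrative: the in-memory "metastore" dict and the function names are invented for the example, not real Spark or Hive APIs.

```python
# Conceptual sketch: Spark reads only *metadata* from the Hive metastore
# (location, columns, serde info), then scans and computes with its own
# in-memory engine -- no MapReduce job is launched.

METASTORE = {
    "sales": {
        "location": "/warehouse/sales",
        "columns": ["region", "amount"],
        # Stand-in for the rows the files at `location` would contain.
        "rows": [("eu", 10), ("us", 25), ("eu", 5)],
    }
}

def lookup_table(name):
    """Metastore's role: serve table metadata on request."""
    return METASTORE[name]

def spark_execute(table_name, predicate):
    """Engine's role: filter and aggregate in memory using the metadata."""
    meta = lookup_table(table_name)
    return sum(amount for region, amount in meta["rows"] if predicate(region))

total = spark_execute("sales", lambda r: r == "eu")
print(total)  # 15
```

The design point is that the metastore is consulted only for *where and how* the data lives; the actual scan, filter, and aggregation all happen inside the caller's own engine.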