[ https://issues.apache.org/jira/browse/HIVE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-7370:
------------------------------
    Attachment:     (was: spark_1.0.0.patch)

> Initial ground work for Hive on Spark [Spark branch]
> ----------------------------------------------------
>
>                 Key: HIVE-7370
>                 URL: https://issues.apache.org/jira/browse/HIVE-7370
>             Project: Hive
>          Issue Type: Task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-7370.patch, spark_1.0.0.patch
>
>
> Contribute PoC code to Hive on Spark as the groundwork for subsequent tasks.
> While it contains hacks and badly organized code, it will change, and more
> importantly it allows multiple people to work on different components
> concurrently.
> With this, simple queries such as "select col from tab where ..." and "select
> grp, avg(val) from tab where ... group by grp" can be executed on Spark.
> Contents of the patch:
> 1. Code path for an additional execution engine.
> 2. Essential classes such as SparkWork, SparkTask, SparkCompiler,
> HiveMapFunction, HiveReduceFunction, SparkClient, etc.
> 3. Some code changes to existing classes.
> 4. Build infrastructure.
> 5. Utility classes.
> To try running Hive on Spark, for now you need to:
> 1. Build Spark 1.0.0 yourself with the attached patch applied.
> 2. Invoke the Hive client with the environment variable MASTER set to the
> master URL of Spark.
> 3. set hive.execution.engine=spark
> 4. Execute supported queries.
> NO PRECOMMIT TESTS. This is for the spark branch only.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
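
The setup steps listed in the issue description (export MASTER, switch the execution engine, run a supported query) could be sketched roughly as the script below. This is an illustration under assumptions, not part of the patch: the master URL spark://localhost:7077, the predicate "val > 0", and the helper name hive_script are hypothetical, and it presumes a Hive client built from the spark branch alongside the patched Spark 1.0.0 build. The table and column names are taken from the example queries in the description.

```shell
#!/bin/sh
# Sketch only: the master URL and query predicate are placeholders.

# Point the Hive client at the Spark master via the MASTER variable.
# spark://localhost:7077 is an assumed standalone-master address;
# an already-exported MASTER takes precedence.
MASTER="${MASTER:-spark://localhost:7077}"
export MASTER

# Emit the statements to run: select the Spark engine, then issue one
# of the supported query shapes (a simple group-by aggregate).
hive_script() {
  cat <<'EOF'
set hive.execution.engine=spark;
select grp, avg(val) from tab where val > 0 group by grp;
EOF
}

# Run the statements when a Hive client is on PATH; otherwise just
# print them so they can be inspected or pasted into a session.
if command -v hive >/dev/null 2>&1; then
  hive -e "$(hive_script)"
else
  hive_script
fi
```

Keeping the statements in one function makes it easy to swap in a different supported query without touching the environment setup.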