Thanks for pointing out the documentation error :) Opened
https://github.com/apache/spark/pull/6749 to fix this.
On 6/11/15 1:18 AM, James Pirz wrote:
Thanks for your help!
Switching to HiveContext fixed the issue.
Just one side comment:
In the documentation regarding Hive Tables and HiveContext
<https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables>,
we see:
// sc is an existing JavaSparkContext.
HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc);
But this is incorrect, as the HiveContext constructor does not
accept a JavaSparkContext but a SparkContext (so the comment is
misleading). The correct code snippet should be:
HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
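Put together, a minimal Java driver along those lines (against the Spark 1.3 API; the class and app names are just illustrative) might look like this:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

public class SimpleClient {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("Simple SQL Client");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // HiveContext takes a SparkContext, not a JavaSparkContext,
        // so unwrap the underlying SparkContext with sc.sc().
        HiveContext sqlContext = new HiveContext(sc.sc());
        DataFrame res = sqlContext.sql("show tables");
        res.show();
        sc.stop();
    }
}
```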
Thanks again for your help.
On Wed, Jun 10, 2015 at 1:17 AM, Cheng Lian <lian.cs....@gmail.com> wrote:
Hm, this is a common confusion... Although the variable name is
`sqlContext` in Spark shell, it's actually a `HiveContext`, which
extends `SQLContext` and has the ability to communicate with Hive
metastore.
So your program needs to instantiate an
`org.apache.spark.sql.hive.HiveContext` instead.
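Concretely, assuming `sc` is an existing JavaSparkContext, the swap is a one-line change:

```java
// Replace the plain SQLContext with a Hive-aware HiveContext.
// sc.sc() unwraps the underlying SparkContext, which is what
// the HiveContext constructor expects.
org.apache.spark.sql.hive.HiveContext sqlContext =
    new org.apache.spark.sql.hive.HiveContext(sc.sc());
```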
Cheng
On 6/10/15 10:19 AM, James Pirz wrote:
I am using Spark (standalone) to run queries (from a remote
client) against data in tables that are already defined/loaded in
Hive.
I have started the metastore service in Hive successfully, and by
putting hive-site.xml, with the proper metastore URI, in the
$SPARK_HOME/conf directory, I tried to share its config with Spark.
When I start spark-shell, it gives me a default sqlContext, and I
can use that to access my Hive's tables with no problem.
But once I submit a similar query via a Spark application through
'spark-submit', it does not see the tables, and it seems it does
not pick up the hive-site.xml under Spark's conf directory. I
tried the '--files' argument of spark-submit to pass
'hive-site.xml' to the workers, but it did not change anything.
Here is how I try to run the application:
$SPARK_HOME/bin/spark-submit --class "SimpleClient" --master
spark://my-spark-master:7077
--files=$SPARK_HOME/conf/hive-site.xml simple-sql-client-1.0.jar
Here is the simple example code that I try to run (in Java):
SparkConf conf = new SparkConf().setAppName("Simple SQL Client");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
DataFrame res = sqlContext.sql("show tables");
res.show();
Here are the SW versions:
Spark: 1.3
Hive: 1.2
Hadoop: 2.6
Thanks in advance for any suggestion.