xinzhang created SPARK-22007:
--------------------------------
Summary: spark-submit on yarn or local , got different result
Key: SPARK-22007
URL: https://issues.apache.org/jira/browse/SPARK-22007
Project: Spark
Issue Type: Bug
Components: Spark Core, Spark Shell, Spark Submit
Affects Versions: 2.1.0
Reporter: xinzhang
Submit the Python script in local mode:
/opt/spark/spark-bin/bin/spark-submit --master local cluster test_hive.py
result:
+------------+
|databaseName|
+------------+
|     default|
|        zzzz|
|       xxxxx|
+------------+
Submit the Python script on YARN:
/opt/spark/spark-bin/bin/spark-submit --master yarn --deploy-mode cluster test_hive.py
result:
+------------+
|databaseName|
+------------+
|     default|
+------------+
The Python script:
[yangtt@dc-gateway119 test]$ cat test_hive.py
#!/usr/bin/env python
#coding=utf-8
from os.path import expanduser, join, abspath
from pyspark.sql import SparkSession
from pyspark.sql import Row
from pyspark.conf import SparkConf
def squared(s):
    return s * s

# warehouse_location points to the default location for managed databases and tables
warehouse_location = abspath('/group/user/yangtt/meta/hive-temp-table')

spark = SparkSession \
    .builder \
    .appName("Python_Spark_SQL_Hive") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .config(conf=SparkConf()) \
    .enableHiveSupport() \
    .getOrCreate()

spark.udf.register("squared", squared)
spark.sql("show databases").show()
Q: Why does Spark load a different Hive metastore depending on the master? Does YARN cluster mode always use Derby?
17/09/14 16:10:55 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
My current metastore is in MySQL.
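One way to narrow this down might be to check which metastore connection URL is declared in the hive-site.xml that each driver actually sees. The following is only a minimal sketch using the Python standard library. The helper names metastore_url and metastore_backend and the host metastore-host are hypothetical and are not part of Spark or Hive. It assumes the standard Hive property javax.jdo.option.ConnectionURL, which falls back to an embedded Derby database when it is not set.

```python
import xml.etree.ElementTree as ET

# Hive's property name for the metastore JDBC connection string.
URL_KEY = "javax.jdo.option.ConnectionURL"

def metastore_url(hive_site_xml):
    """Return the ConnectionURL value from hive-site.xml text, or None if unset."""
    root = ET.fromstring(hive_site_xml.strip())
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == URL_KEY:
            return (prop.findtext("value") or "").strip()
    return None

def metastore_backend(hive_site_xml):
    """Name the database behind the metastore, e.g. 'mysql' or 'derby'."""
    url = metastore_url(hive_site_xml)
    if not url:
        # With no URL configured, Hive falls back to an embedded Derby database.
        return "derby"
    # A JDBC URL has the form jdbc:<subprotocol>:<rest>.
    parts = url.split(":")
    if len(parts) > 1 and parts[0] == "jdbc":
        return parts[1].lower()
    return "unknown"

if __name__ == "__main__":
    # Hypothetical hive-site.xml content, for illustration only.
    sample = """
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://metastore-host:3306/hive</value>
      </property>
    </configuration>
    """
    print(metastore_backend(sample))
```

If the configuration visible to the YARN driver contains no such property, or no hive-site.xml is visible at all, that would be consistent with the DERBY line in the log above.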
Any suggestions would be helpful.
Thanks.