hive on spark query error

2015-09-25 Thread Garry Chen
Hi All,
I am following 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started?
 to set up Hive on Spark.  After setup/configuration everything starts up and I 
am able to show tables, but when executing a SQL statement within Beeline I get 
an error.  Please help, and thank you very much.

Cluster Environment (3 nodes) as follows:
hadoop-2.7.1
spark-1.4.1-bin-hadoop2.6
zookeeper-3.4.6
apache-hive-1.2.1-bin

Error from hive log:
2015-09-25 11:51:03,123 INFO  [HiveServer2-Handler-Pool: Thread-50]: 
client.SparkClientImpl (SparkClientImpl.java:startDriver(375)) - Attempting 
impersonation of oracle
2015-09-25 11:51:03,133 INFO  [HiveServer2-Handler-Pool: Thread-50]: 
client.SparkClientImpl (SparkClientImpl.java:startDriver(409)) - Running client 
driver with argv: /u01/app/spark-1.4.1-bin-hadoop2.6/bin/spark-submit 
--proxy-user oracle --properties-file 
/tmp/spark-submit.840692098393819749.properties --class 
org.apache.hive.spark.client.RemoteDriver 
/u01/app/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar --remote-host 
ip-10-92-82-229.ec2.internal --remote-port 40476 --conf 
hive.spark.client.connect.timeout=1000 --conf 
hive.spark.client.server.connect.timeout=9 --conf 
hive.spark.client.channel.log.level=null --conf 
hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 
--conf hive.spark.client.secret.bits=256
2015-09-25 11:51:03,867 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.server.connect.timeout=9
2015-09-25 11:51:03,868 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.rpc.threads=8
2015-09-25 11:51:03,868 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.connect.timeout=1000
2015-09-25 11:51:03,868 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.secret.bits=256
2015-09-25 11:51:03,868 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.rpc.max.size=52428800
2015-09-25 11:51:03,876 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Error: Master must start with yarn, spark, 
mesos, or local
2015-09-25 11:51:03,876 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Run with --help for usage help or --verbose 
for debug output
2015-09-25 11:51:03,885 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - 15/09/25 11:51:03 INFO util.Utils: Shutdown 
hook called
2015-09-25 11:51:03,889 WARN  [Driver]: client.SparkClientImpl 
(SparkClientImpl.java:run(427)) - Child process exited with code 1.




RE: hive on spark query error

2015-09-25 Thread Garry Chen
Yes, you are right.  I made the change and also linked hive-site.xml into the 
Spark conf directory.  Rerunning the SQL, I get the following error in hive.log:
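For anyone reproducing this, the linking step mentioned above can be sketched as 
follows (paths are taken from the cluster layout listed in the first message; 
adjust to your install):

```shell
# Make Hive's configuration visible to Spark by symlinking hive-site.xml
# into Spark's conf directory (paths from the environment above).
ln -s /u01/app/apache-hive-1.2.1-bin/conf/hive-site.xml \
      /u01/app/spark-1.4.1-bin-hadoop2.6/conf/hive-site.xml
```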

2015-09-25 13:31:14,750 INFO  [HiveServer2-Handler-Pool: Thread-125]: 
client.SparkClientImpl (SparkClientImpl.java:startDriver(375)) - Attempting 
impersonation of HIVEAPP
2015-09-25 13:31:14,750 INFO  [HiveServer2-Handler-Pool: Thread-125]: 
client.SparkClientImpl (SparkClientImpl.java:startDriver(409)) - Running client 
driver with argv: /u01/app/spark-1.4.1-bin-hadoop2.6/bin/spark-submit 
--executor-memory 512m --proxy-user HIVEAPP --properties-file 
/tmp/spark-submit.4348738410387344124.properties --class 
org.apache.hive.spark.client.RemoteDriver 
/u01/app/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar --remote-host 
ip-10-92-82-229.ec2.internal --remote-port 48481 --conf 
hive.spark.client.connect.timeout=1000 --conf 
hive.spark.client.server.connect.timeout=9 --conf 
hive.spark.client.channel.log.level=null --conf 
hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 
--conf hive.spark.client.secret.bits=256
2015-09-25 13:31:15,473 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.server.connect.timeout=9
2015-09-25 13:31:15,473 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.rpc.threads=8
2015-09-25 13:31:15,474 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.connect.timeout=1000
2015-09-25 13:31:15,474 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.secret.bits=256
2015-09-25 13:31:15,474 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - Warning: Ignoring non-spark config property: 
hive.spark.client.rpc.max.size=52428800
2015-09-25 13:31:15,718 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - 15/09/25 13:31:15 WARN util.NativeCodeLoader: 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
2015-09-25 13:31:16,063 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - 15/09/25 13:31:16 INFO client.RMProxy: 
Connecting to ResourceManager at /0.0.0.0:8032
2015-09-25 13:31:16,245 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - ERROR: 
org.apache.hadoop.security.authorize.AuthorizationException: User: hadoop is 
not allowed to impersonate HIVEAPP
2015-09-25 13:31:16,248 INFO  [stderr-redir-1]: client.SparkClientImpl 
(SparkClientImpl.java:run(569)) - 15/09/25 13:31:16 INFO util.Utils: Shutdown 
hook called
2015-09-25 13:31:16,265 WARN  [Driver]: client.SparkClientImpl 
(SparkClientImpl.java:run(427)) - Child process exited with code 1.
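For readers who hit the same AuthorizationException: it usually means the OS user 
running HiveServer2 (here `hadoop`) is not configured as a proxy user on the 
Hadoop side.  A sketch of the usual core-site.xml fix (the wildcard values are a 
permissive assumption; restrict them in production, and restart the 
NameNode/ResourceManager afterwards):

```xml
<!-- core-site.xml: allow the 'hadoop' user to impersonate other users -->
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
```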

-Original Message-
From: Marcelo Vanzin [mailto:van...@cloudera.com] 
Sent: Friday, September 25, 2015 1:12 PM
To: Garry Chen <g...@cornell.edu>
Cc: Jimmy Xiang <jxi...@cloudera.com>; user@spark.apache.org
Subject: Re: hive on spark query error

On Fri, Sep 25, 2015 at 10:05 AM, Garry Chen <g...@cornell.edu> wrote:
> In spark-defaults.conf the spark.master  is  spark://hostname:7077.
> From hive-site.xml:
> <property>
>   <name>spark.master</name>
>   <value>hostname</value>
> </property>

That's not a valid value for spark.master (as the error indicates).
You should set it to "spark://hostname:7077", as you have it in 
spark-defaults.conf (or perhaps remove the setting from hive-site.xml, I think 
hive will honor your spark-defaults.conf).
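For reference, a hive-site.xml entry matching that advice might look like this 
(hostname is a placeholder for the actual standalone master host):

```xml
<property>
  <name>spark.master</name>
  <value>spark://hostname:7077</value>
</property>
```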

--
Marcelo


RE: hive on spark query error

2015-09-25 Thread Garry Chen
In spark-defaults.conf the spark.master  is  spark://hostname:7077.  From 
hive-site.xml:
<property>
  <name>spark.master</name>
  <value>hostname</value>
</property>



From: Jimmy Xiang [mailto:jxi...@cloudera.com]
Sent: Friday, September 25, 2015 1:00 PM
To: Garry Chen <g...@cornell.edu>
Cc: user@spark.apache.org
Subject: Re: hive on spark query error

> Error: Master must start with yarn, spark, mesos, or local
What's your setting for spark.master?






start master failed with error

2015-08-31 Thread Garry Chen
Hi All,
I get the following error when issuing start-master.sh, even though I have 
CLASSPATH=/u01/app/jdk1.8.0_60/lib:/u01/app/apache-hive-1.2.1-bin/lib:/u01/app/slf4j-1.7.12
 set in my environment profile.  Can anyone help?  Thank you very much.

Garry

Spark Command: /u01/app/jdk1.8.0_60/bin/java -cp 
/u01/app/spark-1.4.1-bin-without-hadoop/sbin/../conf/:/u01/app/spark-1.4.1-bin-without-hadoop/lib/spark-assembly-1.4.1-hadoop2.2.0.jar
 -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip 
ip-10-92-82-229.ec2.internal --port 7077 --webui-port 8080

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at 
sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
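A likely cause here: the `spark-1.4.1-bin-without-hadoop` build ships without 
Hadoop's jars (which bundle the slf4j binding), and a user-level CLASSPATH is 
not picked up by the daemon launch scripts.  A commonly suggested sketch for 
such builds, assuming the `hadoop` command is on the PATH:

```shell
# conf/spark-env.sh: put the Hadoop distribution's jars (including slf4j)
# on Spark's classpath for "without-hadoop" builds.
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
```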


spark-ec2 launch failed on starting httpd (Spark 1.4.1)

2015-08-25 Thread Garry Chen
Hi All,
I am trying to launch a Spark cluster on EC2 with Spark version 
1.4.1.  The script finished, but I get the following error at the end.  What 
should I do to correct this issue?  Thank you very much for your input.

Starting httpd: httpd: Syntax error on line 199 of /etc/httpd/conf/httpd.conf: 
Cannot load modules/libphp-5.5.so into server: 
/etc/httpd/modules/libphp-5.5.so: cannot open shared object file: No such file 
or directory
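This is a known quirk of the spark-ec2 AMIs: httpd.conf references a PHP module 
that is not present on the image.  One workaround sketch, assuming the path from 
the error above (run on the master node):

```shell
# Comment out the LoadModule line referencing the missing PHP shared object,
# then start httpd again. The sed pattern matches the path in the error message.
sudo sed -i '/libphp-5.5.so/s/^/#/' /etc/httpd/conf/httpd.conf
sudo service httpd start
```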


Garry



RE: Spark ec2 launch problem

2015-08-24 Thread Garry Chen
So what is the best way to deploy a Spark cluster in an EC2 environment?  Any 
suggestions?

Garry

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Friday, August 21, 2015 4:27 PM
To: Garry Chen <g...@cornell.edu>
Cc: user@spark.apache.org
Subject: Re: Spark ec2 launch problem


It may happen that the version of the spark-ec2 script you are using is buggy, 
or sometimes AWS has problems provisioning machines.
On Aug 21, 2015 7:56 AM, Garry Chen <g...@cornell.edu> wrote:

Spark ec2 launch problem

2015-08-21 Thread Garry Chen
Hi All,
I am trying to launch a Spark EC2 cluster by running  spark-ec2 
--key-pair=key --identity-file=my.pem --vpc-id=myvpc --subnet-id=subnet-011 
--spark-version=1.4.1 launch spark-cluster  but I get the following message 
endlessly.  Please help.


Warning: SSH connection error. (This could be temporary.)
Host:
SSH return code: 255
SSH output: ssh: Could not resolve hostname : Name or service not known
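Two things worth checking for anyone seeing this: the empty `Host:` suggests the 
instances came up without a resolvable public hostname (common in a VPC subnet 
without public DNS enabled), and the script can also be re-pointed at an 
already-provisioned cluster.  A sketch reusing the flags above (`--resume` 
retries setup without launching new instances; availability of that flag in 
your spark-ec2 version is an assumption to verify with `--help`):

```shell
# Retry cluster setup against the instances already launched.
./spark-ec2 --key-pair=key --identity-file=my.pem --vpc-id=myvpc \
  --subnet-id=subnet-011 --spark-version=1.4.1 launch --resume spark-cluster
```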


RE: Spark ec2 launch problem

2015-08-21 Thread Garry Chen
No, the message never ends.  I have to Ctrl-C out of it.

Garry

From: shahid ashraf [mailto:sha...@trialx.com]
Sent: Friday, August 21, 2015 11:13 AM
To: Garry Chen <g...@cornell.edu>
Cc: user@spark.apache.org
Subject: Re: Spark ec2 launch problem

Does the cluster work at the end ?

On Fri, Aug 21, 2015 at 8:25 PM, Garry Chen <g...@cornell.edu> wrote:



--
with Regards
Shahid Ashraf