Re: Use mvn run Spark program occur problem
That was it, thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Use-mvn-run-Spark-program-occur-problem-tp1751p6512.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: A Standalone App in Scala: Standalone mode issues
I finally got it working. Main points:
- I had to add the hadoop-client dependency to avoid a strange EOFException.
- I had to set SPARK_MASTER_IP in conf/start-master.sh to hostname -f instead of hostname, since Akka does not seem to work properly with plain host names or IPs; it requires fully qualified domain names.
- I also set SPARK_MASTER_IP in conf/spark-env.sh to hostname -f so that the workers can reach the master.
- Make sure conf/slaves also contains fully qualified domain names.
- It seems that both the master and the workers need to be able to reach the driver client; since I was on a VPN this caused me a lot of trouble, and it took some time to figure out.
After making these changes, everything worked like a charm!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/A-Standalone-App-in-Scala-Standalone-mode-issues-tp6493p6514.html
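To make the points above concrete, here is a minimal sketch of the FQDN-based configuration described in this thread. It assumes a stock Spark 0.9.x standalone layout; the worker host names in the comment are placeholders, not real machines.

```shell
# conf/spark-env.sh -- source this on the master and on every worker.
# Akka advertises this address to the rest of the cluster, so use the
# fully qualified domain name rather than a bare host name or IP.
export SPARK_MASTER_IP=$(hostname -f)
export SPARK_MASTER_PORT=7077

# conf/slaves -- one fully qualified worker name per line, e.g.:
#   worker1.example.com
#   worker2.example.com
```

The key point is that every address the master, workers, and driver exchange must resolve the same way from all three sides, which is why a VPN in between causes trouble.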
Re: Is uberjar a recommended way of running Spark/Scala applications?
Hi Andrei,
I think the preferred way to deploy Spark jobs is with the sbt package task rather than the sbt assembly plugin. In any case, as you note, the mergeStrategy in combination with some dependency exclusions should fix your problems. Have a look at this gist for further details: https://gist.github.com/JordiAranda/bdbad58d128c14277a05 (I just followed some recommendations from the sbt assembly plugin documentation). So far I haven't found a proper way to combine my development and deployment phases, although I must say my experience with Spark is pretty limited (it also really depends on your deployment requirements). Someone else can probably give you further insights here.
Best,

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Is-uberjar-a-recommended-way-of-running-Spark-Scala-applications-tp6518p6520.html
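For what it's worth, the usual shape of such a build.sbt (sbt-assembly 0.x-era syntax, matching Spark 0.9.x; the project name and versions here are illustrative assumptions, not a tested configuration) is roughly:

```scala
// build.sbt -- sketch only; versions and names are illustrative.
import AssemblyKeys._

assemblySettings

name := "my-spark-job"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // "provided" keeps Spark itself out of the uberjar,
  // since the cluster already ships it.
  "org.apache.spark" %% "spark-core" % "0.9.1" % "provided"
)

// Transitive dependencies often carry duplicate files under META-INF,
// which make the assembly task fail with "deduplicate" errors.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case x                             => old(x)
  }
}
```

Marking Spark as "provided" is also what lets sbt package (a thin jar of just your classes) remain viable for deployment.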
Re: Akka Connection refused - standalone cluster using spark-0.9.0
Same here, got stuck at this point. Any hints on what might be going on?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Akka-Connection-refused-standalone-cluster-using-spark-0-9-0-tp1297p6463.html
A Standalone App in Scala: Standalone mode issues
During the last few days I've been trying to deploy a Scala job to a standalone cluster (master + 4 workers) without much success, although it works perfectly when launched from the Spark shell, that is, from the Scala REPL (pretty strange, since that would suggest my cluster config is actually correct). To test with a simpler example, I decided to deploy the standalone app from the quick start guide (https://spark.apache.org/docs/0.9.0/quick-start.html#a-standalone-app-in-scala) in standalone mode (master + 1 worker, same machine). Please have a look at this gist for the cluster setup: https://gist.github.com/JordiAranda/4ee54f84dc92f02ecb8c. I can't get rid of the EOFException, so I must be missing something. Why does it work when the master config property is set to local[x] or when launched from the REPL, but not when the master config property is set to a Spark URL?
PS: Please note I am using the latest release (0.9.1), prebuilt for Hadoop 2.
Thanks,

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/A-Standalone-App-in-Scala-Standalone-mode-issues-tp6493.html
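For context, the only difference between the working and failing runs is the master URL passed to the SparkContext constructor. A sketch of the quick-start app against the 0.9.x API follows; the master FQDN, install path, and jar name are assumptions standing in for my actual values, and this obviously needs a running cluster to execute:

```scala
// SimpleApp.scala -- sketch of the 0.9.x quick-start standalone app.
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/opt/spark/README.md" // any local file works for the test
    // Works: new SparkContext("local[2]", "Simple App", ...)
    // Fails with EOFException in my setup as soon as the master becomes:
    val sc = new SparkContext("spark://master.example.com:7077", // FQDN assumed
      "Simple App", "/opt/spark",
      List("target/scala-2.10/simple-project_2.10-1.0.jar"))
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
```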
Re: KryoSerializer Exception
I am experiencing the same issue (I tried both using Kryo as the serializer and increasing the buffer size up to 256M; my objects are much smaller, though). I share my registrator class just in case: https://gist.github.com/JordiAranda/5cc16cf102290c413c82. Any hints would be highly appreciated.
Thanks,

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/KryoSerializer-Exception-tp5435p6428.html
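For anyone comparing setups, this is the general shape of such a registrator plus the buffer setting under 0.9.x (MyClass and MyRegistrator are placeholders; my real class is in the gist above):

```scala
// Sketch of a Kryo setup for Spark 0.9.x; names are illustrative.
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[MyClass]) // register every custom class you ship
  }
}

// Driver side, before creating the SparkContext:
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "MyRegistrator")
  // Buffer is given in MB in 0.9.x; I went as high as 256 with no luck.
  .set("spark.kryoserializer.buffer.mb", "256")
```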
Re: file not found
Thanks for the heads up, I also experienced this issue.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/file-not-found-tp1854p6438.html