Re: Use mvn run Spark program occur problem

2014-05-29 Thread jaranda
That was it, thanks!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Use-mvn-run-Spark-program-occur-problem-tp1751p6512.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: A Standalone App in Scala: Standalone mode issues

2014-05-29 Thread jaranda
I finally got it working. Main points:

- I had to add the hadoop-client dependency to avoid a strange EOFException.
- I had to set SPARK_MASTER_IP in conf/start-master.sh to hostname -f
instead of hostname, since Akka does not seem to work properly with plain
host names or IPs; it requires fully qualified domain names.
- I also set SPARK_MASTER_IP in conf/spark-env.sh to hostname -f so that
other workers can reach the master.
- Make sure conf/slaves also contains fully qualified domain names.
- It seems that both the master and the workers need to be able to reach the
driver client; since I was behind a VPN I had a lot of trouble with this, and
it took me some time to figure it out.

After making these changes, everything worked like a charm!
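The points above boil down to something like the following configuration
sketch (host names and the Hadoop version are assumptions; match them to
your own cluster):

```shell
# conf/spark-env.sh -- advertise the fully qualified domain name,
# not a bare host name or IP, so Akka can resolve the master.
SPARK_MASTER_IP=$(hostname -f)

# conf/slaves -- one fully qualified worker host name per line, e.g.:
#   worker1.example.com
#   worker2.example.com

# build.sbt -- the hadoop-client dependency that fixed the EOFException
# (pick the version that matches your cluster's Hadoop):
#   libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0"
```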



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/A-Standalone-App-in-Scala-Standalone-mode-issues-tp6493p6514.html


Re: Is uberjar a recommended way of running Spark/Scala applications?

2014-05-29 Thread jaranda
Hi Andrei,

I think the preferred way to deploy Spark jobs is to use the sbt package
task instead of the sbt assembly plugin. In any case, as you mention, the
mergeStrategy setting in combination with some dependency exclusions should
fix your problems. Have a look at this gist
https://gist.github.com/JordiAranda/bdbad58d128c14277a05
for further details (I just followed some of the recommendations in the sbt
assembly plugin documentation).
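In case the gist goes away, here is a sketch of what that kind of build.sbt
setup looked like with the sbt-assembly plugin of that era (0.11.x syntax;
the concrete versions and exclusions are assumptions, the gist has the real
details):

```scala
// build.sbt -- sketch only, assuming sbt-assembly 0.11.x syntax
import AssemblyKeys._

assemblySettings

// Mark Spark itself as "provided" so it stays out of the uberjar;
// the cluster already ships its own copy at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.1" % "provided"

// Resolve duplicate files pulled in by transitive dependencies.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case x                             => old(x)
  }
}
```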

So far I haven't found a proper way to combine my development and deployment
phases, although I must say my experience with Spark is still pretty limited
(it really depends on your deployment requirements as well). Perhaps someone
else can give you further insights here.

Best,



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-uberjar-a-recommended-way-of-running-Spark-Scala-applications-tp6518p6520.html


Re: Akka Connection refused - standalone cluster using spark-0.9.0

2014-05-28 Thread jaranda
Same here, got stuck at this point. Any hints on what might be going on?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Akka-Connection-refused-standalone-cluster-using-spark-0-9-0-tp1297p6463.html


A Standalone App in Scala: Standalone mode issues

2014-05-28 Thread jaranda
Over the last few days I've been trying to deploy a Scala job to a
standalone cluster (master + 4 workers) without much success, although it
worked perfectly when launched from the Spark shell, that is, from the
Scala REPL (pretty strange, since that would suggest my cluster config is
actually correct).

In order to test it with a simpler example, I decided to deploy this example
https://spark.apache.org/docs/0.9.0/quick-start.html#a-standalone-app-in-scala
in standalone mode (master + 1 worker, same machine). Please have a look at
this gist
https://gist.github.com/JordiAranda/4ee54f84dc92f02ecb8c
for the cluster setup. I can't get rid of the EOFException.

So I must definitely be missing something. Why does it work when the master
config property is set to local[x] or when launched from the REPL, but not
when the master config property is set to a Spark URL?
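For context, the only difference between the two runs is the master passed
to the SparkContext (host and app name below are placeholders):

```scala
import org.apache.spark.SparkContext

// This works: run locally with 4 threads.
val localSc = new SparkContext("local[4]", "Simple App")

// This is the failing case: point at the standalone master by its
// fully qualified domain name (placeholder shown here).
val clusterSc = new SparkContext("spark://master.example.com:7077", "Simple App")
```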

PS: Please note that I am using the latest release (0.9.1), prebuilt for
Hadoop 2.

Thanks,



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/A-Standalone-App-in-Scala-Standalone-mode-issues-tp6493.html


Re: KryoSerializer Exception

2014-05-27 Thread jaranda
I am experiencing the same issue (I tried both using Kryo as the serializer
and increasing the buffer size up to 256 MB; my objects are much smaller,
though). I'm sharing my registrator class just in case:

https://gist.github.com/JordiAranda/5cc16cf102290c413c82
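Roughly, it follows the standard KryoRegistrator pattern, along these lines
(MyRecord is a placeholder; the gist has the actual classes and settings):

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Placeholder standing in for the application's real classes.
case class MyRecord(id: Long, values: Array[Double])

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[MyRecord])
    kryo.register(classOf[Array[Double]])
  }
}
```

This gets wired up through the 0.9.x property names: spark.serializer set to
org.apache.spark.serializer.KryoSerializer, spark.kryo.registrator set to the
registrator class, and spark.kryoserializer.buffer.mb for the buffer size.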

Any hints would be highly appreciated.

Thanks,




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/KryoSerializer-Exception-tp5435p6428.html


Re: file not found

2014-05-27 Thread jaranda
Thanks for the heads up, I also experienced this issue.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/file-not-found-tp1854p6438.html