Re: akka error : play framework (2.3.3) and spark (1.0.2)

2014-08-16 Thread Manu Suryavansh
Hi,

I tried the Spark(1.0.0)+Play(2.3.3) example from the Knoldus blog -
http://blog.knoldus.com/2014/06/18/play-with-spark-building-apache-spark-with-play-framework/
and
it worked for me. The project is here -
https://github.com/knoldus/Play-Spark-Scala
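
For reference, here is a minimal sketch (not taken from that project; names and
the local master are illustrative) of sharing a single SparkContext inside a
Play application:

import org.apache.spark.{SparkConf, SparkContext}

// One lazily-created SparkContext for the whole Play app.
object SparkCommons {
  lazy val sc: SparkContext = {
    val conf = new SparkConf()
      .setAppName("play-spark-example")  // illustrative app name
      .setMaster("local[*]")             // assumption: local mode, for testing only
    new SparkContext(conf)
  }
}

A controller action can then call, e.g., SparkCommons.sc.parallelize(1 to
100).count(). Note that this only shows the wiring; the Akka versions pulled in
by Play and by Spark still have to be compatible on the classpath.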

Regards,
Manu


On Sat, Aug 16, 2014 at 11:04 PM, Sujee Maniyam  wrote:

> Hi
>
> I am trying to connect to Spark from the Play framework. Getting the following
> Akka error...
>
> [ERROR] [08/16/2014 17:12:05.249] [spark-akka.actor.default-dispatcher-3] 
> [ActorSystem(spark)] Uncaught fatal error from thread 
> [spark-akka.actor.default-dispatcher-3] shutting down ActorSystem [spark]
>
> java.lang.AbstractMethodError
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>
>   at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
> full stack trace : https://gist.github.com/sujee/ff14fd602b76314e693d
>
> source code here : https://github.com/sujee/play-spark-test
>
> I have also found this thread mentioning an Akka incompatibility: "How to run
> Play 2.2.x with Akka 2.3.x?"
> 
>
> Stack overflow thread :
> http://stackoverflow.com/questions/25346657/akka-error-play-framework-2-3-3-and-spark-1-0-2
>
> any suggestions?
>
> thanks!
>
> Sujee Maniyam (http://sujee.net | http://www.linkedin.com/in/sujeemaniyam
> )
>



-- 
Manu Suryavansh


akka error : play framework (2.3.3) and spark (1.0.2)

2014-08-16 Thread Sujee Maniyam
Hi

I am trying to connect to Spark from the Play framework. Getting the following
Akka error...

[ERROR] [08/16/2014 17:12:05.249]
[spark-akka.actor.default-dispatcher-3] [ActorSystem(spark)] Uncaught
fatal error from thread [spark-akka.actor.default-dispatcher-3]
shutting down ActorSystem [spark]
java.lang.AbstractMethodError
  at 
akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
  at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
  at akka.actor.ActorCell.terminate(ActorCell.scala:369)
  at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
  at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
  at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
  at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


full stack trace : https://gist.github.com/sujee/ff14fd602b76314e693d

source code here : https://github.com/sujee/play-spark-test

I have also found this thread mentioning an Akka incompatibility: "How to run
Play 2.2.x with Akka 2.3.x?"


Stack overflow thread :
http://stackoverflow.com/questions/25346657/akka-error-play-framework-2-3-3-and-spark-1-0-2

any suggestions?

thanks!

Sujee Maniyam (http://sujee.net | http://www.linkedin.com/in/sujeemaniyam )


Re: acquire and give back resources dynamically

2014-08-16 Thread fireflyc
http://spark.apache.org/docs/latest/running-on-yarn.html
Spark is just a YARN application.


> On Aug 14, 2014, at 11:12, 牛兆捷 wrote:
> 
> Dear all:
> 
> Can Spark acquire resources from, and give back resources to,
> YARN dynamically?
> 
> 
> -- 
> *Regards,*
> *Zhaojie*


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-16 Thread Jerry Ye
Hi Xiangrui,
I actually tried branch-1.1 and master and it resulted in the job being
stuck at the TaskSetManager:
14/08/16 06:55:48 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0
with 2 tasks
14/08/16 06:55:48 INFO scheduler.TaskSetManager: Starting task 1.0:0 as TID
2 on executor 8: ip-10-226-199-225.us-west-2.compute.internal
(PROCESS_LOCAL)
14/08/16 06:55:48 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as
28055875 bytes in 162 ms
14/08/16 06:55:48 INFO scheduler.TaskSetManager: Starting task 1.0:1 as TID
3 on executor 0: ip-10-249-53-62.us-west-2.compute.internal (PROCESS_LOCAL)
14/08/16 06:55:48 INFO scheduler.TaskSetManager: Serialized task 1.0:1 as
28055875 bytes in 178 ms

It's been 10 minutes with no progress on relatively small data. I'll let it
run overnight and update in the morning. Is there some place that I should
look to see what is happening? I tried to ssh into the executor and look at
/root/spark/logs but there wasn't anything informative there.

I'm sure using countByValue works fine but my use of a HashMap is only an
example. In my actual task, I'm loading a Trie data structure to perform
efficient string matching between a dataset of locations and strings
possibly containing mentions of locations.

This seems like a common thing, to process input with a relatively memory
intensive object like a Trie. I hope I'm not missing something obvious. Do
you know of any example code like my use case?
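
To make the pattern concrete, here is a rough sketch of what I am trying to do,
with a plain Set standing in for the Trie and the lookup structure broadcast
rather than captured in the closure (paths and data are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object BroadcastLookupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("broadcast-lookup"))

    // Stand-in for the Trie: any large, serializable, read-only lookup
    // structure built once on the driver.
    val locations: Set[String] = Set("london", "paris", "tokyo")
    val lookup = sc.broadcast(locations)

    // Executors read lookup.value; the structure is shipped once per executor
    // instead of being serialized into every task closure.
    val samples = sc.textFile("s3n://geonames")
    val matched = samples.map(line => (line, lookup.value.contains(line.toLowerCase)))
    matched.take(10).foreach(println)

    sc.stop()
  }
}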

Thanks!

- jerry




On Fri, Aug 15, 2014 at 10:02 PM, Xiangrui Meng  wrote:

> Just saw you used toArray on an RDD. That copies all data to the
> driver and it is deprecated. countByValue is what you need:
>
> val samples = sc.textFile("s3n://geonames")
> val counts = samples.countByValue()
> val result = samples.map(l => (l, counts.getOrElse(l, 0L)))
>
> Could you also try to use the latest branch-1.1 or master with the
> default akka.frameSize setting? The serialized task size should be
> small because we now use broadcast RDD objects.
>
> -Xiangrui
>
> On Fri, Aug 15, 2014 at 5:11 PM, jerryye  wrote:
> > Hi Xiangrui,
> > You were right, I had to use --driver_memory instead of setting it in
> > spark-defaults.conf.
> >
> > However, now my job just hangs with the following message:
> > 4/08/15 23:54:46 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as
> > 29433434 bytes in 202 ms
> > 14/08/15 23:54:46 INFO scheduler.TaskSetManager: Starting task 1.0:1 as
> TID
> > 3 on executor 1: ip-10-226-198-31.us-west-2.compute.internal
> (PROCESS_LOCAL)
> > 14/08/15 23:54:46 INFO scheduler.TaskSetManager: Serialized task 1.0:1 as
> > 29433434 bytes in 203 ms
> >
> > Any ideas on where else to look?
> >
> >
> > On Fri, Aug 15, 2014 at 3:29 PM, Xiangrui Meng [via Apache Spark
> Developers
> > List]  wrote:
> >
> >> Did you verify the driver memory in the Executor tab of the WebUI? I
> >> think you need `--driver-memory 8g` with spark-shell or spark-submit
> >> instead of setting it in spark-defaults.conf.
> >>
> >> On Fri, Aug 15, 2014 at 12:41 PM, jerryye <[hidden email]
> >> > wrote:
> >>
> >> > Setting spark.driver.memory has no effect. It's still hanging trying
> to
> >> > compute result.count when I'm sampling greater than 35% regardless of
> >> what
> >> > value of spark.driver.memory I'm setting.
> >> >
> >> > Here's my settings:
> >> > export SPARK_JAVA_OPTS="-Xms5g -Xmx10g -XX:MaxPermSize=10g"
> >> > export SPARK_MEM=10g
> >> >
> >> > in conf/spark-defaults:
> >> > spark.driver.memory 1500
> >> > spark.serializer org.apache.spark.serializer.KryoSerializer
> >> > spark.kryoserializer.buffer.mb 500
> >> > spark.executor.memory 58315m
> >> > spark.executor.extraLibraryPath /root/ephemeral-hdfs/lib/native/
> >> > spark.executor.extraClassPath /root/ephemeral-hdfs/conf
> >> >
> >> >
> >> >
> >> > --
> >> > View this message in context:
> >>
> http://apache-spark-developers-list.1001551.n3.nabble.com/spark-akka-frameSize-stalls-job-in-1-1-0-tp7865p7877.html
> >>
> >> > Sent from the Apache Spark Developers List mailing list archive at
> >> Nabble.com.
> >> >
> >> > -
> >> > To unsubscribe, e-mail: [hidden email]
> >> 
> >> > For additional commands, e-mail: [hidden email]
> >> 
> >> >
> >>
> >> -
> >> To unsubscribe, e-mail: [hidden email]
> >> 
> >> For additional commands, e-mail: [hidden email]
> >> 
> >>
> >>
> >>

Re: Extra libs for bin/spark-shell - specifically for hbase

2014-08-16 Thread Sandy Ryza
Hi Stephen,

Have you tried the --jars option (with jars separated by commas)?  It
should make the given jars available both to the driver and the executors.
 I believe one caveat currently is that if you give it a folder it won't
pick up all the jars inside.
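
For example (a sketch, assuming the jars live under $HBASE_HOME/lib; --jars
takes a comma-separated list, so a folder has to be expanded by hand):

  bin/spark-shell --jars $(echo $HBASE_HOME/lib/*.jar | tr ' ' ',')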

-Sandy


On Fri, Aug 15, 2014 at 4:07 PM, Stephen Boesch  wrote:

> Although this has been discussed a number of times here, I am still unclear
> how to add user jars to the spark-shell:
>
> a) for importing classes for use directly within the shell interpreter
>
> b) for invoking SparkContext commands with closures referencing user-supplied
> classes contained within jars.
>
> Similarly to other posts, I have gone through:
>
>  updating bin/spark-env.sh
>  SPARK_CLASSPATH
>  SPARK_SUBMIT_OPTS
>   creating conf/spark-defaults.conf  and adding
>  spark.executor.extraClassPath
> --driver-class-path
>   etc
>
> Hopefully there would be something along the lines of a single entry added
> to some classpath somewhere like this
>
>SPARK_CLASSPATH/driver-class-path/spark.executor.extraClassPath (or
> whatever is the correct option..)  =
> $HBASE_HOME/*:$HBASE_HOME/lib/*:$SPARK_CLASSPATH
>
> Any ideas here?
>
> thanks
>