Re: spark-itemsimilarity can't launch on a Spark cluster?

pol Wed, 24 Sep 2014 20:19:29 -0700

Hi, Pat
        The dataset is the same, and it is very small, just for testing. Is this a bug?



On Sep 25, 2014, at 02:57, Pat Ferrel <[email protected]> wrote:

> Are you using different data sets on the local and cluster?
> 
> Try increasing spark memory with -sem, I use -sem 6g for the epinions data 
> set.
> 
> The ID dictionaries are kept in-memory on each cluster machine so a large 
> number of user or item IDs will need more memory.
> 
> 
> On Sep 24, 2014, at 9:31 AM, pol <[email protected]> wrote:
> 
> Hi, All
>       
>       I’m sure that launching Spark standalone on a cluster works, but 
> spark-itemsimilarity fails when run against it.
> 
>       Launching on 'local' works:
> mahout spark-itemsimilarity -i /user/root/test/input/data.txt -o 
> /user/root/test/output -os -ma local[2] -f1 purchase -f2 view -ic 2 -fc 1 
> -sem 1g
> 
>       but launching on a standalone cluster produces an error:
> mahout spark-itemsimilarity -i /user/root/test/input/data.txt -o 
> /user/root/test/output -os -ma spark://Hadoop.Master:7077 -f1 purchase -f2 
> view -ic 2 -fc 1 -sem 1g
> ------------
> 14/09/22 04:12:47 WARN scheduler.TaskSchedulerImpl: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient memory
> 14/09/22 04:12:49 INFO client.AppClient$ClientActor: Connecting to master 
> spark://Hadoop.Master:7077...
> 14/09/22 04:13:02 WARN scheduler.TaskSchedulerImpl: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient memory
> 14/09/22 04:13:09 INFO client.AppClient$ClientActor: Connecting to master 
> spark://Hadoop.Master:7077...
> 14/09/22 04:13:17 WARN scheduler.TaskSchedulerImpl: Initial job has not 
> accepted any resources; check your cluster UI to ensure that workers are 
> registered and have sufficient memory
> 14/09/22 04:13:29 ERROR cluster.SparkDeploySchedulerBackend: Application has 
> been killed. Reason: All masters are unresponsive! Giving up.
> 14/09/22 04:13:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, 
> whose tasks have all completed, from pool 
> 14/09/22 04:13:29 INFO scheduler.TaskSchedulerImpl: Cancelling stage 1
> 14/09/22 04:13:29 INFO scheduler.DAGScheduler: Failed to run collect at 
> TextDelimitedReaderWriter.scala:74
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: All masters are unresponsive! Giving up.
>       at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1044)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1028)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1026)
>       at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>       at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1026)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
>       at scala.Option.foreach(Option.scala:236)
>       at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:634)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1229)
>       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>       at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>       at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>       at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>       at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>       at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>       at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>       at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> ------------
> 
> Thanks.
> 
>

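For reference, a rough sketch of why the in-memory ID dictionaries mentioned in the quoted reply grow with the number of user or item IDs. This is plain Python, not Mahout or Spark code: the function name estimate_id_dictionary_bytes and the sample IDs are made up for illustration, and JVM memory use will differ from these numbers. It only shows that the footprint scales with the ID count.

```python
import sys


def estimate_id_dictionary_bytes(ids):
    """Roughly estimate the memory held by a string-ID -> integer-index dictionary.

    Hypothetical helper for illustration only; it is not part of Mahout.
    """
    # Map each distinct external ID to a dense integer index.
    dictionary = {}
    for external_id in ids:
        if external_id not in dictionary:
            dictionary[external_id] = len(dictionary)

    # Sum the container size plus the size of every key and value object.
    total = sys.getsizeof(dictionary)
    for key, value in dictionary.items():
        total += sys.getsizeof(key) + sys.getsizeof(value)
    return len(dictionary), total


if __name__ == "__main__":
    # Made-up IDs; a real data set would supply its own user and item IDs.
    for n in (1_000, 100_000):
        count, size = estimate_id_dictionary_bytes(f"user_{i}" for i in range(n))
        print(f"{count} IDs: roughly {size / 1e6:.2f} MB in this Python sketch")
```

With a test file of only a few lines, a dictionary like this stays tiny, so the estimate gives a quick sense of whether ID volume alone could explain memory pressure for a given data set.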