I tried running it locally on a reasonably beefy machine and it worked fine.
Which toy data are you referring to?

JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64 SPARK_HOME=/root/spark
MAHOUT_HOME=. bin/mahout spark-itemsimilarity --input
s3n://recommendation-logs/2014/09/06 --output
s3n://recommendation-outputs/2014/09/06 --filenamePattern '.*' --recursive
--master spark://ec2-54-75-13-36.eu-west-1.compute.amazonaws.com:7077
--sparkExecutorMem 6g

and the working version running locally on a beefier box:

JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64 SPARK_HOME=/root/spark
MAHOUT_HOME=. MAHOUT_HEAPSIZE=16000 bin/mahout spark-itemsimilarity --input
s3n://ophan-recommendation-logs/2014/09/06 --output
s3n://ophan-recommendation-outputs/2014/09/06 --filenamePattern '.*'
--recursive  --sparkExecutorMem 16g
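
To make the difference between the two invocations easier to see, here is a rough sketch of a helper that prints (does not run) the command line. It uses only the flags shown above; the function name itemsim_cmd and its argument order are made up for illustration:

```shell
# Print the spark-itemsimilarity command line without executing it.
# Hypothetical helper: itemsim_cmd INPUT_URI OUTPUT_URI EXECUTOR_MEM [MASTER_URL]
# Only flags that already appear in the commands above are emitted.
itemsim_cmd() {
  input=$1
  output=$2
  mem=$3
  master=${4:-}   # empty means "run locally", so no --master flag

  printf "bin/mahout spark-itemsimilarity --input %s --output %s --filenamePattern '.*' --recursive" "$input" "$output"
  if [ -n "$master" ]; then
    printf ' --master %s' "$master"
  fi
  printf ' --sparkExecutorMem %s\n' "$mem"
}

# Example (placeholder URIs):
#   itemsim_cmd s3n://in s3n://out 6g spark://host:7077
#   itemsim_cmd s3n://in s3n://out 16g
```

Printing both variants side by side shows which options differ between the cluster run and the local run.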

Sample input:

nnS1dIIBBtTnehVD79lgYeBw
http://www.example.com/world/2014/sep/05/malaysia-airlines-mh370-six-months-chinese-families-lack-answers

ikFSk14vHrTPqjSISvMihDUg
http://www.example.com/world/2014/sep/05/obama-core-coalition-10-countries-to-fight-isis

edqu8kfgsFSg2w3MhV5rUwuQ
http://www.example.com/lifeandstyle/wordofmouth/2014/sep/05/food-and-drink2?CMP=fb_gu

pfnmfONG1DQWG_EOOIxUASow
http://www.example.com/world/live/2014/sep/05/unresponsive-plane-f15-jets-aircraft-live-updates

pfUil_W0s2TZSqojMQrVcxVw
http://www.example.com/football/blog/2014/sep/05/jose-mourinho-bargain-loic-remy-chelsea-france

nxTJnpyenFSP-tqWSLHQdW8w
http://www.example.com/books/2014/sep/05/were-we-happier-in-the-stone-age

lba37jwJVQS5GbiSuus1i6tA
http://www.example.com/stage/2014/sep/05/titus-andronicus-review-visually-striking-but-flawed

bEHaOzZPbtQz-X2K1wortBQQ
http://www.example.com/cities/2014/sep/05/death-america-suburban-dream-ferguson-missouri-resegregation

gjTGzDXiDOT5W2SThhm0tUmg
http://www.example.com/world/2014/sep/05/man-jailed-phoning-texting-ex-21807-times

pfFbQ5ddvBRhm0XLZbN6Xd2A
http://www.example.com/sport/2014/sep/05/gloucester-northampton-premiership-rugby
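
To rule out malformed rows as a cause, the input could be checked before running the job. A minimal sketch, assuming each row is a user ID and a URL separated by a single tab (which is what the sample above suggests, but is an assumption); the function name count_malformed_rows and the path in the usage comment are hypothetical:

```shell
# Count rows that are not exactly "userID<TAB>itemURL".
# Assumption: two tab-separated, non-empty columns per row.
# Reads the files given as arguments, or stdin if none are given.
count_malformed_rows() {
  awk -F '\t' 'NF != 2 || $1 == "" || $2 == "" { bad++ } END { print bad + 0 }' "$@"
}

# Example against a local copy of one day's logs (placeholder path):
#   count_malformed_rows /tmp/logs/2014-09-06/*
```

A non-zero count would point at rows such as a URL split across two lines or a missing user ID.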



On Sun, Sep 14, 2014 at 4:06 PM, Pat Ferrel <[email protected]> wrote:

> I wonder if it’s trying to write an empty rdd to a text file. Can you give
> the CLI options and a snippet of data?
>
> Also have you successfully run this on the toy data in the resource dir?
> There is a script to run it locally that you can adapt for running on a
> cluster. This will eliminate any cluster problem.
>
>
> On Sep 13, 2014, at 1:13 PM, Phil Wills <[email protected]> wrote:
>
> Here's the master log from the line with the stack trace to termination:
>
> 14/09/12 15:54:55 INFO scheduler.DAGScheduler: Failed to run saveAsTextFile at TextDelimitedReaderWriter.scala:288
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:3 failed 4 times, most recent failure: TID 448 on host ip-10-105-176-77.eu-west-1.compute.internal failed for unknown reason
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
> at scala.Option.foreach(Option.scala:236)
> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635)
> at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 14/09/12 15:54:55 INFO scheduler.DAGScheduler: Executor lost: 8 (epoch 20)
> 14/09/12 15:54:55 INFO storage.BlockManagerMasterActor: Trying to remove executor 8 from BlockManagerMaster.
> 14/09/12 15:54:55 INFO storage.BlockManagerMaster: Removed 8 successfully in removeExecutor
> 14/09/12 15:54:55 INFO storage.BlockManagerInfo: Registering block manager ip-10-105-176-77.eu-west-1.compute.internal:58803 with 3.4 GB RAM
> 14/09/12 15:54:55 INFO cluster.SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://[email protected]:56590/user/Executor#1456047585] with ID 9
>
> On Sat, Sep 13, 2014 at 4:21 PM, Pat Ferrel <[email protected]> wrote:
>
> > It’s not an error I’ve seen but they can tend to be pretty cryptic. Could
> > you post more of the stack trace?
> >
> > On Sep 12, 2014, at 2:55 PM, Phil Wills <[email protected]> wrote:
> >
> > I've tried on 1.0.1 and 1.0.2, updating the pom to 1.0.2 when running on
> > that.  I used the spark-ec2 scripts to set up the cluster.
> >
> > I might be able to share the data. I'll mull it over the weekend to make
> > sure there's nothing sensitive, or to see if there's a way I can transform
> > it to that point.
> >
> > Phil
> >
> >
> > On Fri, Sep 12, 2014 at 6:30 PM, Pat Ferrel <[email protected]> wrote:
> >
> >> The mahout pom says 1.0.1 but I’m running fine on 1.0.2
> >>
> >>
> >> On Sep 12, 2014, at 10:08 AM, Pat Ferrel <[email protected]> wrote:
> >>
> >> Is it a mature Spark cluster? What version of Spark?
> >>
> >> If you can share the data I can try it on mine.
> >>
> >> On Sep 12, 2014, at 9:42 AM, Phil Wills <[email protected]> wrote:
> >>
> >> I've been experimenting with the fairly new ItemSimilarityDriver, which is
> >> working fine up until the point it tries to write out its results.
> >> Initially I was getting an issue with the akka frameSize being too small,
> >> but after expanding that I'm now getting a much more cryptic error:
> >>
> >> 14/09/12 15:54:55 INFO scheduler.DAGScheduler: Failed to run saveAsTextFile at TextDelimitedReaderWriter.scala:288
> >> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:3 failed 4 times, most recent failure: TID 448 on host ip-10-105-176-77.eu-west-1.compute.internal failed for unknown reason
> >>
> >> This is from the master node, but there doesn't seem to be anything more
> >> intelligible in the slave node logs.
> >>
> >> I've tried writing to the local file system as well as s3n and can see it's
> >> not an access problem, as I am seeing a zero length file appear.
> >>
> >> Thanks for any pointers and apologies if this would be better to ask on the
> >> Spark list,
> >>
> >> Phil
> >>
> >>
> >>
> >
> >
>
>
