I've been experimenting with the fairly new ItemSimilarityDriver, which works fine up to the point where it tries to write out its results. Initially I hit a problem with the akka frameSize being too small, but after increasing that I'm now getting a much more cryptic error:

14/09/12 15:54:55 INFO scheduler.DAGScheduler: Failed to run saveAsTextFile at TextDelimitedReaderWriter.scala:288
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 8.0:3 failed 4 times, most recent failure: TID 448 on host ip-10-105-176-77.eu-west-1.compute.internal failed for unknown reason

This is from the master node, but there doesn't seem to be anything more intelligible in the slave node logs. I've tried writing to the local file system as well as s3n, and it doesn't look like an access problem, since a zero-length file does appear in both cases.

Thanks for any pointers, and apologies if this would be better asked on the Spark list,

Phil
