Running on Yarn: getting an error with AtA

A user is running on those 1887 small (~4k) Spark streaming files. The DRMs seem to be created properly. There may be empty rows in A, so I'm having the user try with only AtA and no AtB, so that there are no empty rows.
Any ideas? This is only 7.5M of data. I've tried a similar calc with the two larger files from epinions, and it works fine.

The task dies with:

Job aborted due to stage failure: Exception while getting task result: java.util.NoSuchElementException: key not found: 20070

The stack trace is:

org.apache.spark.rdd.RDD.collect(RDD.scala:774)
org.apache.mahout.sparkbindings.blas.AtA$.at_a_slim(AtA.scala:121)
org.apache.mahout.sparkbindings.blas.AtA$.at_a(AtA.scala:50)
org.apache.mahout.sparkbindings.SparkEngine$.tr2phys(SparkEngine.scala:231)
org.apache.mahout.sparkbindings.SparkEngine$.tr2phys(SparkEngine.scala:242)
org.apache.mahout.sparkbindings.SparkEngine$.toPhysical(SparkEngine.scala:108)
org.apache.mahout.math.drm.logical.CheckpointAction.checkpoint(CheckpointAction.scala:40)
org.apache.mahout.math.drm.package$.drm2Checkpointed(package.scala:90)
org.apache.mahout.math.cf.SimilarityAnalysis$$anonfun$3.apply(SimilarityAnalysis.scala:129)
org.apache.mahout.math.cf.SimilarityAnalysis$$anonfun$3.apply(SimilarityAnalysis.scala:127)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:176)
scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
scala.collection.AbstractIterator.to(Iterator.scala:1157)
scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:257)
scala.collection.AbstractIterator.toList(Iterator.scala:1157)
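For anyone reading the message itself: "key not found: N" is the wording Scala's Map.apply uses when asked for a key it does not hold. The sketch below only reproduces that message in isolation. It is not Mahout or Spark code, the names (KeyNotFoundSketch, rowIndex, lookupMessage) are made up for illustration, and it does not show which map in the failing job is missing key 20070.

```scala
import scala.util.Try

object KeyNotFoundSketch {
  // Hypothetical lookup table standing in for "some map keyed by an Int id".
  // Assumption for illustration only; not taken from Mahout or Spark.
  val rowIndex: Map[Int, Int] = Map(0 -> 0, 1 -> 1)

  // Map.apply throws NoSuchElementException for a missing key.
  // Try captures the exception so its message can be inspected;
  // the result is None when the key is present.
  def lookupMessage(key: Int): Option[String] =
    Try(rowIndex(key)).failed.toOption.map(_.getMessage)

  def main(args: Array[String]): Unit = {
    // Missing key: prints the captured exception message wrapped in Some.
    println(lookupMessage(20070))
    // Map.get is the non-throwing alternative and returns an Option.
    println(rowIndex.get(20070))
  }
}
```

If the same pattern is at work in the failing job, the question becomes which map is being indexed with 20070 and why that key was never registered.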
