[ https://issues.apache.org/jira/browse/SPARK-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571805#comment-14571805 ]

Joseph K. Bradley commented on SPARK-6071:
------------------------------------------

I'm going to close this, as I haven't heard of it happening since my
initial report.

> ALS doc example fails randomly in PythonAccumulatorParam
> --------------------------------------------------------
>
>                 Key: SPARK-6071
>                 URL: https://issues.apache.org/jira/browse/SPARK-6071
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> When running the ALS example in
> [http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html#examples]
> on branch-1.3, I got a random failure which I have been unable to reproduce.
> Specifically, I was running on the branch from this PR 
> [https://github.com/apache/spark/pull/4811] at this commit: 
> [https://github.com/mengxr/spark/commit/06140a48ec5bd55b329e9b7cf658bd3e43be4fe2]
> However, that PR should not have affected the bug, so I suspect it is within 
> branch-1.3 itself.
> After a clean build, I ran:
> {code}
> from pyspark.mllib.recommendation import ALS, Rating, MatrixFactorizationModel
> # Load and parse the data
> data = sc.textFile("data/mllib/als/test.data")
> ratings = data.map(lambda l: l.split(',')).map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))
> # Build the recommendation model using Alternating Least Squares
> rank = 10
> numIterations = 20
> model = ALS.train(ratings, rank, numIterations)
> {code}
> And I got this error:
> {code}
> >>> model = ALS.train(ratings, rank, numIterations)
> 15/02/27 14:41:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 15/02/27 14:41:24 WARN LoadSnappy: Snappy native library not loaded
> 15/02/27 14:41:26 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
> 15/02/27 14:41:26 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
> 15/02/27 14:41:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
> 15/02/27 14:41:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
> 15/02/27 14:41:29 ERROR DAGScheduler: Failed to update accumulators for ResultTask(279, 2)
> java.lang.ClassCastException: scala.None$ cannot be cast to java.util.List
>       at org.apache.spark.api.python.PythonAccumulatorParam.addInPlace(PythonRDD.scala:745)
>       at org.apache.spark.Accumulable.$plus$plus$eq(Accumulators.scala:82)
>       at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:340)
>       at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:335)
>       at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>       at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>       at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>       at org.apache.spark.Accumulators$.add(Accumulators.scala:335)
>       at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:892)
>       at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:974)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1398)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 15/02/27 14:41:29 ERROR DAGScheduler: Failed to update accumulators for ResultTask(279, 4)
> java.lang.ClassCastException: scala.None$ cannot be cast to java.util.List
>       at org.apache.spark.api.python.PythonAccumulatorParam.addInPlace(PythonRDD.scala:745)
>       at org.apache.spark.Accumulable.$plus$plus$eq(Accumulators.scala:82)
>       at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:340)
>       at org.apache.spark.Accumulators$$anonfun$add$2.apply(Accumulators.scala:335)
>       at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>       at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>       at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>       at org.apache.spark.Accumulators$.add(Accumulators.scala:335)
>       at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:892)
>       at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:974)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1398)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}
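>
> For reference, the failing cast is in the accumulator merge path visible in the trace: the driver merges each task's accumulator update via the param's addInPlace, and here the update arrived as scala.None$ instead of the java.util.List that PythonAccumulatorParam expects. A minimal Python sketch of that same contract, using the public AccumulatorParam API (ListAccumulatorParam is a hypothetical stand-in for the JVM-side class, purely illustrative):
> {code}
> from pyspark.accumulators import AccumulatorParam
>
> class ListAccumulatorParam(AccumulatorParam):
>     """Hypothetical list-valued accumulator param, analogous to the
>     JVM-side PythonAccumulatorParam in PythonRDD.scala."""
>
>     def zero(self, initial_value):
>         # Identity element for the merge: a fresh empty list.
>         return []
>
>     def addInPlace(self, list1, list2):
>         # The merge assumes both arguments are lists; if the driver
>         # were handed None here instead of a list, it would fail in
>         # the same way the scala.None$ -> java.util.List cast above does.
>         list1.extend(list2)
>         return list1
>
> # Usage from the pyspark shell (sc is the SparkContext):
> acc = sc.accumulator([], ListAccumulatorParam())
> sc.parallelize([[1], [2], [3]]).foreach(lambda x: acc.add(x))
> print(acc.value)  # e.g. [1, 2, 3]; element order is not guaranteed
> {code}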
> However, re-running the same train() call immediately worked, and I have not 
> yet been able to reproduce the bug.
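>
> Since the failure looks transient and a second call succeeded, one stopgap while the root cause is unknown would be to retry the call. A minimal sketch (train_with_retry is a hypothetical helper, not part of MLlib):
> {code}
> from pyspark.mllib.recommendation import ALS
>
> def train_with_retry(ratings, rank, iterations, max_attempts=3):
>     """Hypothetical helper: retry ALS.train on a transient failure."""
>     for attempt in range(1, max_attempts + 1):
>         try:
>             return ALS.train(ratings, rank, iterations)
>         except Exception as e:
>             # Give up after the last attempt; otherwise log and retry.
>             if attempt == max_attempts:
>                 raise
>             print("ALS.train failed (attempt %d of %d): %s"
>                   % (attempt, max_attempts, e))
>
> model = train_with_retry(ratings, rank, numIterations)
> {code}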



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
