[ https://issues.apache.org/jira/browse/SPARK-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15579901#comment-15579901 ]
holdenk commented on SPARK-10319:
---------------------------------

Is this issue still occurring for you?

> ALS training using PySpark throws a StackOverflowError
> ------------------------------------------------------
>
>                 Key: SPARK-10319
>                 URL: https://issues.apache.org/jira/browse/SPARK-10319
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.4.1
>         Environment: Windows 10, spark - 1.4.1,
>            Reporter: Velu nambi
>
> When attempting to train a machine learning model using ALS in Spark's MLlib
> (1.4) on Windows, PySpark always terminates with a StackOverflowError. I
> tried adding the checkpoint as described in
> http://stackoverflow.com/a/31484461/36130 -- doesn't seem to help.
> Here's the training code and stack trace:
> {code:none}
> ranks = [8, 12]
> lambdas = [0.1, 10.0]
> numIters = [10, 20]
> bestModel = None
> bestValidationRmse = float("inf")
> bestRank = 0
> bestLambda = -1.0
> bestNumIter = -1
> for rank, lmbda, numIter in itertools.product(ranks, lambdas, numIters):
>     ALS.checkpointInterval = 2
>     model = ALS.train(training, rank, numIter, lmbda)
>     validationRmse = computeRmse(model, validation, numValidation)
>     if (validationRmse < bestValidationRmse):
>         bestModel = model
>         bestValidationRmse = validationRmse
>         bestRank = rank
>         bestLambda = lmbda
>         bestNumIter = numIter
> testRmse = computeRmse(bestModel, test, numTest)
> {code}
> Stacktrace:
> 15/08/27 02:02:58 ERROR Executor: Exception in task 3.0 in stage 56.0 (TID 127)
> java.lang.StackOverflowError
>     at java.io.ObjectInputStream$BlockDataInputStream.readInt(Unknown Source)
>     at java.io.ObjectInputStream.readHandle(Unknown Source)
>     at java.io.ObjectInputStream.readClassDesc(Unknown Source)
>     at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>     at java.io.ObjectInputStream.readSerialData(Unknown Source)
>     at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>     at java.io.ObjectInputStream.readSerialData(Unknown Source)
>     at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
>     at java.io.ObjectInputStream.readSerialData(Unknown Source)
>     at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.readObject(Unknown Source)
>     at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
>     at java.io.ObjectInputStream.readSerialData(Unknown Source)