[ 
https://issues.apache.org/jira/browse/SPARK-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-15403.
-------------------------------
    Resolution: Duplicate

(Don't set Blocker; read 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) Looks 
like this is a duplicate.

> LinearRegressionWithSGD fails on files more than 12Mb data 
> -----------------------------------------------------------
>
>                 Key: SPARK-15403
>                 URL: https://issues.apache.org/jira/browse/SPARK-15403
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.6.1
>            Environment: Ubuntu 14.04 with 8 GB RAM, Scala 2.11.7, with the following 
> memory settings for the project: JAVA_OPTS="-Xmx8G -Xms2G".
>            Reporter: Ana La
>            Priority: Blocker
>
> I parse my JSON-like data using the DataFrame and Spark SQL facilities, then scale one 
> numerical feature and create dummy variables for the categorical features. From the 
> initial 14 keys of the JSON-like file I end up with about 200-240 features in the final 
> LabeledPoint. The final data is sparse, and every file contains at least 50,000 
> observations. I run two algorithms on the data, LinearRegressionWithSGD or LassoWithSGD, 
> since the data is sparse and regularization might be required. For data larger than 
> 11 MB, LinearRegressionWithSGD fails with the following error:
> {quote} org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 58 in stage 346.0 failed 1 times, most recent failure: Lost task 58.0 in 
> stage 346.0 (TID 18140, localhost): ExecutorLostFailure (executor driver 
> exited caused by one of the running tasks) Reason: Executor heartbeat timed 
> out after 179307 ms {quote}
> I tried to reproduce this bug with a smaller example, and I suspect that something is 
> wrong with LinearRegressionWithSGD on large data sets. I noticed that a collect() is 
> performed while using StandardScaler in the preprocessing step and during the counts in 
> the linear regression step, which may cause the bug. This calls the scalability of 
> linear regression into question because, as far as I understand it, collect() runs on 
> the driver, so the point of doing the computation in a distributed way is lost.
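> For reference, here is a minimal sketch of the scaling step described above. It is 
> illustrative only: the real pipeline scales one feature of the parsed JSON data, and the 
> helper name scaleFeatures is mine, not part of the actual code.
> {code:scala}
> import org.apache.spark.mllib.feature.StandardScaler
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.rdd.RDD
>
> // Fit a scaler on the feature vectors, then apply it record by record on the executors.
> // withMean is left false because the data is sparse; centering would densify the vectors.
> def scaleFeatures(data: RDD[LabeledPoint]): RDD[LabeledPoint] = {
>   val scaler = new StandardScaler(withMean = false, withStd = true)
>     .fit(data.map(_.features))
>   data.map(p => LabeledPoint(p.label, scaler.transform(p.features)))
> }
> {code}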
> {code:scala}
> import org.apache.spark.mllib.linalg.Vectors
> import org.apache.spark.mllib.regression.{LabeledPoint, LassoWithSGD}
> import org.apache.spark.rdd.RDD
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
> import scala.language.postfixOps
>
> object Main2 {
>   def main(args: Array[String]): Unit = {
>     // Spark configuration for execution on a local computer with 4 cores and 8 GB RAM.
>     // Multiple driver/executor memory configurations were tried, including the defaults.
>     val conf = new SparkConf()
>       .setMaster("local[*]")
>       .setAppName("spark_linear_regression_bug_report")
>       .set("spark.driver.memory", "3g")
>       .set("spark.executor.memory", "3g")
>       .set("spark.executor.heartbeatInterval", "30s")
>
>     // Spark context and SQL context definitions
>     val sc = new SparkContext(conf)
>     val sqlContext = new SQLContext(sc)
>
>     // Synthetic data set: 500000 observations with 500 random binary features each.
>     val countFeatures = 500
>     val countList = 500000
>     val features = sc.broadcast(1 to countFeatures)
>     val rdd: RDD[LabeledPoint] = sc.range(1, countList).map { i =>
>       LabeledPoint(
>         label = i.toDouble,
>         features = Vectors.dense(
>           features.value.map(_ => scala.util.Random.nextInt(2).toDouble).toArray)
>       )
>     }.persist()
>
>     // LinearRegressionWithSGD was also tried and fails the same way.
>     val numIterations = 1000
>     val stepSize = 0.3
>     val algorithm = new LassoWithSGD()
>     algorithm.setIntercept(true)
>     algorithm.optimizer
>       .setNumIterations(numIterations)
>       .setStepSize(stepSize)
>     val model = algorithm.run(rdd)
>   }
> }
> {code}
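> If the failure were only a slow executor, one possible (unverified) mitigation would be 
> to raise the heartbeat-related timeouts; as far as I understand, spark.network.timeout is 
> what governs the 120000 ms limit reported in the log below. This is a guess at a 
> workaround, not a fix:
> {code:scala}
> import org.apache.spark.SparkConf
>
> // Unverified mitigation: raise the heartbeat-related timeouts. This only helps if the
> // executor is merely slow (e.g. long GC); it does not address the ClassNotFoundException
> // thrown in the driver-heartbeater thread below.
> val conf = new SparkConf()
>   .set("spark.network.timeout", "600s")            // default is 120s, i.e. the 120000 ms in the log
>   .set("spark.executor.heartbeatInterval", "60s")  // keep well below spark.network.timeout
> {code}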
> The complete error output:
> {quote}  [info] Running Main 
> WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> WARN  org.apache.spark.util.Utils - Your hostname, julien-ubuntu resolves to 
> a loopback address: 127.0.1.1; using 192.168.0.49 instead (on interface wlan0)
> WARN  org.apache.spark.util.Utils - Set SPARK_LOCAL_IP if you need to bind to 
> another address
> INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
> INFO  Remoting - Starting remoting
> INFO  Remoting - Remoting started; listening on addresses 
> :[akka.tcp://[email protected]:59897]
> INFO  org.spark-project.jetty.server.Server - jetty-8.y.z-SNAPSHOT
> INFO  org.spark-project.jetty.server.AbstractConnector - Started 
> [email protected]:4040
> WARN  com.github.fommil.netlib.BLAS - Failed to load implementation from: 
> com.github.fommil.netlib.NativeSystemBLAS
> WARN  com.github.fommil.netlib.BLAS - Failed to load implementation from: 
> com.github.fommil.netlib.NativeRefBLAS
> [Stage 51:===========================================>              (3 + 1) / 
> 4]ERROR org.apache.spark.util.Utils - Uncaught exception in thread 
> driver-heartbeater
> java.io.IOException: java.lang.ClassNotFoundException: 
> org.apache.spark.storage.BroadcastBlockId
>       at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1207) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.TaskMetrics.readObject(TaskMetrics.scala:219) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source) ~[na:na]
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_91]
>       at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91]
>       at 
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) 
> ~[na:1.8.0_91]
>       at org.apache.spark.util.Utils$.deserialize(Utils.scala:92) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$reportHeartBeat$1$$anonfun$apply$6.apply(Executor.scala:437)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$reportHeartBeat$1$$anonfun$apply$6.apply(Executor.scala:427)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at scala.Option.foreach(Option.scala:257) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at 
> org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$reportHeartBeat$1.apply(Executor.scala:427)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$reportHeartBeat$1.apply(Executor.scala:425)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at scala.collection.Iterator$class.foreach(Iterator.scala:742) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at scala.collection.AbstractIterable.foreach(Iterable.scala:54) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at 
> org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:425)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:470)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:470)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:470) 
> [spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_91]
>       at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [na:1.8.0_91]
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [na:1.8.0_91]
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [na:1.8.0_91]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_91]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>       at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.spark.storage.BroadcastBlockId
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
> ~[na:1.8.0_91]
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_91]
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_91]
>       at java.lang.Class.forName0(Native Method) ~[na:1.8.0_91]
>       at java.lang.Class.forName(Class.java:348) ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:628) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1620) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) 
> ~[na:1.8.0_91]
>       at 
> scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) ~[na:na]
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_91]
>       at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_91]
>       at 
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
> ~[na:1.8.0_91]
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) 
> ~[na:1.8.0_91]
>       at 
> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:503) 
> ~[na:1.8.0_91]
>       at 
> org.apache.spark.executor.TaskMetrics$$anonfun$readObject$1.apply$mcV$sp(TaskMetrics.scala:220)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1204) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       ... 32 common frames omitted
> WARN  org.apache.spark.HeartbeatReceiver - Removing executor driver with no 
> recent heartbeats: 175339 ms exceeds timeout 120000 ms
> ERROR org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor driver on 
> localhost: Executor heartbeat timed out after 175339 ms
> WARN  org.apache.spark.scheduler.TaskSetManager - Lost task 1.0 in stage 
> 105.0 (TID 420, localhost): ExecutorLostFailure (executor driver exited 
> caused by one of the running tasks) Reason: Executor heartbeat timed out 
> after 175339 ms
> ERROR org.apache.spark.scheduler.TaskSetManager - Task 1 in stage 105.0 
> failed 1 times; aborting job
> WARN  org.apache.spark.SparkContext - Killing executors is only supported in 
> coarse-grained mode
> [error] (run-main-0) org.apache.spark.SparkException: Job aborted due to 
> stage failure: Task 1 in stage 105.0 failed 1 times, most recent failure: 
> Lost task 1.0 in stage 105.0 (TID 420, localhost): ExecutorLostFailure 
> (executor driver exited caused by one of the running tasks) Reason: Executor 
> heartbeat timed out after 175339 ms
> [error] Driver stacktrace:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in 
> stage 105.0 failed 1 times, most recent failure: Lost task 1.0 in stage 105.0 
> (TID 420, localhost): ExecutorLostFailure (executor driver exited caused by 
> one of the running tasks) Reason: Executor heartbeat timed out after 175339 ms
> Driver stacktrace:
>       at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
>       at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>       at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>       at scala.Option.foreach(Option.scala:257)
>       at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>       at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
>       at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
>       at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>       at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
>       at 
> org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1150)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>       at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1127)
>       at 
> org.apache.spark.mllib.optimization.GradientDescent$.runMiniBatchSGD(GradientDescent.scala:227)
>       at 
> org.apache.spark.mllib.optimization.GradientDescent.optimize(GradientDescent.scala:128)
>       at 
> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:308)
>       at 
> org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:229)
>       at Main$.main(Main.scala:85)
>       at Main.main(Main.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
> [trace] Stack trace suppressed: run last compile:run for the full output.
> ERROR org.apache.spark.ContextCleaner - Error in cleaning thread
> java.lang.InterruptedException: null
>       at java.lang.Object.wait(Native Method) ~[na:1.8.0_91]
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) 
> ~[na:1.8.0_91]
>       at 
> org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:176)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180) 
> [spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)
>  [spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68) 
> [spark-core_2.11-1.6.1.jar:1.6.1]
> ERROR org.apache.spark.util.Utils - uncaught error in thread 
> SparkListenerBus, stopping SparkContext
> java.lang.InterruptedException: null
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
>  ~[na:1.8.0_91]
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  ~[na:1.8.0_91]
>       at java.util.concurrent.Semaphore.acquire(Semaphore.java:312) 
> ~[na:1.8.0_91]
>       at 
> org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:66)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) 
> ~[scala-library-2.11.7.jar:1.0.0-M1]
>       at 
> org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
>  ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180) 
> ~[spark-core_2.11-1.6.1.jar:1.6.1]
>       at 
> org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
>  [spark-core_2.11-1.6.1.jar:1.6.1]
> java.lang.RuntimeException: Nonzero exit code: 1
>       at scala.sys.package$.error(package.scala:27)
> [trace] Stack trace suppressed: run last compile:run for the full output.
> [error] (compile:run) Nonzero exit code: 1 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
