I also ran into a similar problem: after some stages, all the tasks
are assigned to one machine, and stage execution gets slower and slower.
*[the spark conf setting]*
val conf = new SparkConf().setMaster(sparkMaster).setAppName("ModelTraining")
  .setSparkHome(sparkHome).setJars(List(jarFile))
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "LRRegistrator")
conf.set("spark.storage.memoryFraction", "0.7")
conf.set("spark.executor.memory", "8g")
conf.set("spark.cores.max", "150")
conf.set("spark.speculation", "true")
conf.set("spark.storage.blockManagerHeartBeatMs", "30")
val sc = new SparkContext(conf)
val lines = sc.textFile("hdfs://xxx:52310" + inputPath, 3)
val trainset = lines.map(parseWeightedPoint).repartition(50)
  .persist(StorageLevel.MEMORY_ONLY)
*[the warning log from Spark]*
14/09/19 10:26:23 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(45, TS-BH109, 48384, 0)
14/09/19 10:27:18 WARN TaskSetManager: Lost TID 726 (task 14.0:9)
14/09/19 10:29:03 WARN SparkDeploySchedulerBackend: Ignored task status
update (737 state FAILED) from unknown executor
Actor[akka.tcp://sparkExecutor@TS-BH96:33178/user/Executor#-913985102] with
ID 39
14/09/19 10:29:03 WARN TaskSetManager: Loss was due to fetch failure from
BlockManagerId(30, TS-BH136, 28518, 0)
14/09/19 11:01:22 WARN BlockManagerMasterActor: Removing BlockManager
BlockManagerId(47, TS-BH136, 31644, 0) with no recent heart beats: 47765ms
exceeds 45000ms
Any suggestions?
On Thu, Sep 18, 2014 at 4:46 PM, shishu shi...@zamplus.com wrote:
Hi dear all~
My Spark application sometimes runs much slower than it used to, so I
wonder why this happens.
I found that after a repartition stage at stage 17, all tasks go to one
executor. But in my code, I only use repartition at the very beginning.
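To confirm that kind of imbalance, one way is to count the records in each partition right after the repartition. The sketch below shows only the counting logic on plain Scala collections; `SkewCheck` and the 4x-mean threshold are my own illustration, not Spark API. In the actual job, the per-partition counts would come from something like `trainset.mapPartitionsWithIndex((i, it) => Iterator((i, it.size))).collect()`.

```scala
// Sketch: given (partitionId, recordCount) pairs, flag partitions that
// hold far more data than the average. SkewCheck is a hypothetical helper.
object SkewCheck {
  // Returns partition ids whose record count exceeds `factor` times the mean.
  def skewedPartitions(counts: Seq[(Int, Long)], factor: Double = 4.0): Seq[Int] = {
    if (counts.isEmpty) return Seq.empty
    val mean = counts.map(_._2).sum.toDouble / counts.size
    counts.collect { case (i, n) if n > factor * mean => i }
  }

  def main(args: Array[String]): Unit = {
    // 49 near-empty partitions and one partition holding almost everything:
    val counts = (0 until 49).map(i => (i, 10L)).toList ::: List((49, 100000L))
    println(skewedPartitions(counts)) // prints List(49)
  }
}
```

If the output names a single heavily loaded partition, the slowdown is data skew rather than a scheduler issue, and the next question is what key distribution the shuffle produced.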
In my application, every stage before stage 17 runs successfully within 1
minute, but after stage 17 every stage takes more than 10 minutes.
Normally my application runs successfully and finishes within 9 minutes.
My Spark version is 0.9.1, and my program is written in Scala.
I took some screenshots; you can see them in the archive.
Great thanks if you can help~
Shi Shu