subject:"Re\: spark.akka.frameSize stalls job in 1.1.0"

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-18 Thread Zhan Zhang

Is it because countByValue or toArray puts too much stress on the driver when there are many unique words? To me this is a typical word count problem, which you can solve as follows (correct me if I am wrong): val textFile = sc.textFile("file") val counts = textFile.flatMap(line => line.split(
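
The snippet in this message is cut off in the archive. For reference, the usual word count pattern it appears to be heading toward can be sketched as below. This is an illustration under assumptions, not the code Zhan actually posted: the master URL, the input and output paths, and the whitespace split pattern are all placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x releases before 1.3

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // "local[2]" is a placeholder master for a local test run.
    val conf = new SparkConf().setAppName("word-count-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val textFile = sc.textFile("input.txt") // hypothetical input path
    val counts = textFile
      .flatMap(line => line.split("\\s+")) // assumed tokenization: split on whitespace
      .filter(_.nonEmpty)
      .map(word => (word, 1L))
      .reduceByKey(_ + _) // per-word sums are computed on the executors

    // Write the result out instead of collecting it, so the driver never
    // has to hold an entry for every distinct word.
    counts.saveAsTextFile("counts-out") // hypothetical output path

    sc.stop()
  }
}
```

The point of the pattern is that reduceByKey keeps the aggregation distributed, whereas countByValue and toArray both pull the full result back to the driver.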

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-18 Thread Jerry Ye

Hi Zhan, Thanks for looking into this. I'm actually using the hash map as an example of the simplest snippet of code that is failing for me. I know that this is just the word count. In my actual problem I'm using a Trie data structure to find substring matches. On Sun, Aug 17, 2014 at 11:35 PM,

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-18 Thread Zhan Zhang

Not sure exactly how you use it. My understanding is that in Spark it is better to keep the load on the driver as low as possible. Is it possible to broadcast the trie to the executors, do the computation there, and then aggregate the counters (??) in the reduce phase? Thanks. Zhan Zhang On Aug 18,
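
The suggestion in this message (broadcast the lookup structure, match on the executors, aggregate in the reduce step) might look roughly like the sketch below. Jerry's trie is not shown anywhere in the thread, so a plain Set[String] with substring matching stands in for it here; the pattern list, the input path, and the master URL are hypothetical as well.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x releases before 1.3

object BroadcastMatchSketch {
  def main(args: Array[String]): Unit = {
    // "local[2]" is a placeholder master for a local test run.
    val conf = new SparkConf().setAppName("broadcast-match-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Stand-in for the trie: any serializable, read-only lookup structure works.
    val patterns: Set[String] = Set("spark", "akka", "frame") // hypothetical patterns
    val patternsBc = sc.broadcast(patterns) // shipped to each executor once

    val lines = sc.textFile("input.txt") // hypothetical input path

    val matchCounts = lines
      .flatMap { line =>
        val lower = line.toLowerCase
        // Emit (pattern, 1) once for each pattern that occurs in this line.
        patternsBc.value.iterator.filter(p => lower.contains(p)).map(p => (p, 1L))
      }
      .reduceByKey(_ + _) // counters are merged on the executors

    // At most patterns.size pairs come back, so collecting here is cheap.
    matchCounts.collect().foreach { case (p, n) => println(s"$p\t$n") }

    sc.stop()
  }
}
```

With this shape the driver only creates the broadcast and receives the final per-pattern totals; it never accumulates per-record state itself.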

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-16 Thread Jerry Ye

Hi Xiangrui, I actually tried branch-1.1 and master, and both resulted in the job being stuck at the TaskSetManager:
14/08/16 06:55:48 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
14/08/16 06:55:48 INFO scheduler.TaskSetManager: Starting task 1.0:0 as TID 2 on executor 8:

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-15 Thread jerryye

Hi Xiangrui, I wasn't setting spark.driver.memory. I'll try that and report back. As for this running on the cluster, my assumption was that calling foreach on an array (I converted samples using toArray) would mean counts is propagated locally. The object would then be serialized to

Re: spark.akka.frameSize stalls job in 1.1.0

2014-08-15 Thread jerryye

Setting spark.driver.memory has no effect. The job still hangs trying to compute result.count when I sample more than 35%, regardless of the value I set for spark.driver.memory. Here are my settings:
export SPARK_JAVA_OPTS=-Xms5g -Xmx10g -XX:MaxPermSize=10g
export SPARK_MEM=10g in

6 matches
