Hi,

You may want to check these nodes; their tasks are very slow, with most of the time spent in GC:


3487    SUCCESS    PROCESS_LOCAL    ip-10-60-150-111.ec2.internal    2013/12/01 02:11:38    17.7 m    16.3 m    23.3 MB
3447    SUCCESS    PROCESS_LOCAL    ip-10-12-54-63.ec2.internal      2013/12/01 02:11:26    20.1 m    13.9 m    50.9 MB


> On Dec 1, 2013, at 10:59 AM, "Mayuresh Kunjir" <mayuresh.kun...@gmail.com> wrote:
> 
> I tried passing the DISK_ONLY storage level to Bagel's run method. It's running 
> without any error (so far) but is too slow. I am attaching details for the 
> stage corresponding to the second iteration of my algorithm (foreach at 
> Bagel.scala:237); it has been running for more than 35 minutes. I am noticing 
> very high GC time for some tasks. The setup parameters are listed below. 
> 
> #nodes = 16
> SPARK_WORKER_MEMORY = 13G
> SPARK_MEM = 13G
> RDD storage fraction = 0.5
> degree of parallelism = 192 (16 nodes * 4 cores each * 3)
> Serializer = Kryo
> Vertex data size after serialization = ~12G (probably too high, but it's the 
> bare minimum required for the algorithm.)
> 
> I would be grateful if you could suggest further optimizations, or point 
> out whether (and why) Bagel is not suitable for this data size. I need to 
> scale my cluster further, and these numbers do not make me feel confident.
> 
> Thanks and regards,
> ~Mayuresh
> 
> 
>> On Sat, Nov 30, 2013 at 3:07 PM, Mayuresh Kunjir <mayuresh.kun...@gmail.com> 
>> wrote:
>> Hi Spark users,
>> 
>> I am running a PageRank-style algorithm on Bagel and running into "out of 
>> memory" issues. 
>> 
>> Referring to the following table, rdd_120 is the RDD of vertices, serialized 
>> and compressed in memory. On each iteration, Bagel deserializes the 
>> compressed RDD; e.g. rdd_126 is the uncompressed version of rdd_120, 
>> persisted in memory and on disk. As iterations keep piling on, the cached 
>> partitions start getting evicted. The moment an rdd_120 partition gets 
>> evicted, it forces a recomputation and performance goes for a toss. Although 
>> we don't need the uncompressed RDDs from previous iterations, they are the 
>> last ones to get evicted, thanks to the LRU policy. 
>> 
>> Should I make Bagel use DISK_ONLY persistence? How much of a performance hit 
>> would that be? Or maybe there is a better solution here.
>> 
>> Storage
>> RDD Name    Storage Level                             Cached Partitions    Fraction Cached    Size in Memory    Size on Disk
>> rdd_83      Memory Serialized 1x Replicated           23                   12%                83.7 MB           0.0 B
>> rdd_95      Memory Serialized 1x Replicated           23                   12%                2.5 MB            0.0 B
>> rdd_120     Memory Serialized 1x Replicated           25                   13%                761.1 MB          0.0 B
>> rdd_126     Disk Memory Deserialized 1x Replicated    192                  100%               77.9 GB           1016.5 MB
>> rdd_134     Disk Memory Deserialized 1x Replicated    185                  96%                60.8 GB           475.4 MB
>> Thanks and regards,
>> ~Mayuresh
> 
> <BigFrame - Details for Stage 23.htm>
