Thanks for the link. Unfortunately, I turned on RDD compression and nothing
changed. I also tried switching from netty to nio, with no change :(
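
For anyone trying to reproduce this, the settings discussed in this thread
could be applied roughly as below. This is only a sketch: the object name and
app name are placeholders, local[8] is one reading of "parallelism at 8", and
spark.shuffle.blockTransferService is assumed to be the property behind the
netty -> nio switch in 1.2.x.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver object; only the configuration keys come from this thread.
object HangRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[8]")     // assumption: local mode with 8 worker threads
      .setAppName("hang-repro")  // placeholder app name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.rdd.compress", "true")
      // Assumed key for moving the block transfer service from netty to nio
      .set("spark.shuffle.blockTransferService", "nio")

    val sc = new SparkContext(conf)
    try {
      // The job that hangs would be built and run here.
    } finally {
      sc.stop()
    }
  }
}
```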

On Thu, Feb 26, 2015 at 2:01 AM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Not many that I know of, but I bumped into this one:
> https://issues.apache.org/jira/browse/SPARK-4516
>
> Thanks
> Best Regards
>
> On Thu, Feb 26, 2015 at 3:26 PM, Victor Tso-Guillen <v...@paxata.com>
> wrote:
>
>> Is there any potential problem from 1.1.1 to 1.2.1 with shuffle
>> dependencies that produce no data?
>>
>> On Thu, Feb 26, 2015 at 1:56 AM, Victor Tso-Guillen <v...@paxata.com>
>> wrote:
>>
>>> The data is small. The job is composed of many small stages.
>>>
>>> * I found that the problem still appears with fewer than 222. What
>>> would be gained by going higher?
>>> * Pushing up the parallelism only pushes up the boundary at which the
>>> system appears to hang. I'm worried about some sort of message loss or
>>> inconsistency.
>>> * Yes, we are using Kryo.
>>> * I'll try that, though again I'm not sure why you're recommending it.
>>> I'm stumped, so I might as well.
>>>
>>> On Wed, Feb 25, 2015 at 11:13 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> What operation are you trying to do and how big is the data that you
>>>> are operating on?
>>>>
>>>> Here are a few things you can try:
>>>>
>>>> - Repartition the RDD to a higher number than 222
>>>> - Specify the master as local[*] or local[10]
>>>> - Use Kryo Serializer (.set("spark.serializer",
>>>> "org.apache.spark.serializer.KryoSerializer"))
>>>> - Enable RDD Compression (.set("spark.rdd.compress", "true"))
>>>>
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Thu, Feb 26, 2015 at 10:15 AM, Victor Tso-Guillen <v...@paxata.com>
>>>> wrote:
>>>>
>>>>> I'm getting this really reliably on Spark 1.2.1. Basically I'm in
>>>>> local mode with parallelism at 8. I have 222 tasks and I never seem to get
>>>>> far past 40. Usually in the 20s to 30s it will just hang. The last log
>>>>> output is below, along with a screenshot of the UI.
>>>>>
>>>>> 2015-02-25 20:39:55.779 GMT-0800 INFO  [task-result-getter-3]
>>>>> TaskSetManager - Finished task 3.0 in stage 16.0 (TID 22) in 612 ms on
>>>>> localhost (1/5)
>>>>> 2015-02-25 20:39:55.825 GMT-0800 INFO  [Executor task launch
>>>>> worker-10] Executor - Finished task 1.0 in stage 16.0 (TID 20). 2492 bytes
>>>>> result sent to driver
>>>>> 2015-02-25 20:39:55.825 GMT-0800 INFO  [Executor task launch worker-8]
>>>>> Executor - Finished task 2.0 in stage 16.0 (TID 21). 2492 bytes result
>>>>> sent to driver
>>>>> 2015-02-25 20:39:55.831 GMT-0800 INFO  [task-result-getter-0]
>>>>> TaskSetManager - Finished task 1.0 in stage 16.0 (TID 20) in 670 ms on
>>>>> localhost (2/5)
>>>>> 2015-02-25 20:39:55.836 GMT-0800 INFO  [task-result-getter-1]
>>>>> TaskSetManager - Finished task 2.0 in stage 16.0 (TID 21) in 674 ms on
>>>>> localhost (3/5)
>>>>> 2015-02-25 20:39:55.891 GMT-0800 INFO  [Executor task launch worker-9]
>>>>> Executor - Finished task 0.0 in stage 16.0 (TID 19). 2492 bytes result
>>>>> sent to driver
>>>>> 2015-02-25 20:39:55.896 GMT-0800 INFO  [task-result-getter-2]
>>>>> TaskSetManager - Finished task 0.0 in stage 16.0 (TID 19) in 740 ms on
>>>>> localhost (4/5)
>>>>>
>>>>> [image: Inline image 1]
>>>>> What should I make of this? Where do I start?
>>>>>
>>>>> Thanks,
>>>>> Victor
>>>>>
>>>>
>>>>
>>>
>>
>
