Okay, so out of 164 stages, 163 are skipped. But how can 41405 tasks be
skipped if the total is only 19788?

On Wed, Mar 16, 2016 at 6:31 AM, Mark Hamstra <m...@clearstorydata.com>
wrote:

> It's not just if the RDD is explicitly cached, but also if the map outputs
> for stages have been materialized into shuffle files and are still
> accessible through the map output tracker.  Because of that, explicitly
> caching RDDs often gains you little or nothing, since even without a
> call to cache() or persist() the prior computation will largely be reused
> and stages will show up as skipped -- i.e. no need to recompute that stage.
>
> On Tue, Mar 15, 2016 at 5:50 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> If an RDD is cached, it is computed only once, and the stages for
>> computing it in the following jobs are skipped.
>>
>>
>> On Wed, Mar 16, 2016 at 8:14 AM, Prabhu Joseph <
>> prabhujose.ga...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>>
>>> The Completed Jobs section of the Spark UI shows the information below.
>>> What do the skipped values shown for Stages and Tasks mean?
>>>
>>> Job_ID:                                   11
>>> Description:                              count
>>> Submitted:                                2016/03/14 15:35:32
>>> Duration:                                 1.4 min
>>> Stages (Succeeded/Total):                 164/164 (163 skipped)
>>> Tasks (for all stages), Succeeded/Total:  19841/19788 (41405 skipped)
>>>
>>> Thanks,
>>> Prabhu Joseph
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
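
The reuse described in the quoted replies can be sketched with a small toy
model. This is plain Python, not Spark code: Stage, run_job and the
"materialized" set are hypothetical names invented for illustration, and the
set only stands in for the map output tracker. The sketch assumes a stage is
skipped when an already-materialized stage sits between it and the final
stage, which is a simplification of what the real scheduler does.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Stage:
    """Hypothetical stand-in for a stage; not a Spark class."""
    name: str
    num_tasks: int
    parents: List["Stage"] = field(default_factory=list)


def all_stages(final: Stage) -> List[Stage]:
    """Return every stage in the job's lineage, final stage included."""
    seen, order, stack = set(), [], [final]
    while stack:
        stage = stack.pop()
        if stage.name not in seen:
            seen.add(stage.name)
            order.append(stage)
            stack.extend(stage.parents)
    return order


def run_job(final: Stage, materialized: Set[str]) -> Tuple[List[Stage], List[Stage]]:
    """Run one job and return (stages that ran, stages that were skipped).

    `materialized` plays the role of the map output tracker: it holds the
    names of stages whose shuffle output is still available.
    """
    ran, visited, stack = [], set(), [final]
    while stack:
        stage = stack.pop()
        if stage.name in visited:
            continue
        visited.add(stage.name)
        ran.append(stage)
        # Only walk into parents whose output is missing; anything behind
        # an available parent is never visited and so never runs.
        stack.extend(p for p in stage.parents if p.name not in materialized)
    # Stages that ran (other than the final one) now have output available.
    materialized.update(s.name for s in ran if s is not final)
    skipped = [s for s in all_stages(final) if s.name not in visited]
    return ran, skipped


# Two jobs over the same lineage, sharing one tracker.
tracker: Set[str] = set()
map_a = Stage("map_a", 100)
map_b = Stage("map_b", 50, parents=[map_a])

for job_name in ("result_1", "result_2"):
    ran, skipped = run_job(Stage(job_name, 10, parents=[map_b]), tracker)
    print(job_name,
          "ran tasks:", sum(s.num_tasks for s in ran),
          "skipped tasks:", sum(s.num_tasks for s in skipped))
```

In this toy model the first job runs all three stages and skips none, while
the second job runs only its final stage and skips both map stages, with no
explicit caching involved. The skipped task count of the second job (150) is
larger than the number of tasks it actually ran (10).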
