Hi

Thanks for your reply.

It would be very helpful if you could elaborate on spark.locality.wait and
the multiple locality levels (process-local, node-local, rack-local and then
any). What is the difference between process-local and node-local, and what
is the best configuration I can achieve by tuning this wait?
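
For reference, here is roughly how I assume these settings are applied (a
sketch only: the property names are from the Spark configuration docs, and
the millisecond values are placeholders I picked, not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("locality-wait-example")
      .set("spark.locality.wait", "3000")       // wait (ms) before falling back one locality level
      .set("spark.locality.wait.node", "3000")  // optional per-level overrides
      .set("spark.locality.wait.rack", "3000")
    val sc = new SparkContext(conf)

Is that the right way to think about it, and does lowering the wait simply
trade data locality for faster task launch?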

Regards
Vinay Bajaj


On Wed, Feb 12, 2014 at 2:19 PM, Guillaume Pitel <guillaume.pi...@exensa.com> wrote:

>  Hi
>
>
>  I am attaching a screenshot of the Spark web UI; please have a look at it.
>
>  1) For a single map operator, why does it show multiple completed stages
> with the same information?
>
> If you don't cache your result and it's needed several times in the
> computation, Spark recomputes the map, and so it appears several times.
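>
> A minimal sketch of the caching pattern, assuming a SparkContext `sc`; the
> path, keys and filters below are placeholders, not your actual code:
>
>   import org.apache.spark.SparkContext._   // pair-RDD operations like groupByKey
>
>   val mapped = sc.textFile("hdfs:///path/to/input")
>     .map(line => (line.split("\t")(0), line))       // key each record by its first field
>     .cache()                                        // keep the map output after the first action
>
>   val groups   = mapped.groupByKey().count()           // first action: computes and caches `mapped`
>   val nonEmpty = mapped.filter(_._2.nonEmpty).count()  // later action: reads the cache, no re-map
>
> Without cache()/persist(), each action that depends on the map recomputes it,
> which is why the same stage appears several times in the UI.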
>
>    2) As you can see, the number of completed workers is higher than the
> maximum number of workers (2931/2339). Can you please tell me why it shows
> that?
>
> Usually this happens when one of your executors dies (often from serious
> memory exhaustion, but there can be many causes).
> The only advice I can give is to watch your logs for ERROR and Exception.
>
>   3) How is a stage designed in Spark? As you can see in my code, after the
> first map with groupByKey and filter, I run one more map, then a filter, then
> a count, but Spark combined these three operations into one stage and named
> it "count" (you can see this in the attached screenshot). Can you please
> explain how it combines stages and what the logic or idea behind this is?
>
>   I'll let someone else answer you on that, but basically, you can trust
> Spark to optimize this correctly.
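>
> That said, my rough understanding (someone please correct me) is that narrow
> transformations such as map and filter are pipelined into the stage of the
> action that triggers them, and a new stage only starts at a shuffle boundary
> like groupByKey. A sketch of the pattern you describe (the names are
> illustrative; it assumes a pair RDD `input`):
>
>   val grouped = input.groupByKey()                   // shuffle => stage boundary
>   val n = grouped
>     .filter { case (_, vs) => vs.nonEmpty }          // narrow
>     .map { case (k, vs) => (k, vs.size) }            // narrow
>     .filter { case (_, size) => size > 1 }           // narrow
>     .count()                                         // action; the UI labels this stage "count"
>
> So the map, filter and count are not separate stages; they run together in
> the stage named after the final action.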
>
> Guillaume
> --
>  Guillaume PITEL, Président
> +33(0)6 25 48 86 80
>
> eXenSa S.A.S. <http://www.exensa.com/>
>  41, rue Périer - 92120 Montrouge - FRANCE
> Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05
>
