I've had very good success troubleshooting this type of thing with the
Spark Web UI, which shows a breakdown of all stages and tasks, including
the RDDs involved and any cached data. More information about this tool
can be found at
http://spark.apache.org/docs/latest/monitoring.html.

On Thu, Mar 10, 2016 at 1:31 PM, souri datta <souri.isthe...@gmail.com>
wrote:

> Hi,
>  Currently I am trying to optimize my spark application and in that
> process, I am trying to figure out if at any stage in the code, I am
> recomputing a large RDD (so that I can optimize it by
> persisting/checkpointing it).
>
> Is there any indication in the event logs that tells us about an RDD being
> computed?
> If anyone has done a similar analysis, can you please share how you went
> about it?
>
> Thanks in advance,
> Souri
>
