http://spark.apache.org/docs/latest/monitoring.html
You can even install tools like dstat <http://dag.wieers.com/home-made/dstat/>, iostat <http://linux.die.net/man/1/iostat>, iotop <http://linux.die.net/man/1/iotop>, and *collectd*, which can provide fine-grained profiling on individual nodes. If you are using Mesos as the resource manager, Mesos also exposes metrics for the running job.

~Manish

On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:

> Hi All
>
> I have a long running job which takes hours and hours to process data.
> How can I monitor the operational efficiency of this job? I am interested
> in something like Storm/Flink style user metrics/aggregators, which I can
> monitor while my job is running. Using these metrics I want to monitor
> per-partition performance in processing items. As of now, the only way for
> me to get these metrics is when the job finishes.
>
> One possibility is to have Spark flush the metrics to an external system
> every few seconds, and then use that external system to monitor these
> metrics. However, I wanted to see if Spark supports any such use case
> OOB.
>
>
> Regards
> Sumit Chawla
>
>
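A minimal sketch (not part of the original reply) of the kind of running user metrics Sumit asks about, using Spark's named accumulators, which are visible per stage in the Spark UI and REST API while the job is still executing. The application name, input path, and counter names below are illustrative assumptions:

```scala
// Sketch: per-partition progress counters via named accumulators.
// Named accumulators appear under each stage in the Spark UI while the stage runs.
import org.apache.spark.sql.SparkSession

object RunningMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("running-metrics-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Counters updated from the executors; values are aggregated on the driver
    // and shown live in the UI / exposed via the REST API.
    val processed = sc.longAccumulator("records.processed")
    val skipped   = sc.longAccumulator("records.skipped")

    sc.textFile("hdfs:///data/input")   // hypothetical input path
      .foreachPartition { iter =>
        iter.foreach { line =>
          if (line.nonEmpty) processed.add(1) else skipped.add(1)
        }
      }

    spark.stop()
  }
}
```

For pushing Spark's built-in metrics out every few seconds, the monitoring page linked above also describes configuring a metrics sink (e.g. Graphite) in conf/metrics.properties with a polling period, which covers the "flush to an external system" case without custom code.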