http://spark.apache.org/docs/latest/monitoring.html
You can even install tools like dstat <http://dag.wieers.com/home-made/dstat/>, iostat <http://linux.die.net/man/1/iostat>, iotop <http://linux.die.net/man/1/iotop>, and *collectd*, which can provide fine-grained profiling on individual nodes. If you are using Mesos as the resource manager, Mesos also exposes metrics for the running job.

~Manish

On Mon, Dec 5, 2016 at 4:17 PM, Chawla,Sumit <sumitkcha...@gmail.com> wrote:

> Hi All
>
> I have a long running job which takes hours and hours to process data.
> How can I monitor the operational efficiency of this job? I am interested
> in something like Storm/Flink style user metrics/aggregators, which I can
> monitor while my job is running. Using these metrics I want to monitor
> per-partition performance in processing items. As of now, the only way for
> me to get these metrics is when the job finishes.
>
> One possibility is to have Spark flush the metrics to an external system
> every few seconds, and then use that external system to monitor these
> metrics. However, I wanted to see if Spark supports any such use case
> OOB.
>
>
> Regards
> Sumit Chawla
>
>
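A minimal sketch (not part of the original reply) of the kind of running user metrics Sumit asks about, using Spark's named accumulators, which are visible per stage in the Spark UI and REST API while the job is still executing. The application name, input path, and counter names below are illustrative assumptions:

```scala
// Sketch: per-partition progress counters via named accumulators.
// Named accumulators appear under each stage in the Spark UI while the stage runs.
import org.apache.spark.sql.SparkSession

object RunningMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("running-metrics-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Counters updated from the executors; values are aggregated on the driver
    // and shown live in the UI / exposed via the REST API.
    val processed = sc.longAccumulator("records.processed")
    val skipped   = sc.longAccumulator("records.skipped")

    sc.textFile("hdfs:///data/input")   // hypothetical input path
      .foreachPartition { iter =>
        iter.foreach { line =>
          if (line.nonEmpty) processed.add(1) else skipped.add(1)
        }
      }

    spark.stop()
  }
}
```

For pushing Spark's built-in metrics out every few seconds, the monitoring page linked above also describes configuring a metrics sink (e.g. Graphite) in conf/metrics.properties with a polling period, which covers the "flush to an external system" case without custom code.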