I think it would be very useful to have this. We could put the UI display either behind a flag or a URL parameter.
Shivaram

On Wed, Jul 9, 2014 at 12:25 PM, Reynold Xin <r...@databricks.com> wrote:
> Maybe it's time to create an advanced mode in the UI.
>
>
> On Wed, Jul 9, 2014 at 12:23 PM, Kay Ousterhout <k...@eecs.berkeley.edu>
> wrote:
>
> > Hi all,
> >
> > I've been doing a bunch of performance measurement of Spark and, as part
> > of doing this, added metrics that record the average CPU utilization,
> > disk throughput and utilization for each block device, and network
> > throughput while each task is running. These metrics are collected by
> > reading the /proc filesystem, so they work only on Linux. I'm happy to
> > submit a pull request with the appropriate changes, but first wanted to
> > see if sufficiently many people think this would be useful. I know the
> > metrics reported by Spark (and in the UI) are already overwhelming to
> > some folks, so I don't want to add more instrumentation if it's not
> > widely useful.
> >
> > These metrics are slightly more difficult to interpret for Spark than
> > similar metrics reported by Hadoop because, with Spark, multiple tasks
> > run in the same JVM and therefore as part of the same process. This
> > means that, for example, the CPU utilization metrics reflect the CPU
> > use across all tasks in the JVM, rather than only the CPU time used by
> > the particular task. This is a pro and a con -- it makes it harder to
> > determine why utilization is high (it may be from a different task),
> > but it also makes the metrics useful for diagnosing straggler problems.
> > Just wanted to clarify this before asking folks to weigh in on whether
> > the added metrics would be useful.
> >
> > -Kay
> >
> > (if you're curious, the instrumentation code is on a very messy branch
> > here:
> > https://github.com/kayousterhout/spark-1/tree/proc_logging_perf_minimal_temp/core/src/main/scala/org/apache/spark/performance_logging
> > )
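To make the flag idea concrete, here is a minimal sketch of gating the extra UI columns behind a configuration setting. The key name "spark.ui.advancedMetrics" is invented purely for illustration; it is not an existing Spark setting, and a real patch would pick its own key (or use a URL parameter instead).

    import org.apache.spark.SparkConf

    object AdvancedUiGate {
      // Hypothetical flag name, used only for this sketch.
      val AdvancedMetricsKey = "spark.ui.advancedMetrics"

      // True when the operator has opted in to the extra
      // CPU/disk/network columns on the stage page.
      def showAdvancedMetrics(conf: SparkConf): Boolean =
        conf.getBoolean(AdvancedMetricsKey, defaultValue = false)
    }

Defaulting the flag to false keeps the stage page unchanged for the folks who already find the existing metrics overwhelming.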
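For readers curious about the /proc reading Kay describes, here is a rough sketch of sampling the machine-wide CPU counters from /proc/stat. The class and field names are made up, not the ones on her branch, and note that /proc/stat is machine-wide; per-process counters would come from /proc/self/stat instead.

    import scala.io.Source

    // Hypothetical container for the aggregate "cpu" line of /proc/stat.
    case class CpuCounters(user: Long, system: Long, idle: Long, total: Long)

    object ProcCpuReader {
      // Parses the first line of /proc/stat (Linux only). Format:
      //   cpu user nice system idle iowait irq softirq ...
      def read(): CpuCounters = {
        val source = Source.fromFile("/proc/stat")
        val line = try source.getLines().next() finally source.close()
        val fields = line.trim.split("\\s+").drop(1).map(_.toLong)
        CpuCounters(
          user = fields(0) + fields(1), // user + nice
          system = fields(2),
          idle = fields(3),
          total = fields.sum)
      }

      // Utilization over an interval: non-idle delta over total delta.
      def utilization(start: CpuCounters, end: CpuCounters): Double = {
        val totalDelta = end.total - start.total
        if (totalDelta == 0) 0.0
        else 1.0 - (end.idle - start.idle).toDouble / totalDelta
      }
    }

Sampling these counters when a task starts and ends and differencing gives a per-task utilization estimate, with the caveat Kay raises: the numbers cover everything running on the machine during that window, not just the one task.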