Something like Twitter Ambrose would be lovely to integrate :)

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Thu, May 1, 2014 at 8:44 PM, Punya Biswal <pbis...@palantir.com> wrote:

> Hi all,
>
> I am thinking of starting work on a profiler for Spark clusters. The
> current idea is that it would collect jstacks from executor nodes and put
> them into a central index (either a database or elasticsearch), and it
> would present them to people in a UI that would let people slice and dice
> the jstacks based on what job was running at the time, and what executor
> node was running. In addition, the UI would also present time spent doing
> non-computational work, such as shuffling and input/output IO. In a future
> extension, we might support reading from JMX and/or a JVM agent to get more
> precise data.
>
> I know that it's already possible to use YourKit to profile individual
> processes, but YourKit costs money, needs a desktop client to be installed,
> and doesn't place its data in the context relevant to a Spark cluster.
>
> Does something like this already exist (or is such a project already in
> progress)? Do you have any feedback or recommendations for how to go about
> it?
>
> Thanks!
> Punya
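
To make the "slice and dice" idea in the quoted proposal concrete, here is a rough sketch of how sampled stacks might be filtered by job and executor once they sit in a central index. It is only an illustration under assumptions: the `StackSample` record, its fields, the `top_frames` helper, and the sample data are all hypothetical, and a plain in-memory list stands in for the database or elasticsearch index.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class StackSample:
    """One sampled thread stack with cluster context (hypothetical schema)."""
    executor: str            # executor node the sample was taken on
    job_id: int              # job that was running at sample time
    frames: Tuple[str, ...]  # stack frames, innermost first


def top_frames(
    samples: Iterable[StackSample],
    job_id: Optional[int] = None,
    executor: Optional[str] = None,
    n: int = 3,
) -> List[Tuple[str, int]]:
    """Count the innermost frame of every sample that matches the filters."""
    counts = Counter(
        s.frames[0]
        for s in samples
        if s.frames
        and (job_id is None or s.job_id == job_id)
        and (executor is None or s.executor == executor)
    )
    return counts.most_common(n)


# Made-up samples standing in for indexed jstack output.
samples = [
    StackSample("exec-1", 7, ("Sorter.sort", "Task.run")),
    StackSample("exec-1", 7, ("Sorter.sort", "Task.run")),
    StackSample("exec-2", 7, ("SocketRead.read", "Shuffle.fetch", "Task.run")),
    StackSample("exec-2", 8, ("Parser.parse", "Task.run")),
]

print(top_frames(samples, job_id=7))
# [('Sorter.sort', 2), ('SocketRead.read', 1)]
```

Counting only the innermost frame is the simplest possible aggregation; a real UI would more likely group on whole stacks or build a call tree, and would also need timestamps to line samples up with job boundaries.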