GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1139
[FLINK-2357] Add the new web dashboard and monitoring REST API

This pull request adds two big new features: the monitoring REST API and the web dashboard.
It is joint work by @iampeter and me (although @iampeter did all the stellar UI wonders, kudos!)

## Web dashboard

The web dashboard is a frontend implemented against the monitoring REST API (see below). Compared to the old web frontend:

- It combines views of the optimizer plan, the runtime dataflow, and details about vertices and subtasks
- It has better overview pages that show more information at a glance
- It shows updating metrics (records, bytes)
- It shows updating user accumulators
- The timeline no longer depends on an online-only library
- It shows more information about status, exceptions, and config
- It is more structured and maintainable
- It has a nicer design (personal opinion)

Here are some teaser screenshots, showing the overview, the job dataflow view, and the timeline. Clicking on the nodes shows the details about subtasks, etc.

![bild1](https://cloud.githubusercontent.com/assets/1727146/9918194/bcf569da-5cc5-11e5-817c-0d6d5053e324.png)
![bild2](https://cloud.githubusercontent.com/assets/1727146/9918196/beccacfa-5cc5-11e5-89f4-b26ffbdf2ef9.png)
![bild3](https://cloud.githubusercontent.com/assets/1727146/9918199/c3b12e6c-5cc5-11e5-87fd-a3b3c431ffd6.png)

To try it out, make sure you have this entry in the `conf/flink-conf.yaml` file: `jobmanager.new-web-frontend: true`

The dashboard is not yet complete; for example, it lacks TaskManager details and log file access. These missing pieces need to be added before the old web frontend can be removed.

## Monitoring API

The monitoring API is a REST API that lets you issue HTTP requests such as `http://jobmanager:8081/jobs/7684be6004e4e955c2a558a9bc463f65` to query the status of jobs. The responses are JSON encoded. The API is used by the web dashboard, but it can also be used to build custom monitoring tools.
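To illustrate the kind of custom monitoring tool mentioned above, here is a minimal Python sketch that issues the job status request and decodes the JSON response. The host `jobmanager:8081` and the job ID are taken from the example URL; the function names `job_url` and `fetch_job_status` are hypothetical, and the sketch assumes nothing about the fields of the response beyond it being JSON.

```python
import json
from urllib.request import urlopen


def job_url(base_url: str, job_id: str) -> str:
    """Build the status URL for one job, e.g. <base>/jobs/<jobid>."""
    return f"{base_url.rstrip('/')}/jobs/{job_id}"


def fetch_job_status(base_url: str, job_id: str, timeout: float = 5.0):
    """Send an HTTP GET to the monitoring API and parse the JSON body.

    The structure of the returned value is not assumed here; it is
    whatever the JobManager sends back for this request.
    """
    with urlopen(job_url(base_url, job_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_job_status("http://jobmanager:8081", "7684be6004e4e955c2a558a9bc463f65")` against a running JobManager would return the decoded response, which can then be pretty-printed with `json.dumps(result, indent=2)` to inspect its fields.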
There is a detailed doc describing the capabilities (`docs/internals/monitoring_rest_api.md`); here is a summary of the supported requests:

- `/config`
- `/overview`
- `/jobs`
- `/joboverview`
- `/joboverview/running`
- `/joboverview/completed`
- `/jobs/<jobid>`
- `/jobs/<jobid>/config`
- `/jobs/<jobid>/exceptions`
- `/jobs/<jobid>/accumulators`
- `/jobs/<jobid>/vertices`
- `/jobs/<jobid>/vertices/<vertexid>`
- `/jobs/<jobid>/vertices/<vertexid>/subtasktimes`
- `/jobs/<jobid>/vertices/<vertexid>/accumulators`
- `/jobs/<jobid>/vertices/<vertexid>/subtasks/accumulators`
- `/jobs/<jobid>/vertices/<vertexid>/subtasks/<subtasknum>`
- `/jobs/<jobid>/vertices/<vertexid>/subtasks/<subtasknum>/attempts/<attempt>`
- `/jobs/<jobid>/vertices/<vertexid>/subtasks/<subtasknum>/attempts/<attempt>/accumulators`
- `/jobs/<jobid>/plan`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink dashboard

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1139

----
commit 170542314375b4edeef266eb36935ec6da2d261a
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-18T09:04:48Z

    [FLINK-2415] [optimizer] Create and attach proper JobGraph describing JSON plans to JobGraph for batch jobs.
commit 069590bfbf1d340548e9dd98a625f46b8d7c8b92
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-18T12:31:37Z

    [FLINK-2547] [web dashboard] Improved timeline view

commit da11c46ca496549522fce93ccfc14a11611541b7
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-19T12:46:41Z

    [FLINK-2547] [web dashboard] Updated web dashboard after request/response changes

commit 617b1a465f9e32997c01e8966a84cccea724ae54
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-19T14:32:30Z

    [FLINK-2547] [web dashboard] Add handlers for cluster status and web dashboard configuration

commit cf6743c3f70493ad05b1ee7a5e2ca3dfb4f1bf35
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-19T20:54:38Z

    [FLINK-2357] [web dashboard] Added the 'Scheduled' mark to timeline

commit 691f9e644824a9520ade703f6266ac4c25a5ce69
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-19T21:19:34Z

    [FLINK-2357] [web dashboard] Added tooltips to timeline

commit 40b900882585025af2167b138112ec9fd7514b92
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-20T12:48:13Z

    [FLINK-2415] [web dashboard] Provide more data in the job overview responses

    Rather than only listing job ids, this now includes all information necessary to describe the job.
    This reduces the number of requests for the job overview page to a single request.

commit 56d51942c206519b81b9583cf806bb4ac3845927
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-20T15:06:57Z

    [FLINK-2357] [web dashboard] Changed overview and timeline

commit f645964539f7f9ff060606d3665fabc31eae1a23
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-20T15:06:16Z

    [FLINK-2554] [web dashboard] Add request hander that lists exceptions encountered during execution

commit 24a8b3f9907f087ca2ea060b4a79143a62c5e435
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-20T21:27:50Z

    [FLINK-2415] [monitoring api] Add vertex details request handler, unify IDs between vertices and plan

commit 1700d2ed18ceff816617c02ebf5054477792b8ea
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-21T12:58:38Z

    [FLINK-2357] [web dashboard] Auto-update overview page

commit b68b82b0dea3a48c8f725d7c240f98cc7919ac90
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-21T15:42:13Z

    [FLINK-2357] [web dashboard] Add view for exceptions

commit 70adaf12e1d9ea4b0a62635de6bd0ac7e4360303
Author: Stephan Ewen <se...@apache.org>
Date:   2015-08-21T16:19:24Z

    [FLINK-2687] [monitoring API] Extend vertex requests with subtask data and accumulators

commit 72b34db83d95ff3dedae82931158a5114dfac02e
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-27T19:31:13Z

    [FLINK-2357] [web dashboard] Extend exceptions view

commit 7a961b09ee521c2fc875795b2cac52e357b358d9
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-27T19:56:21Z

    [FLINK-2357] [web dashboard] Added status counts for a job

commit 1669a6808cbf95a2677dd402b4ec9d3351bc21f1
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-27T21:25:13Z

    [FLINK-2357] [web dashboard] Timeline for running tasks

commit 3abde0a1a5e41fe7a8e563edb3dc178252c642b6
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-08-31T22:01:34Z

    [FLINK-2357] [web dashboard] New node organization

commit e6e4bae6e2b94df00aa22548c8621cd5e0a4d4d8
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-09-01T14:18:35Z

    [FLINK-2357] [web dashboard] Show plan (and optimizer properties) as a dedicated view

commit 3a6e7667ab26fd0e5fecebe30b03de78a9f79f90
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-09-09T17:36:11Z

    [FLINK-2357] [web dashboard] Adjust view for details of a job

commit c5127816d0d457ea3a0010de9b85ed77c92a698e
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-09-09T22:18:12Z

    [FLINK-2357] [web dashboard] Add auto-refresh to dashboard

commit 63e6533ee6bef0de234b388405e5e53c0a6634a9
Author: Stephan Ewen <se...@apache.org>
Date:   2015-09-10T13:47:28Z

    [FLINK-2687] [monitoring api] Add handlers for subtask details and accumulators

commit e819e6be91e7907fd00737253ccad288daae58d2
Author: Piotr Godek <piotr.go...@gmail.com>
Date:   2015-09-11T20:37:13Z

    [FLINK-2357] [web dashboard] Add subtasks to vertex display

commit 3225be3e19e422001668508f9ee4732b93ded461
Author: Stephan Ewen <se...@apache.org>
Date:   2015-09-16T17:23:07Z

    [FLINK-2688] [monitoring api] Add docs for monitoring REST API.

commit 30cfeab3cc1c10779f0f8a79263be2fa5beb069e
Author: Stephan Ewen <se...@apache.org>
Date:   2015-09-16T20:04:19Z

    [FLINK-2688] [monitoring api] Integrate monitoring request handler with HA leader handling

commit 15ee4ffa7b2307ce30575d37ace14d5db4bfd4b6
Author: Stephan Ewen <se...@apache.org>
Date:   2015-09-16T20:38:40Z

    Revert "[FLINK-2605] [runtime] Unclosed RandomAccessFile may leak resource in StaticFileServerHandler"

    The change breaks the functionality by returning incorrect HTTP responses (length / contents mismatch).

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---