I’m working on a feature for the Hive WebUI Query Plan tab that adds an
option to display the query plan as a graph (scroll down for screenshots).
If you click on one of the graph’s stages, the plan for that stage appears
as text below the graph.
Stages are color-coded if they have a status (Success, Error, Running), and
the rest are grayed out. Coloring is based on status already available in
the WebUI, under the Stages tab.
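As a rough illustration of the coloring rule above, here is a minimal sketch. The status names come from the description; the type name, function name, and hex colors are hypothetical placeholders, not the values used in the actual patch.

```typescript
// Hypothetical status names and colors, for illustration only.
type StageStatus = "SUCCESS" | "ERROR" | "RUNNING";

const STATUS_COLORS: Record<StageStatus, string> = {
  SUCCESS: "#8fd19e", // green
  ERROR: "#f1948a",   // red
  RUNNING: "#f7dc6f", // yellow
};

// Stages without a status are grayed out.
const NO_STATUS_COLOR = "#d3d3d3";

// Returns the fill color for a stage, given its status (if any).
function stageColor(status?: StageStatus): string {
  return status ? STATUS_COLORS[status] : NO_STATUS_COLOR;
}
```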
There is an additional option to display stats for MapReduce tasks. These
include the job’s ID, its tracking URL (where the logs are found), and the
number and progress of mappers and reducers, among other info.
The library I’m using for the graph is vis.js (http://visjs.org/). It is
available under an Apache license, and the only file that needs to be
included from the library is about 700 KB.
I tried to keep server-side changes minimal; graph generation is handled
entirely by the client. Plans with more than a configurable number of
stages (default: 25) won’t be displayed, in order to conserve resources.
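To make the client-side part concrete, here is a minimal sketch of how a list of stages might be turned into graph data with the stage limit applied. The Stage shape, the dependsOn field, and the buildGraphData name are assumptions for illustration, not the actual patch; only the default of 25 comes from the description above.

```typescript
// Hypothetical shape of a stage as the client might receive it.
interface Stage {
  id: string;          // e.g. "Stage-1"
  dependsOn: string[]; // ids of the stages this one depends on
}

interface GraphData {
  nodes: { id: string; label: string }[];
  edges: { from: string; to: string; arrows: string }[];
}

// Builds node and edge lists for the plan, or returns null when the plan
// has more stages than the limit and should not be drawn.
function buildGraphData(stages: Stage[], maxStages: number = 25): GraphData | null {
  if (stages.length > maxStages) {
    return null;
  }
  const data: GraphData = { nodes: [], edges: [] };
  for (const stage of stages) {
    data.nodes.push({ id: stage.id, label: stage.id });
    for (const parent of stage.dependsOn) {
      // Edge points from the dependency to the stage that depends on it.
      data.edges.push({ from: parent, to: stage.id, arrows: "to" });
    }
  }
  return data;
}
```

The id, label, from, to, and arrows fields follow the node and edge format vis.js uses, so a result like this could be handed to new vis.Network(container, data, options) in the browser.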
I’d love to hear any and all input from the community about this feature:
do you think it’s useful, and is there anything important I’m missing?
A completely successful query:
[image: Inline image 1]
A MapReduce task selected, with MapReduce stats view on:
[image: Inline image 2]
Full MapReduce stats, lacking some information because the query was run in
[image: Inline image 3]
A non-MapReduce stage selected:
[image: Inline image 4]
Last stage running:
[image: Inline image 5]
Last stage returns error:
[image: Inline image 6]