planga82 commented on a change in pull request #25598: [SPARK-28542][DOCS][WebUI] Stages Tab
URL: https://github.com/apache/spark/pull/25598#discussion_r319269676
##########
File path: docs/web-ui.md
##########
@@ -94,9 +94,76 @@ This page displays the details of a specific job identified by its job ID.
 </p>
 ## Stages Tab
+
 The Stages tab displays a summary page that shows the current state of all stages of all jobs in
-the Spark application, and, when you click on a stage, a details page for that stage. The details
-page shows the event timeline, DAG visualization, and all tasks for the stage.
+the Spark application.
+
+At the beginning of the page is the summary with the count of all stages by status (active, pending, completed, sikipped, and failed)
+
+<p style="text-align: center;">
+  <img src="img/AllStagesPageDetail1.png" title="Stages header" alt="Stages header" width="30%">
+</p>
+
+In [Fair scheduling mode](job-scheduling.html#scheduling-within-an-application) there is a table that displays [pools properties](job-scheduling.html#configuring-pool-properties)
+
+<p style="text-align: center;">
+  <img src="img/AllStagesPageDetail2.png" title="Pool properties" alt="Pool properties">
+</p>
+
+After that are the details of stages per status (active, pending, completed,skipped, failed). In active stages, it's possible to kill the stage with the kill button. Only in failure stages, failure reason is shown. There is access to the task detail by clicking on the description.
+
+<p style="text-align: center;">
+  <img src="img/AllStagesPageDetail3.png" title="Stages detail" alt="Stages detail">
+</p>
+
+### Stage detail
+The summary is at the beginning of the page with information like Total time across all tasks, [Locality level summary](tuning.html#data-locality) , [Shuffle Read Size / Records](rdd-programming-guide.html#shuffle-operations) and Associated Job Ids.
+
+<p style="text-align: center;">
+  <img src="img/AllStagesPageDetail4.png" title="Stage header" alt="Stage header" width="30%">
+</p>
+
+There is also the visual representatión of the directed acyclic graph (DAG) of this stage, where vertices represent the RDDs or DataFrames and the edges represent an operation to be applied
+
+<p style="text-align: center;">
+  <img src="img/AllStagesPageDetail5.png" title="Stage DAG" alt="Stage DAG" width="50%">
+</p>
+
+Summary metrics for all task are represented in a table and in a timeline
+* **[Tasks deserialization time](configuration.html#compression-and-serialization)**
+* **Duration of tasks**
+* **GC time**
+* **Result serialization time** is the time spent serializing the task result on a executor before sending it back to the driver
+* **Getting result time** is the time that the driver spends fetching task results from workers
+* **Scheduler delay** includes the time to ship the task from the scheduler to executors, and the time to send the task result from the executors to the scheduler
+* **Peak execution memory** is the sum of the peak sizes of the internal data structures created during shuffles, aggregations and joins.
+* **Shuffle Read Size / Records**
+* **Shuffle Read Blocked Time** is the time that tasks spent blocked waiting for shuffle data to be read from remote machines
+* **Shuffle Remote Reads** is the total shuffle bytes read from remote executors

Review comment:
I have updated the description. Shuffle Read Size includes data read both locally and from remote executors. Shuffle Remote Reads includes only data read from remote executors. Am I wrong?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
