[
https://issues.apache.org/jira/browse/SPARK-47017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Yang updated SPARK-47017:
------------------------------
Attachment: eventLogs-local-1708032228180.zip
> Show metrics of the physical plan of RDDScanExec's internal RDD in the
> history server
> -------------------------------------------------------------------------------------
>
> Key: SPARK-47017
> URL: https://issues.apache.org/jira/browse/SPARK-47017
> Project: Spark
> Issue Type: New Feature
> Components: Web UI
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Eric Yang
> Priority: Major
> Attachments: ScanExistingRDD.jpg, eventLogs-local-1708032228180.zip,
> simple2.scala
>
>
> RDDScanExec wraps an internal RDD (shown below). In our environment, this RDD
> is usually produced by a very large physical plan containing many physical
> nodes. Those nodes carry various metrics that help us understand how the
> execution behaves and where there is room for optimization.
>
> {code:java}
> case class RDDScanExec(
>     output: Seq[Attribute],
>     rdd: RDD[InternalRow],  // <-- this field
>     name: String,
> {code}
>
> However, the physical plan and its metrics are not visible in the SQL DAG in
> the Spark History Server. Because it is an "existing RDD", the physical plan
> may be found in some previous SQL, but the metrics are not visible from that
> previous SQL either. This is because the "definitions" of these metrics are
> reported with the SparkListenerSQLExecutionStart event of the "previous
> SQL" (which contains the physical plan of RDDScanExec.rdd), while the metric
> values are reported in the SparkListenerTaskEnd events of the tasks, which
> are attached to the SQL that contains RDDScanExec.
> !ScanExistingRDD.jpg|width=336,height=296!
>
> Could we consider showing the physical plan and metrics of RDDScanExec.rdd
> (the "Scan Existing RDD" node in the DAG above)? For example, they could be
> shown as a "leg" (similar to, but not the same as, a child) in the DAG, or in
> some other form that exposes the physical plan and metrics.
>
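
As a rough illustration of the mismatch described in the issue, the
self-contained Scala sketch below models the two event types with simplified,
hypothetical case classes. MetricDef, ExecutionStart, TaskEnd and
visibleMetrics are invented for this example and are not Spark's actual
classes or listener code. The sketch assumes a per-execution view that counts
only tasks attached to that execution and shows values only for metrics the
same execution declared.

```scala
object MetricVisibilityDemo {
  // Hypothetical stand-ins for the information carried by the listener events.
  final case class MetricDef(accumulatorId: Long, name: String)
  // Models the metric definitions reported when a SQL execution starts.
  final case class ExecutionStart(executionId: Long, metrics: Seq[MetricDef])
  // Models accumulator values reported when a task ends, tagged with the
  // execution the task is attached to.
  final case class TaskEnd(executionId: Long, updates: Map[Long, Long])

  // Assumed view logic: keep only tasks of this execution, then keep only
  // accumulator updates whose definition this execution declared.
  def visibleMetrics(start: ExecutionStart, taskEnds: Seq[TaskEnd]): Map[String, Long] = {
    val declared = start.metrics.map(m => m.accumulatorId -> m.name).toMap
    taskEnds
      .filter(_.executionId == start.executionId)
      .flatMap(_.updates.toSeq)
      .collect { case (id, value) if declared.contains(id) => declared(id) -> value }
      .groupBy(_._1)
      .map { case (name, pairs) => name -> pairs.map(_._2).sum }
  }

  // Execution 1 ("previous SQL") declares accumulator 10 for the upstream plan.
  val previousSql = ExecutionStart(1L, Seq(MetricDef(10L, "upstream: number of output rows")))
  // Execution 2 contains the RDD scan and declares only its own accumulator 20.
  val rddScanSql = ExecutionStart(2L, Seq(MetricDef(20L, "scan: number of output rows")))
  // The tasks run under execution 2 but update both accumulators.
  val taskEnds = Seq(TaskEnd(2L, Map(10L -> 1000L, 20L -> 1000L)))

  def main(args: Array[String]): Unit = {
    println(visibleMetrics(previousSql, taskEnds))
    println(visibleMetrics(rddScanSql, taskEnds))
  }
}
```

In this model the first call returns an empty map, because no task is attached
to execution 1, and the second contains only the scan metric, because
execution 2 never declared accumulator 10. The upstream value of 1000 is
therefore visible from neither execution.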
--
This message was sent by Atlassian Jira
(v8.20.10#820010)