hvanhovell commented on a change in pull request #26127: [SPARK-29348][SQL] Add
observable Metrics for Streaming queries
URL: https://github.com/apache/spark/pull/26127#discussion_r361949909
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
##########
@@ -106,6 +106,9 @@ class QueryExecution(
lazy val toRdd: RDD[InternalRow] = new SQLExecutionRDD(
executedPlan.execute(), sparkSession.sessionState.conf)
+ /** Get the metrics observed during the execution of the query plan. */
+ def observedMetrics: Map[String, Row] = CollectMetricsExec.collect(executedPlan)
Review comment:
I am not sure what the issue is?
By definition, anything in `QueryExecution` is an internal, unstable API. I
added it here because `QueryExecution` is a good narrow waist for this: you
want to collect the metrics for the entire query. It is public because most
other methods in this class are public; this allows for some clever
integrations (for the more adventurous developer) and makes debugging easier.
The batch listener is marked as experimental. A developer should be warned
when (s)he uses this API for anything (including using it to collect observable
metrics). I am not sure how realistic stabilizing the batch listener API is if
you include `QueryExecution` (or a stabilized version of it). We could expose a
stable callback, e.g. `onObservedMetrics(...)`, for observed metrics. On the
other hand, it is kind of annoying that we can't expose this through the
DataFrame itself; that is because when we execute the DataFrame we often use a
different one under the hood.
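
For illustration, here is a rough sketch of how a batch listener could read the observed metrics through `QueryExecution`. It assumes the API lands as proposed in this PR (`Dataset.observe` together with `QueryExecution.observedMetrics`); the metric name `my_metrics`, the column aliases, and the object name are made up for the example. Since it goes through `QueryExecution`, it carries the same stability caveats as above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.functions.{col, count, lit, max}
import org.apache.spark.sql.util.QueryExecutionListener

// Sketch only: assumes Dataset.observe and QueryExecution.observedMetrics
// exist as proposed in this PR. All names below are illustrative.
object ObservedMetricsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("observed-metrics-sketch")
      .getOrCreate()

    // The batch listener receives the QueryExecution that actually ran,
    // which is why the metrics are read here and not from the DataFrame.
    spark.listenerManager.register(new QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
        qe.observedMetrics.get("my_metrics").foreach { row =>
          val rows = row.getAs[Long]("rows")
          val maxId = row.getAs[Long]("maxId")
          println(s"$funcName: rows=$rows, maxId=$maxId")
        }
      }

      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
    })

    // Attach named aggregate metrics to the query; they are computed as a
    // side effect of executing the action below.
    val observed = spark.range(100)
      .observe("my_metrics", count(lit(1)).as("rows"), max(col("id")).as("maxId"))

    observed.collect()
    spark.stop()
  }
}
```

The listener may be invoked asynchronously, so the metrics should be consumed inside the callback and not read right after the action returns.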