Hi Ryan,

That does make a lot of sense! Thanks for steering me in the right direction.

Quoting SQLMetric [1]:

> Updates on the driver side must be explicitly posted using
> SQLMetrics.postDriverMetricUpdates().

Why is LocalTableScanExec not following the "must" requirement?
FileSourceScanExec does (and so does BroadcastExchangeExec, but that's not a
data source so may have different reasons).

[1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L31-L32

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Fri, Nov 17, 2017 at 2:30 AM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:

> SQL metrics are collected using SparkListener. If there are no tasks,
> org.apache.spark.sql.execution.ui.SQLListener cannot collect any metrics.
>
> On Thu, Nov 16, 2017 at 1:53 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi,
>>
>> I seem to have figured out why the metric is not in the web UI for the
>> query, but wish I knew how to explain it for any metric and operator.
>>
>> It seems that the numOutputRows metric won't be displayed in the web UI
>> when a query uses no Spark jobs.
>>
>> val names = Seq("Jacek", "Agata").toDF("name")
>>
>> // no numOutputRows metric in web UI
>> names.show
>>
>> // The query gives numOutputRows metric in web UI's Details for Query (SQL tab)
>> scala> names.groupBy(length($"name")).count.show
>>
>> That must be somewhat generic and I think has nothing to do with
>> LocalTableScanExec. Could anyone explain it in more detail? I'd
>> appreciate it.
>>
>> On Wed, Nov 15, 2017 at 10:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>>> Hi,
>>>
>>> I've been playing with LocalTableScanExec and noticed that it defines
>>> the numOutputRows metric, but I couldn't find it in the diagram in web
>>> UI's Details for Query in SQL tab. Why?
>>>
>>> scala> spark.version
>>> res1: String = 2.3.0-SNAPSHOT
>>>
>>> scala> val hello = udf { s: String => s"Hello $s" }
>>> hello: org.apache.spark.sql.expressions.UserDefinedFunction =
>>> UserDefinedFunction(<function1>,StringType,Some(List(StringType)))
>>>
>>> scala> Seq("Jacek").toDF("name").select(hello($"name")).show
>>> +-----------+
>>> |  UDF(name)|
>>> +-----------+
>>> |Hello Jacek|
>>> +-----------+
>>>
>>> http://localhost:4040/SQL/execution/?id=0 shows no metrics for
>>> LocalTableScan. Is this intended?
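
One way to check the "no tasks, no metrics" explanation from spark-shell is to count job starts with a SparkListener around the two queries from the thread. This is only a minimal sketch and was not part of the original exchange: the jobsStarted counter and the one-second sleeps are assumptions made for illustration (listener events are delivered asynchronously, so the printed counts are indicative rather than exact), and the code relies on the spark session that spark-shell predefines. Entering it with :paste is probably the easiest route.

```scala
import java.util.concurrent.atomic.AtomicInteger
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
import org.apache.spark.sql.functions.length
import spark.implicits._ // already in scope in spark-shell; repeated for clarity

// Hypothetical helper (not from the thread): counts every job that starts.
val jobsStarted = new AtomicInteger(0)
spark.sparkContext.addSparkListener(new SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    jobsStarted.incrementAndGet()
  }
})

val names = Seq("Jacek", "Agata").toDF("name")

// First query from the thread: a plain show over a local table.
names.show
Thread.sleep(1000) // crude wait for the asynchronous listener bus
val jobsAfterShow = jobsStarted.get
println(s"Jobs started after names.show: $jobsAfterShow")

// Second query from the thread: an aggregation over the same data.
names.groupBy(length($"name")).count.show
Thread.sleep(1000)
val jobsAfterGroupBy = jobsStarted.get
println(s"Jobs started after the groupBy query: $jobsAfterGroupBy")
```

If the explanation in the thread holds, the first count should not move while the second one grows, which would be consistent with numOutputRows showing up only for the query that actually runs tasks.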