Re: [SQL] Why no numOutputRows metric for LocalTableScanExec in webUI?

2017-11-17 Thread Jacek Laskowski
Hi Ryan,

That does make a lot of sense! Thanks for steering me in the right direction.

Quoting SQLMetric [1]:

> Updates on the driver side must be explicitly posted using
> SQLMetrics.postDriverMetricUpdates().

Why doesn't LocalTableScanExec follow that "must" requirement?
FileSourceScanExec does, and so does BroadcastExchangeExec (although that
is not a data source, so it may have different reasons).

[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala#L31-L32
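For illustration, here is a minimal sketch of what posting driver-side metric updates could look like. It assumes Spark 2.3 or later on the classpath; the object and method names (DriverSideMetrics, post) are hypothetical, and only SQLExecution.EXECUTION_ID_KEY and SQLMetrics.postDriverMetricUpdates come from Spark itself. This is not a patch for LocalTableScanExec, just the general shape of the call.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.SQLExecution
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

// Hypothetical helper, not part of Spark.
object DriverSideMetrics {
  // Posts the current values of the given metrics from the driver.
  // Returns false when the calling thread is not inside a SQL execution,
  // because there is then no execution id to attach the update to.
  def post(sc: SparkContext, metrics: Seq[SQLMetric]): Boolean = {
    val executionId = sc.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
    if (executionId == null) {
      false
    } else {
      SQLMetrics.postDriverMetricUpdates(sc, executionId, metrics)
      true
    }
  }
}
```

An operator that updates a metric without running any tasks would presumably call something like this right after the update, from the thread that runs the query, so that the execution id local property is set.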

Regards,
Jacek Laskowski

https://about.me/JacekLaskowski
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Fri, Nov 17, 2017 at 2:30 AM, Shixiong(Ryan) Zhu  wrote:

> SQL metrics are collected using SparkListener. If there are no
> tasks, org.apache.spark.sql.execution.ui.SQLListener cannot collect any
> metrics.


Re: [SQL] Why no numOutputRows metric for LocalTableScanExec in webUI?

2017-11-16 Thread Shixiong(Ryan) Zhu
SQL metrics are collected using SparkListener. If there are no
tasks, org.apache.spark.sql.execution.ui.SQLListener cannot collect any
metrics.
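One way to check whether a given query runs any tasks at all is to register a small SparkListener and count scheduler events. The sketch below assumes Spark 2.2 or later and a local master; TaskCounter and TaskCounterDemo are hypothetical names, and the one-second sleep is a crude way to wait for the asynchronous listener bus.

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart, SparkListenerTaskEnd}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.length

// Hypothetical listener that counts started jobs and finished tasks.
class TaskCounter extends SparkListener {
  val jobs = new AtomicInteger(0)
  val tasks = new AtomicInteger(0)

  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    jobs.incrementAndGet()
  }

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    tasks.incrementAndGet()
  }
}

object TaskCounterDemo {
  // Runs `body` and returns (jobs started, tasks finished) seen meanwhile.
  def measure(spark: SparkSession)(body: => Unit): (Int, Int) = {
    val counter = new TaskCounter
    spark.sparkContext.addSparkListener(counter)
    try {
      body
      Thread.sleep(1000) // listener events are delivered asynchronously
    } finally {
      spark.sparkContext.removeSparkListener(counter)
    }
    (counter.jobs.get, counter.tasks.get)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("task-counter")
      .getOrCreate()
    import spark.implicits._

    val names = Seq("Jacek", "Agata").toDF("name")

    // The query from the thread that showed no metric in the web UI.
    println("names.show: " + measure(spark) { names.show() })

    // The query from the thread that did show the metric.
    println("groupBy/count: " +
      measure(spark) { names.groupBy(length($"name")).count().show() })

    spark.stop()
  }
}
```

If the first pair comes back as zero jobs and zero tasks, that matches the explanation above: with no tasks there are no task-level events from which the SQL listener could pick up metric values.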



Re: [SQL] Why no numOutputRows metric for LocalTableScanExec in webUI?

2017-11-16 Thread Jacek Laskowski
Hi,

I seem to have figured out why the metric is not in the web UI for this
query, but I wish I knew how to explain it for any metric and operator.

It seems that the numOutputRows metric is not displayed in the web UI when
a query runs no Spark jobs.

scala> val names = Seq("Jacek", "Agata").toDF("name")

// No numOutputRows metric in the web UI
scala> names.show

// This query does show the numOutputRows metric in the web UI's
// Details for Query page (SQL tab)
scala> names.groupBy(length($"name")).count.show

That behavior must be fairly generic, and I think it has nothing to do with
LocalTableScanExec in particular. Could anyone explain it in more detail?
I'd appreciate it.

Regards,
Jacek Laskowski



[SQL] Why no numOutputRows metric for LocalTableScanExec in webUI?

2017-11-15 Thread Jacek Laskowski
Hi,

I've been playing with LocalTableScanExec and noticed that it defines the
numOutputRows metric, but I couldn't find it in the diagram on the web UI's
Details for Query page in the SQL tab. Why?

scala> spark.version
res1: String = 2.3.0-SNAPSHOT

scala> val hello = udf { s: String => s"Hello $s" }
hello: org.apache.spark.sql.expressions.UserDefinedFunction =
UserDefinedFunction(<function1>,StringType,Some(List(StringType)))

scala> Seq("Jacek").toDF("name").select(hello($"name")).show
+-----------+
|  UDF(name)|
+-----------+
|Hello Jacek|
+-----------+

http://localhost:4040/SQL/execution/?id=0 shows no metrics for
LocalTableScan. Is this intended?
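To see which metrics the operators declare, independently of what the web UI renders, one can walk the executed physical plan on the driver. The sketch below assumes Spark 2.x or later with a local master; PlanMetrics and its declared method are hypothetical names, while queryExecution.executedPlan, nodeName and metrics are existing developer-facing members.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.udf

// Hypothetical helper, not part of Spark.
object PlanMetrics {
  // Returns (operator name, metric name, current driver-side value)
  // for every metric declared by every operator in the executed plan.
  def declared(df: DataFrame): Seq[(String, String, Long)] =
    df.queryExecution.executedPlan.collect { case op => op }.flatMap { op =>
      op.metrics.toSeq.map { case (name, metric) =>
        (op.nodeName, name, metric.value)
      }
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("plan-metrics")
      .getOrCreate()
    import spark.implicits._

    val hello = udf { s: String => s"Hello $s" }
    val q = Seq("Jacek").toDF("name").select(hello($"name"))

    q.show()

    // Prints one line per declared metric, e.g. for LocalTableScan.
    declared(q).foreach { case (op, name, value) =>
      println(s"$op: $name = $value")
    }

    spark.stop()
  }
}
```

The values printed here are whatever the driver-side accumulators hold at that moment, which is not necessarily what the SQL tab displays.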

Regards,
Jacek Laskowski
