[
https://issues.apache.org/jira/browse/SPARK-26135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
bjkonglu updated SPARK-26135:
-----------------------------
Environment:
h3.
was:
h3.
> Structured Streaming reporting metrics programmatically using asynchronous
> APIs can't get all queries metrics
> -------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-26135
> URL: https://issues.apache.org/jira/browse/SPARK-26135
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 2.3.1
> Environment: h3.
>
>
> Reporter: bjkonglu
> Priority: Major
>
> h3. Background
> When I use Structured Streaming to handle real-time data, I also want to
> track the streaming application's metrics, for example
> processedRowsPerSecond, inputRowsPerSecond, etc. So I report metrics
> programmatically using the asynchronous APIs.
> {code:java}
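> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.streaming.StreamingQueryListener
> import org.apache.spark.sql.streaming.StreamingQueryListener.{QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}
>
> val spark: SparkSession = ...
> // Register a listener that is called back asynchronously for every query in this session.
> spark.streams.addListener(new StreamingQueryListener() {
>   override def onQueryStarted(queryStarted: QueryStartedEvent): Unit = {
>     println("Query started: " + queryStarted.id)
>   }
>   override def onQueryTerminated(queryTerminated: QueryTerminatedEvent): Unit = {
>     println("Query terminated: " + queryTerminated.id)
>   }
>   override def onQueryProgress(queryProgress: QueryProgressEvent): Unit = {
>     println("Query made progress: " + queryProgress.progress)
>   }
> })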
> {code}
> h3. Questions
> When the streaming application has a single query, the asynchronous APIs work
> well. But when the application runs many queries, the asynchronous APIs do
> not report metrics accurately: some queries report correctly, while others
> report late and with lower metric values.
>
>
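To tell which queries report late or with lower values, it may help to record progress per query id instead of printing the whole progress object. The following is a minimal sketch, not a fix for the reported behavior. `QueryRates` and `PerQueryMetricsListener` are hypothetical names introduced here for illustration; the fields read from `event.progress` (`id`, `name`, `batchId`, `timestamp`, `inputRowsPerSecond`, `processedRowsPerSecond`) are those of `StreamingQueryProgress`.

```scala
import java.util.UUID
import scala.collection.concurrent.TrieMap

import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener.{QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}

// Hypothetical holder for the most recent rates reported by one query.
final case class QueryRates(
    name: String,
    batchId: Long,
    timestamp: String,
    inputRowsPerSecond: Double,
    processedRowsPerSecond: Double)

// Hypothetical listener that keeps the latest progress of each query, keyed by query id.
class PerQueryMetricsListener extends StreamingQueryListener {
  // Thread-safe map, since the map may be read from other threads while events arrive.
  val latest: TrieMap[UUID, QueryRates] = TrieMap.empty[UUID, QueryRates]

  override def onQueryStarted(event: QueryStartedEvent): Unit =
    println(s"Query started: id=${event.id} name=${event.name}")

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    // The query name is null when the query was started without queryName(...).
    val name = Option(p.name).getOrElse("<unnamed>")
    latest.put(p.id, QueryRates(name, p.batchId, p.timestamp, p.inputRowsPerSecond, p.processedRowsPerSecond))
    println(s"[${p.timestamp}] id=${p.id} name=$name batch=${p.batchId} " +
      s"inputRowsPerSecond=${p.inputRowsPerSecond} processedRowsPerSecond=${p.processedRowsPerSecond}")
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    latest.remove(event.id)
    println(s"Query terminated: id=${event.id}")
  }
}
```

Register it with `spark.streams.addListener(new PerQueryMetricsListener)`. As a cross-check, the values can be compared with a synchronous read such as `spark.streams.active.foreach(q => println(q.id + " " + q.lastProgress))`; note that `lastProgress` is null until the query has reported its first progress.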