[
https://issues.apache.org/jira/browse/SPARK-38234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karthik Subramanian updated SPARK-38234:
----------------------------------------
Description:
In [SPARK-31953|https://issues.apache.org/jira/browse/SPARK-31953], Structured
Streaming support was added to the history server, and a "Structured Streaming"
tab appears in the history UI when a streaming query is present. However, even
though a store exists for this data and the UI presents it, the data is not
exposed through a REST API. It could be used for monitoring, for detecting
streaming, and for building custom dashboards. The proposed monitoring API is
similar to the monitoring APIs that already exist for DStreams; see
[SPARK-18470|https://issues.apache.org/jira/browse/SPARK-18470].
In this change, we plan to add two simple APIs that expose the data in the
store and can be used to monitor streaming queries.
h3. *Summary API*
Lists a summary of all existing streaming queries.
GET {{/\{appId}/sql/streamingqueries}}
The response is a list of {_}StreamingQueryData{_}.
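As a rough illustration of how a client might call this endpoint, here is a minimal Python sketch. The host, port, and the {{/api/v1/applications}} prefix are assumptions borrowed from the existing history server REST API and are not specified in this proposal; the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the new endpoint hangs off the existing REST root of the
# history server. Host, port, and prefix are illustrative only.
BASE_URL = "http://localhost:18080/api/v1/applications"


def summary_url(app_id, base_url=BASE_URL):
    """Build the URL for GET /{appId}/sql/streamingqueries."""
    return f"{base_url}/{quote(app_id, safe='')}/sql/streamingqueries"


def parse_json_list(body):
    """Decode a response body that is expected to be a JSON array."""
    data = json.loads(body)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list")
    return data


def list_streaming_queries(app_id, base_url=BASE_URL):
    """Fetch the summary list (one StreamingQueryData entry per query)."""
    with urlopen(summary_url(app_id, base_url)) as response:
        return parse_json_list(response.read())
```

The URL builder is kept separate from the network call so that the path can be checked without a running history server.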
h3. *Progress API*
Lists the progress events of a specific streaming query, identified by
{_}runId{_}. The user can also specify how many of the most recent events to
retrieve by using the _last_ query parameter. By default, only the most recent
progress event is returned, i.e. _last_ defaults to 1.
GET {{/\{appId}/sql/streamingqueries/\{runId}/progress?last=\{N}}}
The response is a list of {_}StreamingQueryProgress{_}.
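A client-side sketch of the progress call, again in Python. As before, the host, port, and {{/api/v1/applications}} prefix are assumptions taken from the existing history server REST API, and the function names are hypothetical. Omitting _last_ leaves the choice to the server, which per the description above would return the single most recent event.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: same REST root as the existing history server API.
BASE_URL = "http://localhost:18080/api/v1/applications"


def progress_url(app_id, run_id, last=None, base_url=BASE_URL):
    """Build the URL for GET /{appId}/sql/streamingqueries/{runId}/progress."""
    if last is not None and last < 1:
        raise ValueError("last must be a positive integer")
    path = (
        f"{base_url}/{quote(app_id, safe='')}/sql/streamingqueries/"
        f"{quote(run_id, safe='')}/progress"
    )
    # Only send the query parameter when the caller asked for a specific count.
    return path if last is None else f"{path}?{urlencode({'last': last})}"


def fetch_progress(app_id, run_id, last=None, base_url=BASE_URL):
    """Fetch the progress events as decoded JSON (StreamingQueryProgress list)."""
    with urlopen(progress_url(app_id, run_id, last, base_url)) as response:
        return json.load(response)
```

For example, {{fetch_progress(app_id, run_id, last=10)}} would request the ten most recent progress events for that run.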
*Note:* We are not introducing new objects for the response, since we are just
returning the data from the store without aggregation; these are existing event
structures.
Will attach sample I/O.
was:
In [SPARK-31953 Add Spark Structured Streaming History Server Support - ASF
JIRA (apache.org)], Structured Streaming support was added to the history
server, and a "Structured Streaming" tab appears in the history UI when a
streaming query is present. However, even though a store exists for this data
and the UI presents it, the data is not exposed through a REST API. It could be
used for monitoring, for detecting streaming, and for building custom
dashboards. The proposed monitoring API is similar to the monitoring APIs that
already exist for DStreams; see [SPARK-18470 Provide Spark Streaming Monitor
Rest Api - ASF JIRA (apache.org)].
In this change, we plan to add two simple APIs that expose the data in the
store and can be used to monitor streaming queries.
h3. *Summary API*
Lists a summary of all existing streaming queries.
GET {{/\{appId}/sql/streamingqueries}}
The response is a list of {_}StreamingQueryData{_}.
h3. *Progress API*
Lists the progress events of a specific streaming query, identified by
{_}runId{_}. The user can also specify how many of the most recent events to
retrieve by using the _last_ query parameter. By default, only the most recent
progress event is returned, i.e. _last_ defaults to 1.
GET {{/\{appId}/sql/streamingqueries/\{runId}/progress?last=\{N}}}
The response is a list of {_}StreamingQueryProgress{_}.
*Note:* We are not introducing new objects for the response, since we are just
returning the data from the store without aggregation; these are existing event
structures.
Will attach sample I/O.
> Provide monitoring REST API for Structured Streaming
> ----------------------------------------------------
>
> Key: SPARK-38234
> URL: https://issues.apache.org/jira/browse/SPARK-38234
> Project: Spark
> Issue Type: Improvement
> Components: Structured Streaming
> Affects Versions: 3.3.0
> Reporter: Karthik Subramanian
> Priority: Major