Neil Ramaswamy created SPARK-47362:
--------------------------------------
Summary: Enhance the console sink to provide watermark and state
information
Key: SPARK-47362
URL: https://issues.apache.org/jira/browse/SPARK-47362
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Neil Ramaswamy
As [discussed in the dev mailing
list|https://lists.apache.org/thread/3wd23w0tldk3d8dbjznc11ybc1p3v3hh], we
should enhance the console sink for Structured Streaming to additionally
provide information about:
* The stream's watermark at the end of the batch
* The rows in state at the end of each batch
This will be enabled via an `option` on the sink. Since both of these additions
are for stateful queries only, the option will not affect stateless queries. To
make parsing the output easier, timestamps will be duration-rendered (e.g. "1
second" instead of the ISO 8601 extended timestamp). For joins, just the
KeyWithIndexToValue state store will be shown. If there are multiple stateful
operators, a state table will be printed for each.
These considerations are open to discussion, either in this thread or in the PR.
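
To make the duration rendering concrete, here is a minimal Python sketch of the idea. It assumes the timestamp arrives as non-negative milliseconds since the epoch. The function name `render_duration`, the set of units, and the output wording are all hypothetical and are not taken from Spark or from the eventual PR.

```python
# Hypothetical helper: illustrates "duration-rendered" timestamps only.
# Not part of Spark; names and formatting are assumptions for this sketch.

# Units from largest to smallest, expressed in milliseconds.
_UNITS = [
    ("day", 86_400_000),
    ("hour", 3_600_000),
    ("minute", 60_000),
    ("second", 1_000),
    ("millisecond", 1),
]


def render_duration(epoch_ms: int) -> str:
    """Render milliseconds since the epoch as a duration such as '1 second'."""
    if epoch_ms < 0:
        raise ValueError("this sketch only handles non-negative timestamps")
    if epoch_ms == 0:
        return "0 seconds"

    parts = []
    remaining = epoch_ms
    for name, size in _UNITS:
        count, remaining = divmod(remaining, size)
        if count:
            # Pluralize the unit name unless the count is exactly 1.
            parts.append(f"{count} {name}" + ("" if count == 1 else "s"))
    return " ".join(parts)


if __name__ == "__main__":
    print(render_duration(1_000))   # 1 second
    print(render_duration(61_500))  # 1 minute 1 second 500 milliseconds
```

With this rendering, a watermark of 1970-01-01T00:00:01.000Z would print as "1 second", which is easier to scan in small test queries that use timestamps near the epoch.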