Neil Ramaswamy created SPARK-47362:
--------------------------------------

             Summary: Enhance the console sink to provide watermark and state information
                 Key: SPARK-47362
                 URL: https://issues.apache.org/jira/browse/SPARK-47362
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Neil Ramaswamy


As [discussed in the dev mailing 
list|https://lists.apache.org/thread/3wd23w0tldk3d8dbjznc11ybc1p3v3hh], we 
should enhance the console sink for Structured Streaming to additionally 
provide information about:
 * The stream's watermark at the end of each batch
 * The rows in state at the end of each batch

This will be enabled via an `option` on the sink. Since both of these additions 
apply only to stateful queries, the option will not affect stateless queries. To 
make the output easier to parse, timestamps will be rendered as durations (e.g. 
"1 second" instead of the ISO 8601 extended timestamp). For joins, only the 
KeyWithIndexToValue state will be shown. If there are multiple stateful 
operators, we'll print a state table for each.
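
As a rough illustration of the duration rendering described above, here is a minimal Python sketch. It assumes the timestamp is available as milliseconds since the Unix epoch and that the rendered duration is measured from the epoch; the function name `render_duration` and the exact output format are hypothetical and are not taken from the proposal or from Spark.

```python
def render_duration(millis: int) -> str:
    """Render a non-negative millisecond count as a human-readable duration.

    Hypothetical helper: the unit names and spacing are assumptions,
    not the format the console sink will necessarily use.
    """
    if millis < 0:
        raise ValueError("millis must be non-negative")
    if millis == 0:
        return "0 seconds"

    # Units from largest to smallest, each expressed in milliseconds.
    units = [
        ("day", 86_400_000),
        ("hour", 3_600_000),
        ("minute", 60_000),
        ("second", 1_000),
        ("millisecond", 1),
    ]

    parts = []
    remaining = millis
    for name, size in units:
        count, remaining = divmod(remaining, size)
        if count:
            # Pluralize the unit name unless the count is exactly one.
            parts.append(f"{count} {name}" + ("" if count == 1 else "s"))
    return " ".join(parts)


# A watermark one second after the epoch renders as "1 second".
print(render_duration(1_000))
```

With small, hand-picked event times in a test stream, output of this shape is easier to compare by eye than full ISO 8601 timestamps.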

These considerations are open to discussion, either in this thread or in the PR.



