Dmytro Fedoriaka created SPARK-54122:
----------------------------------------
Summary: Improve the testing experience for TransformWithState
Key: SPARK-54122
URL: https://issues.apache.org/jira/browse/SPARK-54122
Project: Spark
Issue Type: New Feature
Components: Structured Streaming
Affects Versions: 4.1.0
Reporter: Dmytro Fedoriaka
Currently it is hard to write unit tests for a user's implementation of
StatefulProcessor used in TransformWithState. The user either needs to test
it by running an actual streaming query, or needs to refactor the business
logic into a separate function (called from handleInputRows) and write unit
tests for that function.
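To illustrate the second workaround, here is a minimal sketch in plain Python
with no Spark dependency. The function name next_total and the running-total
logic are hypothetical, chosen only for illustration; in a real processor,
handleInputRows would read the previous value from state, call a function
like this, and write the result back.

```python
from typing import Iterable, Optional


def next_total(previous_total: Optional[int], amounts: Iterable[int]) -> int:
    """Hypothetical business logic pulled out of handleInputRows.

    Takes the previously stored total (None if no state exists yet) and the
    amounts from the new input rows, and returns the updated total.
    """
    return (previous_total or 0) + sum(amounts)


# The extracted function can be unit tested without a streaming query.
assert next_total(None, [1, 2, 3]) == 6
assert next_total(10, [5]) == 15
```

The drawback is that only the extracted function is covered; the code that
wires it to state reads and writes inside the processor stays untested.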
I propose adding a simple testing framework for TransformWithState that will
allow users to easily test the business logic inside their StatefulProcessor.
At a high level, it is a class, TwsTester, that takes a StatefulProcessor,
lets the user feed in input rows, and immediately returns the rows that TWS
would produce. It also allows setting and inspecting state. It can be used in
unit tests without running a streaming query, and it won't need RocksDB (it
will use an in-memory state store). I will start by implementing this for
Scala users, potentially making it available to Python users later.
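The idea can be sketched as follows. This is only a conceptual illustration
in plain Python with no Spark dependency, not the proposed implementation or
its API. Every name in it (InMemoryValueState, RunningCountProcessor,
TwsTesterSketch, and their methods) is a hypothetical stand-in, and the
processor interface is deliberately simplified compared to the real
StatefulProcessor.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple


class InMemoryValueState:
    """Hypothetical per-key value state backed by a plain dict."""

    def __init__(self, store: Dict[Any, Any], current_key: Callable[[], Any]):
        self._store = store
        self._current_key = current_key

    def get(self) -> Any:
        # Returns None when the current key has no stored value.
        return self._store.get(self._current_key())

    def update(self, value: Any) -> None:
        self._store[self._current_key()] = value


class RunningCountProcessor:
    """Hypothetical user processor: emits (key, running row count)."""

    def init(self, get_value_state: Callable[[str], InMemoryValueState]) -> None:
        self._count = get_value_state("count")

    def handle_input_rows(self, key: Any, rows: Iterable[Any]) -> Iterator[Tuple[Any, int]]:
        total = (self._count.get() or 0) + sum(1 for _ in rows)
        self._count.update(total)
        yield (key, total)


class TwsTesterSketch:
    """Hypothetical tester: drives a processor against in-memory state."""

    def __init__(self, processor: Any):
        self._stores: Dict[str, Dict[Any, Any]] = {}
        self._key: Any = None
        self._processor = processor
        processor.init(self._get_value_state)

    def _get_value_state(self, name: str) -> InMemoryValueState:
        store = self._stores.setdefault(name, {})
        return InMemoryValueState(store, lambda: self._key)

    def test(self, key: Any, rows: Iterable[Any]) -> List[Any]:
        # Set the grouping key, run the processor, and collect its output.
        self._key = key
        return list(self._processor.handle_input_rows(key, rows))

    def set_value_state(self, name: str, key: Any, value: Any) -> None:
        self._stores.setdefault(name, {})[key] = value

    def peek_value_state(self, name: str, key: Any) -> Any:
        return self._stores.get(name, {}).get(key)


tester = TwsTesterSketch(RunningCountProcessor())
print(tester.test("a", ["x", "y"]))        # [('a', 2)]
print(tester.test("a", ["z"]))             # [('a', 3)]
tester.set_value_state("count", "b", 10)   # seed state directly
print(tester.test("b", ["q"]))             # [('b', 11)]
print(tester.peek_value_state("count", "a"))  # 3
```

The sketch covers the three capabilities described above: feeding rows and
getting output back immediately, setting state, and inspecting state, all
without a streaming query or RocksDB.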