Dmytro Fedoriaka created SPARK-54122:
----------------------------------------

             Summary: Improve the testing experience for TransformWithState
                 Key: SPARK-54122
                 URL: https://issues.apache.org/jira/browse/SPARK-54122
             Project: Spark
          Issue Type: New Feature
          Components: Structured Streaming
    Affects Versions: 4.1.0
            Reporter: Dmytro Fedoriaka


Currently it is hard to write unit tests for a user's implementation of
StatefulProcessor used in TransformWithState. Users either need to test it by
running an actual streaming query, or they need to refactor the business logic
into a separate function (called from handleInputRows) and write unit tests
for that function.
 
I propose adding a simple testing framework for TransformWithState that will
allow users to easily test the business logic inside their StatefulProcessor.
 
At a high level, it is a class, TwsTester, that takes a StatefulProcessor,
lets the user feed in input rows, and immediately returns the rows that TWS
would produce. It also allows setting and inspecting state. It can be used in
unit tests without running a streaming query, and it won't need RocksDB (it
will use an in-memory state store). I will start by implementing this for
Scala users, potentially making it available to Python users later.
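
To make the idea concrete, here is a rough sketch of how such a tester could
behave. It is a toy model in plain Python with no Spark dependency, not the
proposed API: ToyTwsTester, InMemoryValueState, RunningCountProcessor and all
of their method names are hypothetical and exist only for illustration. The
processor keeps a per-key running count, and the tester groups input rows by
key, calls the processor once per key, and exposes the state for inspection.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple


class InMemoryValueState:
    """Hypothetical per-key value state backed by a plain dict."""

    def __init__(self) -> None:
        self._values: Dict[Any, Any] = {}
        self.current_key: Any = None  # set by the tester before each call

    def get(self) -> Any:
        # Returns None when no value has been stored for the current key.
        return self._values.get(self.current_key)

    def update(self, value: Any) -> None:
        self._values[self.current_key] = value


class RunningCountProcessor:
    """Toy processor: emits (key, running row count) once per key."""

    def init(self, state: InMemoryValueState) -> None:
        self.count = state

    def handle_input_rows(
        self, key: Any, rows: Iterator[Any]
    ) -> Iterator[Tuple[Any, int]]:
        total = (self.count.get() or 0) + sum(1 for _ in rows)
        self.count.update(total)
        yield (key, total)


class ToyTwsTester:
    """Hypothetical harness: feeds rows to a processor without a query."""

    def __init__(self, processor: RunningCountProcessor) -> None:
        self._state = InMemoryValueState()
        self._processor = processor
        processor.init(self._state)

    def set_state(self, key: Any, value: Any) -> None:
        self._state.current_key = key
        self._state.update(value)

    def peek_state(self, key: Any) -> Any:
        self._state.current_key = key
        return self._state.get()

    def test(self, rows: Iterable[Tuple[Any, Any]]) -> List[Tuple[Any, int]]:
        # Group rows by key, preserving the order in which keys first appear.
        grouped: Dict[Any, List[Any]] = {}
        for key, value in rows:
            grouped.setdefault(key, []).append(value)
        output: List[Tuple[Any, int]] = []
        for key, values in grouped.items():
            self._state.current_key = key
            output.extend(self._processor.handle_input_rows(key, iter(values)))
        return output


# Usage in a unit test: pre-populate state, feed rows, inspect the results.
tester = ToyTwsTester(RunningCountProcessor())
tester.set_state("a", 10)
result = tester.test([("a", "x"), ("b", "y"), ("a", "z")])
state_a = tester.peek_state("a")
state_b = tester.peek_state("b")
```

With key "a" pre-seeded to 10, the two rows for "a" bring its count to 12 and
the single row for "b" yields 1, all without starting a streaming query or a
RocksDB-backed state store.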



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

