Re: [I] [Bug]: Prism not handling on_timer callbacks correctly when split happens [beam]

via GitHub Fri, 01 Aug 2025 21:34:39 -0700


lostluck commented on issue #35771:
URL: https://github.com/apache/beam/issues/35771#issuecomment-3146206565


   There's definitely a bug in Prism's split return handling. That should be 
fixed anyway.
   
   As for whether Python should be fixed, I say yes, as it makes the SDK more 
robust to correct implementations of the model, vs just what Dataflow does.
   
   WRT the model, timers for different keys can be sent to SDKs by a runner 
within the same bundle. So Prism's implementation of timer bundles is correct.
   
   In practice, Prism is the only runner that does this, as Dataflow and Flink 
(and probably the others) simply never mix keys in timer bundles. Similarly, 
they never split such bundles, since each bundle only has one element in it. 
This is also a correct implementation choice in the model (and akin to 
solutions 2 and 3 there). It's important to recognize that the 
single-element/single-key bundle approach is largely a performance-based 
choice to reduce latency across independent keys.
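
   To make the mixed-key case concrete, here is a minimal toy sketch in plain 
Python. It does not use any Beam API: `Timer`, `process_bundle`, and 
`split_index` are hypothetical names invented for illustration, and the split 
is modeled as a simple index into the bundle. It only shows the shape of the 
behavior an SDK needs to tolerate: timers for several keys arrive in one 
bundle, and a split hands the unprocessed tail back as a residual.

```python
from collections import defaultdict
from typing import NamedTuple


class Timer(NamedTuple):
    """Hypothetical stand-in for a timer element; not a Beam type."""
    key: str
    fire_at: int


def process_bundle(timers, split_index=None):
    """Fire timers in order; timers at or past split_index become the residual."""
    end = len(timers) if split_index is None else split_index
    fired = defaultdict(list)
    for timer in timers[:end]:
        # The key is read from each timer, never assumed constant for the bundle.
        fired[timer.key].append(timer.fire_at)
    return dict(fired), list(timers[end:])


# One bundle carrying timers for three different keys.
bundle = [Timer("a", 1), Timer("b", 1), Timer("a", 2), Timer("c", 3)]
fired, residual = process_bundle(bundle, split_index=2)
```

   With the split at index 2, the first two timers fire under their own keys 
and the remaining two come back as the residual, to be rescheduled rather than 
dropped or fired under the wrong key.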
   
   My view on this is that the SDK has a bug that needs to be fixed, as it's 
not robust to a correct runner implementation.
   
   Ideally, we would run the same test pipeline implemented in Java and Go, 
and see what they do. If they have the same kind of bug, we should probably 
just change Prism's behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
