nbali edited a comment on pull request #15951:
URL: https://github.com/apache/beam/pull/15951#issuecomment-1043793735


   @lukecwik 
   > That makes a lot of sense. An alternative would be to add support for stop 
read time to KafkaUnboundedReader.
   > 
   > This translation seems like it will always be brittle in that some 
feature/option won't be supported in KafkaUnboundedReader and people will 
forget to update it here and then a future person will go down this rabbit hole 
again.
   
   A PR is coming soon that detects any misuse/lost functionality of `KafkaIO.Read` 
with fail-fast behaviour at pipeline creation, and that also forces developers to 
add any newly introduced properties to that detection :)
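   The general shape of such a check (a rough sketch only — the class and 
property names here are hypothetical stand-ins, not the actual PR's code) could 
enumerate a configuration class's getters via reflection and compare them against 
an explicitly maintained set of handled properties, so that any property added 
later without updating the translation fails fast:

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

public class PropertyDrift {
  // Hypothetical stand-in for KafkaIO.Read's configuration surface.
  static class ReadConfig {
    public String getTopic() { return "t"; }
    public Long getMaxNumRecords() { return 1L; }
    public Long getStopReadTime() { return 0L; } // newly added, not yet handled
  }

  /** Returns getter names on the config class that are not in the handled set. */
  static Set<String> unhandledProperties(Class<?> cfg, Set<String> handled) {
    Set<String> unhandled = new TreeSet<>();
    for (Method m : cfg.getDeclaredMethods()) {
      if (m.getName().startsWith("get")
          && m.getParameterCount() == 0
          && !handled.contains(m.getName())) {
        unhandled.add(m.getName());
      }
    }
    return unhandled;
  }

  public static void main(String[] args) {
    // The translation layer declares which properties it knows how to handle.
    Set<String> handled = Set.of("getTopic", "getMaxNumRecords");
    Set<String> drift = unhandledProperties(ReadConfig.class, handled);
    if (!drift.isEmpty()) {
      // In a real translation step this would throw at pipeline creation.
      System.out.println("Unhandled properties: " + drift);
    }
  }
}
```

   The point of keeping the handled set explicit is exactly the rabbit-hole 
concern quoted above: a new getter cannot silently go untranslated, because the 
check fails at pipeline construction rather than at runtime.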
   
   @kennknowles 
   > @nbali it can be a bit confusing, but the `isStreaming` pipeline option is 
not actually part of the core Beam model. It is a runner-specific option. Spark 
and Dataflow have separate batch/streaming modes. The direct runner and Flink 
runner don't need this. Really "streaming" is the universal mode that works for 
everything, while batch is a special case that allows optimizations because all 
the data is bounded (so we can do more splitting up front, and don't need to 
checkpoint and pause, etc).
   
   I'm fully aware that my knowledge of runners other than the direct and the 
Dataflow runner is minimal, as those are the only ones I have used, and I doubt 
that's going to change. The logic in Spark seemed similar, but there it might 
actually be required. My issue was that with Dataflow it isn't, but let's 
continue this discussion on my PR aimed at addressing that issue: 
https://github.com/apache/beam/pull/16773
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
