je-ik commented on pull request #13592:
URL: https://github.com/apache/beam/pull/13592#issuecomment-749754120


   > I verified that the reader is reused from cache in Kafka case manually.
   
   Hm, are you sure? That confuses me, because looking at the code, I'm not 
sure how that could function with `System.identityHashCode` as hashCode for 
CheckpointMark. Provided that AutoValue delegates its hashCode as would be 
expected and that guava's Cache uses hashCode. Hm, maybe Dataflow is not using 
SplittableDoFnViaKeyedWorkItems and has some specific implementation? 
   
   > It makes me feel like configuring split frequency from PipelineOption
   
   Sure, Flink has such an option. It would be natural to either create one for 
generic use, or add that to respective runner's PipelineOptions.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to