je-ik commented on pull request #13592:
URL: https://github.com/apache/beam/pull/13592#issuecomment-749744489


   > btw I tested it by using Kafka + Dataflow streaming and I don't notice a 
performance improvement there. It might be that the cost of creating Kafka 
connection is really cheap.
   
   That is not surprising. Under the current implementation, the cache only 
helps if the CheckpointMark correctly implements equals and hashCode (which 
KafkaCheckpointMark does not). Between two successive calls to 
`processElement` the checkpoint is stored in state, so it is serialized and 
deserialized. The deserialized checkpoint is a new instance that does not match 
the cached key, so a new object is put into the cache each time. Second, even 
after we fix this, the improvement will probably be noticeable only on 
pipelines with very frequent checkpoints.
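
   To illustrate the point about equals and hashCode, here is a minimal, 
self-contained sketch. It does not use Beam: `IdentityMark`, `ValueMark`, 
`roundTrip` and `cacheHitAfterRoundTrip` are hypothetical names, a plain 
`HashMap` stands in for the reader cache, and Java serialization stands in for 
whatever coder the runner uses to store the checkpoint in state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class CheckpointCacheSketch {

  /** Hypothetical mark that inherits identity-based equals/hashCode from Object. */
  static class IdentityMark implements Serializable {
    private static final long serialVersionUID = 1L;
    final long offset;

    IdentityMark(long offset) {
      this.offset = offset;
    }
  }

  /** Hypothetical mark with value-based equals/hashCode. */
  static class ValueMark implements Serializable {
    private static final long serialVersionUID = 1L;
    final long offset;

    ValueMark(long offset) {
      this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ValueMark)) {
        return false;
      }
      return offset == ((ValueMark) o).offset;
    }

    @Override
    public int hashCode() {
      return Long.hashCode(offset);
    }
  }

  /** Stand-in for storing the mark in state: serialize, then deserialize into a new object. */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Returns true if an entry cached under the mark is found again after the round trip. */
  static boolean cacheHitAfterRoundTrip(Serializable mark) {
    Map<Object, String> readerCache = new HashMap<>();
    readerCache.put(mark, "cached reader");
    return readerCache.containsKey(roundTrip(mark));
  }

  public static void main(String[] args) {
    // Identity-based equality: the deserialized copy never matches the cached key.
    System.out.println(cacheHitAfterRoundTrip(new IdentityMark(42L)));
    // Value-based equality: the deserialized copy matches the cached key.
    System.out.println(cacheHitAfterRoundTrip(new ValueMark(42L)));
  }
}
```

   In this sketch only the mark with value-based equals and hashCode survives 
the round trip as a usable cache key.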
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

