[ 
https://issues.apache.org/jira/browse/BEAM-11403?focusedWorklogId=527335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-527335
 ]

ASF GitHub Bot logged work on BEAM-11403:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Dec/20 19:50
            Start Date: 22/Dec/20 19:50
    Worklog Time Spent: 10m 
      Work Description: je-ik commented on pull request #13592:
URL: https://github.com/apache/beam/pull/13592#issuecomment-749744489


   > btw I tested it by using Kafka + Dataflow streaming and I don't notice a 
performance improvement there. It might be that the cost of creating Kafka 
connection is really cheap.
   
   That doesn't seem to be a surprise. Under the current implementation, it is
essential for CheckpointMark to correctly implement equals and hashCode (which
KafkaCheckpointMark does not), because between two successive calls to
`processElement` the checkpoint is stored in state. It is therefore serialized
and deserialized, so a new object is put into the cache. The second point is
that, even after we fix this, the improvement will probably be noticeable only
on pipelines with very frequent checkpoints.
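
   To illustrate the equals/hashCode point, here is a minimal sketch. It is not
Beam code: `IdentityMark`, `ValueMark`, and `cacheHitAfterRoundTrip` are
hypothetical names, a plain `HashMap` stands in for the reader cache, and Java
serialization stands in for whatever coder the runner uses to store the
checkpoint in state. Under those assumptions, it shows why a key that has been
through a round trip only hits the cache when the class defines value-based
equality.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CheckpointCacheSketch {

  /** Hypothetical checkpoint that inherits identity-based equals/hashCode. */
  static class IdentityMark implements Serializable {
    final String topic;
    final int partition;
    final long offset;

    IdentityMark(String topic, int partition, long offset) {
      this.topic = topic;
      this.partition = partition;
      this.offset = offset;
    }
  }

  /** Same fields, but with value-based equals/hashCode. */
  static class ValueMark implements Serializable {
    final String topic;
    final int partition;
    final long offset;

    ValueMark(String topic, int partition, long offset) {
      this.topic = topic;
      this.partition = partition;
      this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ValueMark)) {
        return false;
      }
      ValueMark other = (ValueMark) o;
      return partition == other.partition
          && offset == other.offset
          && Objects.equals(topic, other.topic);
    }

    @Override
    public int hashCode() {
      return Objects.hash(topic, partition, offset);
    }
  }

  /** Serializes and deserializes a value, yielding a distinct instance. */
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  /** Caches a value under the mark, then looks it up with a round-tripped copy. */
  static boolean cacheHitAfterRoundTrip(Serializable mark)
      throws IOException, ClassNotFoundException {
    Map<Object, String> cache = new HashMap<>();
    cache.put(mark, "cached reader");
    return cache.containsKey(roundTrip(mark));
  }

  public static void main(String[] args) throws Exception {
    // Identity semantics: the deserialized copy never matches the cached key.
    System.out.println(cacheHitAfterRoundTrip(new IdentityMark("t", 0, 42L)));
    // Value semantics: the deserialized copy matches the cached key.
    System.out.println(cacheHitAfterRoundTrip(new ValueMark("t", 0, 42L)));
  }
}
```

   In this sketch the first lookup misses and the second hits, which mirrors
the cache behavior described above for a checkpoint mark without equals and
hashCode.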
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 527335)
    Time Spent: 1h 50m  (was: 1h 40m)

> Unbounded SDF wrapper causes performance regression on DirectRunner
> -------------------------------------------------------------------
>
>                 Key: BEAM-11403
>                 URL: https://issues.apache.org/jira/browse/BEAM-11403
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct, sdk-java-core
>            Reporter: Boyuan Zhang
>            Assignee: Boyuan Zhang
>            Priority: P2
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There is a significant performance regression when switching from 
> UnboundedSource to Unbounded SDF wrapper. So far there are 2 IOs reported:
> * Pubsub Read: 
> https://lists.apache.org/thread.html/re6b0941a8b4951293a0327ce9b25e607cafd6e45b69783f65290edee%40%3Cdev.beam.apache.org%3E
> * Kafka Read: https://the-asf.slack.com/archives/C9H0YNP3P/p1606155042346600



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
