[ 
https://issues.apache.org/jira/browse/BEAM-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429415#comment-17429415
 ] 

Beam JIRA Bot commented on BEAM-10503:
--------------------------------------

This issue is assigned but has not received an update in 30 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Document expectations around UnboundedReader advance interface
> --------------------------------------------------------------
>
>                 Key: BEAM-10503
>                 URL: https://issues.apache.org/jira/browse/BEAM-10503
>             Project: Beam
>          Issue Type: Task
>          Components: sdk-java-core
>            Reporter: Aaron Meihm
>            Assignee: David Huntsperger
>            Priority: P3
>              Labels: Clarified, P2, stale-assigned
>
> We have implemented some custom IO classes based on 
> UnboundedReader/UnboundedSource. These work as expected, but while doing this 
> I noticed a few things that didn't seem to be well documented and I'm not 
> sure if they behave as would be anticipated.
> With the direct runner, when advance returns false repeatedly it appears as 
> though direct runner will apply an increasing backoff to repeated calls to 
> advance until it returns true, at which point the backoff is reset. This 
> seems to be what I'd expect.
> However, when the same code is used with Dataflow, advance is called 
> multiple times a second for a given UnboundedSource instance, continuously 
> and with no backoff. With more than one instance/worker this can start to 
> produce additional CPU load.
> I'm unclear what the right way to handle this is. For example, should you 
> sleep in advance? I assume not, but it would be great to have 
> documentation for this interface, especially on how the various runners 
> differ here and how to implement it so that resources are used efficiently 
> when no events are available from the underlying source.
>  
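
The backoff pattern the issue describes for the direct runner (a growing delay 
between calls to advance while it returns false, reset once it returns true) 
can be sketched roughly as below. This is only an illustration of the pattern, 
not Beam's implementation: the class AdvanceBackoff, its method nextDelayMs, 
the doubling factor, and the delay parameters are all hypothetical and do not 
come from the Beam SDK.

```java
/** Hypothetical helper, not part of the Beam SDK: tracks a backoff delay between polls. */
public class AdvanceBackoff {
    private final long initialDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs = 0;

    public AdvanceBackoff(long initialDelayMs, long maxDelayMs) {
        // Clamp so the first delay never exceeds the configured maximum.
        this.initialDelayMs = Math.min(initialDelayMs, maxDelayMs);
        this.maxDelayMs = maxDelayMs;
    }

    /**
     * Records the result of one advance() call and returns how long the
     * caller could wait (in milliseconds) before polling again.
     */
    public long nextDelayMs(boolean advanced) {
        if (advanced) {
            // A record was available: reset the backoff and poll again immediately.
            currentDelayMs = 0;
            return 0;
        }
        // No record: start at the initial delay, then double up to the cap.
        currentDelayMs = (currentDelayMs == 0)
                ? initialDelayMs
                : Math.min(currentDelayMs * 2, maxDelayMs);
        return currentDelayMs;
    }
}
```

The sketch only computes a delay; whether that wait belongs in the runner or 
inside advance itself is exactly the question the issue leaves open.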



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
