[jira] [Comment Edited] (BEAM-1126) Expose UnboundedSource split backlog in number of events

Davor Bonaci (JIRA) Sun, 11 Dec 2016 13:11:38 -0800

    [ 
https://issues.apache.org/jira/browse/BEAM-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740357#comment-15740357
 ]


Davor Bonaci edited comment on BEAM-1126 at 12/11/16 9:10 PM:
--------------------------------------------------------------

Interesting perspective, thanks [~aviemzur].

I think the primary design goal of the current API was to enable dynamic 
optimizations, as opposed to monitoring scenarios. The general idea was that 
the source should provide an indication of the amount of pending work, and it 
was probably thought that the size in bytes better correlates to "work" than 
the size in terms of number of elements. Basically, it was intended that the 
consumer of the data is the runner, not the user.

That said, monitoring scenarios are possibly even more important. I think the 
idea there was that the source should publish monitoring metrics directly 
through Beam abstractions in a runner-independent way. Then, all runners would 
get this benefit, with no particular work required, in a metric that makes 
sense for that source. (However, I don't think a source can do this today -- 
but this could be a different approach for the same problem.)

Anyways, I'm sure [~dhalp...@google.com] will comment more ;-)


was (Author: davor):
Interesting perspective, thanks [~aviemzur].

I think the primary design goal of the current API was to enable dynamic 
optimizations, as opposed to monitoring scenarios. The general idea was that 
the source should provide an indication of amount of pending work, and it was 
probably thought that the size in bytes better correlates to "work" than the 
size in terms of number of elements. Basically, it was intended that the 
consumer of the data is the runner, not the user.

That said, monitoring scenarios are possibly even more important. I think the 
idea there was that the source should publish monitoring metrics directly 
thought Beam abstractions in a runner-independent way. Then, all runners would 
get this benefit, with no particular work required, in a metric that makes 
sense for that source. (However, I don't think a source can do this today -- 
but this could be a different approach for the same problem.)

Anyways, I'm sure [~dhalp...@google.com] will comment more ;-)

> Expose UnboundedSource split backlog in number of events
> --------------------------------------------------------
>
>                 Key: BEAM-1126
>                 URL: https://issues.apache.org/jira/browse/BEAM-1126
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Aviem Zur
>            Assignee: Daniel Halperin
>            Priority: Minor
>
> Today {{UnboundedSource}} exposes split backlog in bytes via 
> {{getSplitBacklogBytes()}}.
> There is value in exposing backlog in number of events as well, since this 
> number can be more human-comprehensible than bytes, via something like 
> {{getSplitBacklogEvents()}} or {{getSplitBacklogCount()}}.
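
A minimal sketch of what the description above asks for, assuming a plain in-memory queue in place of a real source. Nothing here is the actual Beam API: BacklogReporter and InMemoryReader are hypothetical stand-ins (in Beam, getSplitBacklogBytes() is defined on the source's reader), and getSplitBacklogEvents() is only one of the two names suggested in the issue. The default falls back to an "unknown" sentinel so that readers which cannot count events would not have to change.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

public class BacklogSketch {
    /** Sentinel for "backlog not known"; assumed here to mirror Beam's convention. */
    static final long BACKLOG_UNKNOWN = -1L;

    /** Hypothetical stand-in for the reader-side backlog API; not the real Beam type. */
    interface BacklogReporter {
        /** Backlog of this split in bytes (the method that exists today). */
        long getSplitBacklogBytes();

        /** Proposed addition: backlog of this split in number of events. */
        default long getSplitBacklogEvents() {
            return BACKLOG_UNKNOWN;
        }
    }

    /** Toy reader over an in-memory queue that tracks both measures. */
    static final class InMemoryReader implements BacklogReporter {
        private final Deque<byte[]> pending = new ArrayDeque<>();
        private long pendingBytes = 0L;

        /** Adds one event to the backlog. */
        void enqueue(String event) {
            byte[] encoded = event.getBytes(StandardCharsets.UTF_8);
            pending.add(encoded);
            pendingBytes += encoded.length;
        }

        /** Consumes the oldest event, or returns null if the backlog is empty. */
        String advance() {
            byte[] next = pending.poll();
            if (next == null) {
                return null;
            }
            pendingBytes -= next.length;
            return new String(next, StandardCharsets.UTF_8);
        }

        @Override
        public long getSplitBacklogBytes() {
            return pendingBytes;
        }

        @Override
        public long getSplitBacklogEvents() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        InMemoryReader reader = new InMemoryReader();
        reader.enqueue("click");
        reader.enqueue("purchase:order-42");
        System.out.println("backlog bytes:  " + reader.getSplitBacklogBytes());
        System.out.println("backlog events: " + reader.getSplitBacklogEvents());
    }
}
```

The two numbers answer different questions: bytes approximate the amount of work left for the runner, while the event count is the figure a person watching a dashboard is more likely to want.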



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
