[
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=435073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-435073
]
ASF GitHub Bot logged work on BEAM-9977:
----------------------------------------
Author: ASF GitHub Bot
Created on: 19/May/20 17:15
Start Date: 19/May/20 17:15
Worklog Time Spent: 10m
Work Description: lukecwik edited a comment on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-630955102
> > Should we be using the RangeEndEstimator when providing
> > progress/splitting for ranges not ending at `Long.MAX_VALUE`?
> > Let's say the range estimate is bad and is `MAX_VALUE - 3` but the real
> > end is `5000`; then after a split we end up with `[0, (MAX_VALUE - 3) * 0.5)`
> > and `[(MAX_VALUE - 3) * 0.5, MAX_VALUE)`. We may quickly learn that the
> > residual is empty and then lose all effective progress on the primary.
>
> I can see the benefit of using `RangeEndEstimator` for the finite range
> here. But as long as we don't change the range end to the estimated end or use
> the estimated end in `tryClaim`, we still cannot say the residual is empty.

That is true, but I was thinking it would make better splitting decisions
instead of creating a bunch of empty splits while trimming the range down. The
advantage of not using the estimator is that we don't have to invoke it, since it
could be expensive for the user and in many situations will produce a value
greater than `to`.
We can leave it out for now unless some compelling use case comes up.
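
To make the arithmetic in the quoted scenario concrete, here is a minimal, self-contained sketch of splitting a half-open offset range at a fraction of the remaining work. It is not the Beam implementation: the class `SplitSketch` and the method `splitPoint` are hypothetical names used only for illustration, and the real restriction tracker may compute its split position differently.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only; not part of Apache Beam.
class SplitSketch {

    /**
     * Returns the offset s such that the primary keeps [current, s) and the
     * residual gets [s, end), where the primary retains `fraction` of the
     * remaining offsets. BigDecimal is used so that ends close to
     * Long.MAX_VALUE neither overflow nor lose precision.
     */
    static long splitPoint(long current, long end, double fraction) {
        BigDecimal remaining =
            BigDecimal.valueOf(end).subtract(BigDecimal.valueOf(current));
        BigDecimal kept = remaining
            .multiply(BigDecimal.valueOf(fraction))
            .setScale(0, RoundingMode.FLOOR);
        return BigDecimal.valueOf(current).add(kept).longValueExact();
    }

    public static void main(String[] args) {
        long realEnd = 5000L;                  // actual end from the scenario
        long badEstimate = Long.MAX_VALUE - 3; // poor end estimate from the scenario

        long splitWithBadEstimate = splitPoint(0L, badEstimate, 0.5);
        long splitWithRealEnd = splitPoint(0L, realEnd, 0.5);

        // With the bad estimate the split lands far beyond the real end, so
        // the residual holds no real offsets and the primary keeps all the work.
        System.out.println("split (bad estimate): " + splitWithBadEstimate);
        System.out.println("residual has real offsets: " + (splitWithBadEstimate < realEnd));

        // With an accurate end the remaining work is divided evenly.
        System.out.println("split (real end): " + splitWithRealEnd);
    }
}
```

With an accurate end the split falls inside the real data. With the poor estimate it falls well past offset `5000`, which is the "empty residual" case described in the quoted comment.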
Issue Time Tracking
-------------------
Worklog Id: (was: 435073)
Time Spent: 2h 20m (was: 2h 10m)
> Build Kafka Read on top of Java SplittableDoFn
> ----------------------------------------------
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
> Issue Type: New Feature
> Components: io-java-kafka
> Reporter: Boyuan Zhang
> Assignee: Boyuan Zhang
> Priority: P2
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>