boyuanzz commented on pull request #11715: URL: https://github.com/apache/beam/pull/11715#issuecomment-630952259
> Should we be using the RangeEndEstimator when providing progress/splitting for ranges not ending at `Long.MAX_VALUE`?
>
> Let's say the range estimate is bad and is `MAX_VALUE - 3` but the real end is `5000`, then after a split we end up with `[0, (MAX_VALUE - 3) * 0.5)` and `[(MAX_VALUE - 3) * 0.5, MAX_VALUE)`. We may quickly learn that the residual is empty and then lose all effective progress on the primary.

I can see the benefit of using `RangeEndEstimator` for the finite range here. But as long as we don't change the range end to the estimated end, or use the estimated end in `tryClaim`, we still cannot say the residual is empty.
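To make the arithmetic in the quoted scenario concrete, here is a minimal Java sketch. It is not Beam's tracker code: `SplitSketch`, `Range`, `splitPoint`, and the simplified `tryClaim` are hypothetical names introduced only for illustration, and the split is assumed to land at the midpoint between the range start and the estimated end.

```java
// Illustrative sketch only; these types are hypothetical and are not Beam's API.
class SplitSketch {

    /** Half-open offset range [from, to). */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + ")";
        }
    }

    /** Midpoint between the range start and the estimated end (a split at fraction 0.5). */
    static long splitPoint(long start, long estimatedEnd) {
        return start + (estimatedEnd - start) / 2;
    }

    /**
     * Simplified claim check: it looks only at the range bounds.
     * The estimated end is deliberately not consulted here.
     */
    static boolean tryClaim(Range range, long offset) {
        return offset >= range.from && offset < range.to;
    }

    public static void main(String[] args) {
        long estimatedEnd = Long.MAX_VALUE - 3; // the bad estimate from the quoted example
        long realEnd = 5000L;                   // the real end from the quoted example

        long split = splitPoint(0L, estimatedEnd);
        Range primary = new Range(0L, split);
        Range residual = new Range(split, Long.MAX_VALUE);

        System.out.println("primary  = " + primary);
        System.out.println("residual = " + residual);

        // Every real element lies in the primary, far below the split point.
        System.out.println("real end inside primary: " + (realEnd <= split));

        // The residual still accepts claims, because the check uses the range end
        // rather than the estimate, so it cannot be declared empty up front.
        System.out.println("residual accepts a claim at split: " + tryClaim(residual, split));
    }
}
```

Under these assumptions, the residual keeps `Long.MAX_VALUE` as its end, so a claim at the split offset still succeeds even though no real element exists past `5000`.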
