Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21194
@zsxwing Thanks for dropping by. This patch fixes the rate ramp-up when
`rowsPerSecond <= rampUpTime`, a case in which the Rate Source produces no
data until `rampUpTime` has elapsed (see
[SPARK-24046](https://issues.apache.org/jira/browse/SPARK-24046)).
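To make the failure mode concrete, here is a minimal Python sketch, not the Spark source. It assumes the per-second speed increment is obtained by integer division of `rowsPerSecond` by `rampUpTime + 1`; the function and parameter names are hypothetical and chosen only for illustration.

```python
def rows_emitted_by(seconds: int, rows_per_second: int, ramp_up_time: int) -> int:
    """Cumulative rows emitted after `seconds`, using a linear ramp-up.

    Assumption (for illustration only): the per-second speed increment is
    computed with integer division, so it truncates toward zero.
    """
    step = rows_per_second // (ramp_up_time + 1)
    if seconds <= ramp_up_time:
        # Sum of the arithmetic series 0, step, 2*step, ..., seconds*step.
        return step * seconds * (seconds + 1) // 2
    # After the ramp-up, rows accumulate at the full target rate.
    ramp_part = step * ramp_up_time * (ramp_up_time + 1) // 2
    return ramp_part + (seconds - ramp_up_time) * rows_per_second


# rowsPerSecond > rampUpTime: the ramp produces data from the first second.
print([rows_emitted_by(s, 10, 4) for s in range(7)])
# [0, 2, 6, 12, 20, 30, 40]

# rowsPerSecond <= rampUpTime: the step truncates to 0, so nothing is
# emitted until the ramp-up window is over.
print([rows_emitted_by(s, 5, 10) for s in range(13)])
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 10]
```

Under that assumption, any `rowsPerSecond` no larger than `rampUpTime` yields a step of zero, which matches the "no data until `rampUpTime`" symptom described in the JIRA ticket.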
The review discussion in this PR centers on the fact that, while fixing this
issue, I introduced a new way of calculating the `rampUp` that also makes the
previously working scenario of `rowsPerSecond > rampUpTime` smoother and more
consistent (as shown in the charts above).
The original tests verified the ramp-up against hard-coded values, and the
new formula changes those values. The semantics of the 'ramp up' behavior are
preserved, but the intermediate ramp-up values are different, which is why
the expected values in the test had to change.
I believe the overall code approach is an improvement over the original and
the behavior it shows is what we would expect from the description of the 'ramp
up' feature.
What do you think? Could you review the code changes?