GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/8199
[SPARK-9966][STREAMING] Handle couple of corner cases in PIDRateEstimator
1. The rate estimator should not estimate any rate when there are no
records in the batch, as there is no data to estimate the rate from. Currently
it estimates the rate and sets it to zero, which is incorrect.
2. The rate estimator should never set the rate to zero under any
circumstances. Otherwise the system will stop receiving data, and so stop
generating useful estimates (see reason 1). The fix is to define a
parameter that sets a lower bound on the estimated rate, so that the system
always receives some data.
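The two corner cases above can be illustrated with a minimal sketch. It is
written in Python for brevity and is not the code in this patch: the class
name, parameter names, default gains, and the exact PID update are assumptions
made for illustration only.

```python
from typing import Optional


class SketchPIDRateEstimator:
    """Illustrative PID-style rate estimator (hypothetical, not Spark's code)."""

    def __init__(self, batch_interval_ms: int, proportional: float = 1.0,
                 integral: float = 0.2, derivative: float = 0.0,
                 min_rate: float = 100.0) -> None:
        # Corner case 2: the lower bound must be positive, so the
        # estimated rate can never reach zero.
        if min_rate <= 0:
            raise ValueError("min_rate must be positive")
        self.batch_interval_ms = batch_interval_ms
        self.proportional = proportional
        self.integral = integral
        self.derivative = derivative
        self.min_rate = min_rate
        self.first_run = True
        self.latest_time = -1
        self.latest_rate = -1.0
        self.latest_error = -1.0

    def compute(self, time_ms: int, num_elements: int,
                processing_delay_ms: int,
                scheduling_delay_ms: int) -> Optional[float]:
        # Corner case 1: an empty batch carries no information, so no
        # estimate is produced and the internal state is left untouched.
        if (time_ms <= self.latest_time or num_elements <= 0
                or processing_delay_ms <= 0):
            return None

        delay_since_update_s = (time_ms - self.latest_time) / 1000.0
        # Records per second actually processed in this batch.
        processing_rate = num_elements / processing_delay_ms * 1000.0
        error = self.latest_rate - processing_rate
        # Backlog implied by the scheduling delay, as a rate.
        historical_error = (scheduling_delay_ms * processing_rate
                            / self.batch_interval_ms)
        d_error = (error - self.latest_error) / delay_since_update_s

        # Corner case 2: clamp the new rate to the lower bound.
        new_rate = max(
            self.latest_rate
            - self.proportional * error
            - self.integral * historical_error
            - self.derivative * d_error,
            self.min_rate,
        )

        self.latest_time = time_ms
        if self.first_run:
            # The first non-empty batch only seeds the state.
            self.latest_rate = processing_rate
            self.latest_error = 0.0
            self.first_run = False
            return None
        self.latest_rate = new_rate
        self.latest_error = error
        return new_rate
```

In this sketch an empty batch returns None instead of a rate of zero, and a
batch with a very large scheduling delay yields min_rate rather than a zero or
negative rate, so the receiver keeps ingesting some data and later batches can
still produce estimates.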
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark SPARK-9966
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8199.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8199
----
commit 3a994dbaf5edea41d676c4215122bc634832daf4
Author: Tathagata Das <[email protected]>
Date: 2015-08-14T11:58:17Z
Added min rate and updated tests in PIDRateEstimator
----