davisp commented on a change in pull request #470: Scheduling Replicator
URL: https://github.com/apache/couchdb/pull/470#discussion_r110439034
##########
File path: src/couch_replicator/src/couch_replicator_rate_limiter.erl
##########
@@ -0,0 +1,274 @@
+% Licensed under the Apache License, Version 2.0 (the "License"); you may not
+% use this file except in compliance with the License. You may obtain a copy of
+% the License at
+%
+% http://www.apache.org/licenses/LICENSE-2.0
+%
+% Unless required by applicable law or agreed to in writing, software
+% distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+% WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+% License for the specific language governing permissions and limitations under
+% the License.
+
+
+% This module implements rate limiting based on a variation of the additive
+% increase / multiplicative decrease feedback control algorithm.
+%
+% https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease
+%
+% This is an adaptive algorithm which converges on available channel
+% capacity where each participant (client) doesn't a priori know the
+% capacity, and participants don't communicate or know about each other (so they
+% don't coordinate to divide the capacity among themselves).
+%
+% The algorithm referenced above estimates a rate, whereas the implemented
+% algorithm uses an interval (in milliseconds). It preserves the original
+% semantics, that is the failure part is multiplicative and the success part is
+% additive. The relationship between rate and interval is: rate = 1000 / interval.
+%
+% There are two main API functions:
+%
+% success(Key) -> IntervalInMilliseconds
+% failure(Key) -> IntervalInMilliseconds
+%
+% Key is any term, typically something like {Method, Url}. The result from the
+% function is the current period value. Caller then might decide to sleep for
+% that amount of time before or after each request.
+
+
+-module(couch_replicator_rate_limiter).
+-behaviour(gen_server).
+
+
+% public API
+
+-export([start_link/0]).
+-export([interval/1, max_interval/0, failure/1, success/1]).
+
+
+% gen_server callbacks
+
+-export([init/1, handle_call/3, handle_info/2, handle_cast/2,
+ code_change/3, terminate/2]).
+
+
+% Types
+
+-type key() :: any().
+-type interval() :: non_neg_integer().
+-type msec() :: non_neg_integer().
+
+
+% Definitions
+
+-define(SHARDS_N, 16).

Review comment:
Ah, I also wanted to ask why we're jumping straight to sharded ets tables. Is this something that we found to be a bottleneck, or something that we're worried will be a bottleneck? Regardless, we should probably pull this out into its own module somewhere so that the sharding logic is separate. Also, there was a neat generational ets cache I saw somewhere that could help if we're worried about deleting things.
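
For readers following along without the full patch: the module comment in the quoted hunk describes interval-based AIMD, where a failure grows the interval multiplicatively and a success shrinks it additively. A minimal sketch of that update rule is below. It is not the code from the pull request; the module name, the function names, and the constants (step, backoff factor, maximum interval) are all assumptions chosen for illustration, and it deliberately leaves out the gen_server and the ets storage that the comment above asks about.

```erlang
%% Illustrative sketch only. The module name, function names and all
%% constants are hypothetical; none are taken from
%% couch_replicator_rate_limiter.
-module(aimd_interval_sketch).
-export([on_success/1, on_failure/1, rate/1]).

-define(STEP, 20).            % msec removed from the interval on success (assumed)
-define(BACKOFF, 2).          % factor applied to the interval on failure (assumed)
-define(MAX_INTERVAL, 25000). % upper bound on the interval in msec (assumed)

%% Additive part: a success shortens the interval by a fixed step,
%% never going below zero.
-spec on_success(non_neg_integer()) -> non_neg_integer().
on_success(Interval) ->
    max(0, Interval - ?STEP).

%% Multiplicative part: a failure multiplies the interval, capped at
%% ?MAX_INTERVAL. A zero interval is seeded with ?STEP first, because
%% multiplying zero would never back off.
-spec on_failure(non_neg_integer()) -> non_neg_integer().
on_failure(0) ->
    ?STEP;
on_failure(Interval) ->
    min(?MAX_INTERVAL, Interval * ?BACKOFF).

%% Relationship stated in the module comment: rate = 1000 / interval,
%% in requests per second. Only defined for a positive interval.
-spec rate(pos_integer()) -> float().
rate(Interval) when Interval > 0 ->
    1000 / Interval.
```

With these assumed constants, `aimd_interval_sketch:on_failure(100)` returns 200 and `aimd_interval_sketch:on_success(100)` returns 80. Keeping the update rule as pure functions like this is one way the storage layer (single ets table, sharded tables, or a generational cache) could be swapped out independently, which is the separation the review comment asks for.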
