davisp opened a new pull request #3127:
URL: https://github.com/apache/couchdb/pull/3127


   ## Overview
   
   Replace the couch_rate rate limiting algorithm with couch_views_batch to 
dynamically optimize the indexer batch size during view updates.
   
   While working on optimizing couch_views I ran into couch_rate behaving 
oddly. Reading through the implementation, it was nearly impossible to predict 
how changing various parameters would affect its behavior under load. The rate 
limiting algorithms were also a bit misapplied: we don't really want a rate 
"limiter" so much as a rate "maximizer".
   
   After failing to comprehend couch_rate and realizing that it was missing 
some fairly important signals (specifically, the approximate transaction size), 
I decided to take a whack at simplifying things. The new approach is quite a 
bit simpler. It successfully indexes large views (1,000,000 doc ranges were 
tested) and behaves more predictably.
   
   
![batch_size_search](https://user-images.githubusercontent.com/19929/92163250-f6697480-edf8-11ea-9553-9da2be9f1152.png)
   
   The graph above is representative of the behavior. Batch sizes start out 
small when a view indexer starts. The batch size is then ramped up quickly to 
search for the threshold batch size, after which it is adjusted in much 
smaller increments to follow the optimal batch size.
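
A minimal sketch of the two-phase search described above, written in Python for illustration only. This is not the couch_views_batch code: the class, its field names, the default values, and the linear cost model in `simulate` are all assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BatchSizer:
    """Hypothetical two-phase batch-size search (not the couch_views_batch code)."""

    size: int = 100                # assumed initial batch size
    search_incr: int = 500         # assumed large step used while searching
    sense_incr: int = 100          # assumed small step used after the threshold is found
    max_tx_time_ms: float = 4500   # assumed transaction time budget
    max_tx_size: int = 9_000_000   # assumed transaction size budget, in bytes
    searching: bool = True

    def next_size(self, tx_time_ms: float, tx_size: int) -> int:
        """Update the batch size from the last transaction's measurements."""
        over = tx_time_ms > self.max_tx_time_ms or tx_size > self.max_tx_size
        if self.searching:
            if over:
                # Threshold crossed: step back once and switch to small steps.
                self.searching = False
                self.size = max(1, self.size - self.search_incr)
            else:
                self.size += self.search_incr
        elif over:
            self.size = max(1, self.size - self.sense_incr)
        else:
            self.size += self.sense_incr
        return self.size


def simulate(steps: int, ms_per_doc: float = 2.0, bytes_per_doc: int = 1000) -> list[int]:
    """Feed the sizer a made-up linear cost model and record each batch size used."""
    sizer = BatchSizer()
    used = []
    for _ in range(steps):
        used.append(sizer.size)
        sizer.next_size(sizer.size * ms_per_doc, sizer.size * bytes_per_doc)
    return used
```

Under this toy cost model, `simulate` ramps up in large steps until a transaction exceeds the time budget, then moves in small steps around the threshold. Using both a time and a size budget reflects the point above that transaction size is an important signal, although the exact limits here are placeholders.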
   
   The old couch_rate behavior of attempting to maintain a consistent 
transaction time limit can be matched trivially by adjusting the 
`couch_views.batch_tx_max_time` parameter.
   
   ## Testing recommendations
   
   `make check`
   
   ## Checklist
   
   - [x] Code is written and works correctly
   - [x] Changes are covered by tests
   - [x] Any new configurable parameters are documented in 
`rel/overlay/etc/default.ini`
   - [ ] A PR for documentation changes has been made in 
https://github.com/apache/couchdb-documentation
   



