Github user davisp commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-220387961
  
    For background on this, "recompaction" is just an internal name for the 
step during view compaction when we spawn an index updater to top off a 
compacted view. This has been in CouchDB since almost forever but was never 
used by Cloudant.
    
    The reason we have the view updater is that (unlike DB compactions) we 
can't resume a view compaction. So we have to make sure that when we swap in 
the compacted file, it is fully up to date and there's no need to retry 
compaction.
    
    The downside to not running the recompact step is that a view will appear 
to roll backwards when a view compaction finishes. In a cluster this isn't a 
big deal because there are two other copies that will cover while the index 
updater catches back up.
    
    The downside to running a recompact is that if the update frequency on a 
database is too high, view compaction will never converge. That makes sense 
if you think about it: any updates that happen during the initial compaction 
stage need to be re-processed by the index updater. If you have two index 
updaters running over the same data (i.e., one updating the uncompacted index, 
one trying to finish updates on the compacted index), they should be running 
at the same rate and will thus never converge.
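
    To put rough numbers on that argument, here is a minimal sketch of the 
catch-up arithmetic. It is a simplified model, not CouchDB code: the function 
name, the parameters, and the assumption that both updaters share one maximum 
indexing rate are all made up for illustration.

```python
def catch_up_time(backlog, update_rate, index_rate):
    """Seconds until the compacted index catches the live one, or None.

    Simplified model (an assumption, not CouchDB's actual logic):
    - backlog: updates that arrived during the initial compaction pass
    - update_rate: incoming document updates per second
    - index_rate: max updates per second a single index updater can process
    """
    # The live (uncompacted) index can advance no faster than updates arrive,
    # and no faster than an updater can process them.
    live_rate = min(update_rate, index_rate)
    # The compacted index's updater runs flat out while it is behind.
    gap_closing_rate = index_rate - live_rate
    if gap_closing_rate <= 0:
        # Both updaters advance at the same rate, so the gap never closes.
        return None
    return backlog / gap_closing_rate
```

    In this model, `catch_up_time(1000, 50, 150)` closes the gap at 100 
updates per second, while `catch_up_time(1000, 150, 150)` returns `None` 
because both updaters are pinned at the same rate.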
    
    To @rnewson's comment, this is the appropriate place. It's not a question 
of whether/when compaction runs, but whether we perform the index updater 
step, which we have long known falls down on large views and/or high update 
rates.
    
    For the config layout, I'd propose a combination of my and @kxepal's 
suggestions. I do like the booleans, but I think the 
`[view_compaction.recompact]` section with `$dbname` and `$dbname:$view` keys 
is more appropriate. The specific sections with a single pre-defined key seem 
like a misuse of the `section/key/value` namespace config hierarchy.
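
    As a rough illustration of that layout, the sketch below parses a sample 
config and resolves the setting for one view. Only the section name and the 
`$dbname` / `$dbname:$view` key shapes come from the proposal; the database 
and view names, the `recompact_enabled` helper, the "view key overrides 
database key" precedence, and the default of `True` are assumptions, and this 
uses Python's `configparser` rather than CouchDB's own config code.

```python
import configparser

# Hypothetical sample: "orders" and "by_date" are made-up names.
SAMPLE = """
[view_compaction.recompact]
orders = false
orders:by_date = true
"""

SECTION = "view_compaction.recompact"


def load_config(text):
    # Restrict delimiters to "=" so a ":" inside a key is not read as a separator.
    cfg = configparser.ConfigParser(delimiters=("=",))
    cfg.optionxform = str  # keep keys case-sensitive
    cfg.read_string(text)
    return cfg


def recompact_enabled(cfg, dbname, view, default=True):
    """Look up the most specific key first (assumed precedence)."""
    if not cfg.has_section(SECTION):
        return default
    for key in (f"{dbname}:{view}", dbname):
        if cfg.has_option(SECTION, key):
            return cfg.getboolean(SECTION, key)
    return default


cfg = load_config(SAMPLE)
```

    With this sample, `orders:by_date` is recompacted, every other view in 
`orders` is not, and databases without an entry fall back to the default.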


