nickva commented on issue #4448:
URL: https://github.com/apache/couchdb/issues/4448#issuecomment-1707183434

   To avoid managing a new data state directory (config option, checking for 
disk space, handling read-only or other errors), it would be a lot simpler to 
manage the checkpoints as simple _local docs in _dbs. There is precedent for 
using that exact mechanism for shard splitting job management.
   
   The general idea is a shared mechanism that traverses databases and ddocs. 
Periodically it updates its checkpoint in the 
`_dbs/_local/scanner_checkpoint` doc with the current db and ddoc (plus a job 
id, start time, initial settings, and a few other bits if needed). 
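
   As a rough illustration only, the checkpoint doc could be shaped something 
like the sketch below. It is written in Python for brevity, not as CouchDB 
code, and every field name (`job_id`, `start_time`, `settings`, `db`, `ddoc`) 
and both helper functions are assumptions, not an agreed format.

```python
import time

CHECKPOINT_ID = "_local/scanner_checkpoint"  # doc name taken from the comment above


def make_checkpoint(job_id, db, ddoc, settings, start_time=None):
    """Build a hypothetical checkpoint body; all field names are assumptions."""
    return {
        "_id": CHECKPOINT_ID,
        "job_id": job_id,
        "start_time": int(time.time()) if start_time is None else start_time,
        "settings": settings,  # initial settings captured when the job started
        "db": db,              # database being scanned when the checkpoint was taken
        "ddoc": ddoc,          # design doc being scanned when the checkpoint was taken
    }


def resume_point(checkpoint):
    """Return the (db, ddoc) pair to resume from, or (None, None) for a fresh scan."""
    if checkpoint is None:
        return (None, None)
    return (checkpoint.get("db"), checkpoint.get("ddoc"))
```

   On restart, the scanner would read this doc and skip ahead to the stored 
db and ddoc instead of starting over.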
   
   Then, as the traversal happens, each of the configured callback modules 
(`scanner_quickjs_compat_check`, `scanner_size_stats`, ...) is called with the 
db and ddoc in turn. After all the callbacks are called, the context for each 
individual module is updated. It can be a simple map (json object). Then each 
module can do its own processing: write to some database, write to disk, log a 
report, etc.
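
   A minimal sketch of that traversal loop, in Python rather than Erlang and 
purely illustrative: `scan`, `save_checkpoint`, `checkpoint_every` and the 
example `size_stats` callback are all hypothetical names, and the real 
mechanism would iterate over shards and handle errors.

```python
def scan(dbs, modules, save_checkpoint, checkpoint_every=100):
    """Traverse dbs and ddocs, calling every module callback for each pair.

    dbs: dict mapping db name -> list of ddoc ids (assumed input shape)
    modules: dict mapping module name -> callback(ctx, db, ddoc) -> new ctx
    save_checkpoint: callable(db, ddoc, contexts) that persists progress
    """
    contexts = {name: {} for name in modules}  # one plain map per module
    seen = 0
    for db in sorted(dbs):
        for ddoc in dbs[db]:
            # Call each configured module in turn and keep its updated context.
            for name, callback in modules.items():
                contexts[name] = callback(contexts[name], db, ddoc)
            seen += 1
            # Checkpoint periodically rather than after every ddoc.
            if seen % checkpoint_every == 0:
                save_checkpoint(db, ddoc, contexts)
    return contexts


def size_stats(ctx, db, ddoc):
    """Example module: count how many ddocs were seen per database."""
    counts = dict(ctx.get("ddocs_per_db", {}))
    counts[db] = counts.get(db, 0) + 1
    return {**ctx, "ddocs_per_db": counts}
```

   Keeping each context a plain map means it can be stored as JSON alongside 
the checkpoint, and each module stays free to decide what to do with its own 
results.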
   
   The configuration could look something like:
   
   ```ini
   [scanner]
   enable = true | false
   schedule_period = once | every_week | every_day | ...
   
   [scanner.$module] ...
   enable = true | false
   ... $module specific settings ...
   ```
   
   For instance:
   
   ```ini
   [scanner.quickjs_compat_check]
   enable = true | false
   sample_docs = 100
   check_reduce = true | false
   log_report_level = warning
   ```

