nickva commented on issue #4448: URL: https://github.com/apache/couchdb/issues/4448#issuecomment-1707183434
To avoid managing a new data state directory (config option, checking for disk space, handling read-only or other errors), it would be a lot simpler to manage the checkpoints as simple `_local` docs in `_dbs`. There is precedent for using that exact mechanism for shard splitting job management.

The general idea is that there is a general mechanism to traverse databases and ddocs. Periodically it updates its checkpoint in the `_dbs/_local/scanner_checkpoint` doc with the current db and ddoc (and some job id, start time, initial settings, and a few other bits if needed). Then, as the traversal happens, a call is made to each of the configured callback modules (`scanner_quickjs_compat_check`, `scanner_size_stats`, ...) with the db and ddoc in turn. After all the callbacks are called, the context for each individual module is updated. It can be a simple map (JSON object). Each module can then do its own processing: write to some database, write to disk, log a report, etc.

The configuration could look something like:

```ini
[scanner]
enable = true | false
schedule_period = once | every_week | every_day | ...

[scanner.$module]
...
enable = true | false
... $module specific settings ...
```

For instance:

```ini
[scanner.quickjs_compat_check]
enable = true | false
sample_docs = 100
check_reduce = true | false
log_report_level = warning
```
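As a rough illustration of the traversal described above, here is a minimal Python sketch (not the Erlang implementation, and not an existing CouchDB API). The names `scan`, `count_ddocs`, `save`, and `every`, and the checkpoint fields `position` and `contexts`, are all hypothetical; `save` stands in for whatever would write the `_dbs/_local/scanner_checkpoint` doc.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical callback signature: (db, ddoc, context) -> updated context.
Callback = Callable[[str, str, dict], dict]


def count_ddocs(db: str, ddoc: str, ctx: dict) -> dict:
    """Toy stand-in for a scanner module: counts the ddocs it has seen."""
    return {**ctx, "seen": ctx.get("seen", 0) + 1}


def scan(
    ddocs: Iterable[Tuple[str, str]],
    callbacks: Dict[str, Callback],
    checkpoint: dict,
    save: Callable[[dict], None],
    every: int = 2,
) -> dict:
    """Visit (db, ddoc) pairs in sorted order, resuming after the checkpoint."""
    position = checkpoint.get("position")
    last = tuple(position) if position else None
    contexts = dict(checkpoint.get("contexts", {}))
    processed = 0
    for db, ddoc in sorted(ddocs):
        if last is not None and (db, ddoc) <= last:
            continue  # already handled before the last saved checkpoint
        # Call every configured module, then record its updated context.
        for name, callback in callbacks.items():
            contexts[name] = callback(db, ddoc, contexts.get(name, {}))
        checkpoint = {
            **checkpoint,
            "position": [db, ddoc],
            "contexts": dict(contexts),
        }
        processed += 1
        if processed % every == 0:
            save(checkpoint)  # periodic checkpoint, not one write per ddoc
    save(checkpoint)  # final checkpoint at the end of the pass
    return checkpoint
```

Because the position and the per-module contexts are saved together, a run restarted from an older checkpoint repeats only the ddocs visited after it, and each module's context picks up from where that checkpoint left it.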
