nickva opened a new pull request, #5014: URL: https://github.com/apache/couchdb/pull/5014
[WIP] Everything is in place except docs and tests.

The app scans all the dbs and docs. It has a plugin system to allow gathering various things from a cluster. The first use is to scan all the JavaScript design docs and run them through the new QuickJS JavaScript engine.

Other possible uses:

- Gather total db and view sizes
- Scan for document features (docs of certain sizes, or containing certain fields and values)

The plugins are managed as individual processes by the couch_scanner_server with the start_link/1 and stop/1 functions. After a plugin runner is spawned, the only thing couch_scanner_server does is wait for it to exit.

The plugin runner process may exit normally, crash, or exit with {shutdown, {reschedule, TSec}} if it wants to be rescheduled to run again at some point in the future (next day, a week later, etc).

After the process starts, it will load and validate the plugin module. Then, it will start scanning all the dbs and docs on the local node. Shard ranges will be scanned on only one of the cluster nodes to avoid duplicating work. For instance, if there are 2 shard ranges, 0-7 and 8-f, with copies on nodes n1, n2 and n3, then 0-7 might be scanned on n1 only, and 8-f on n3.

The plugin API is the following (as OTP callback definitions):

```erlang
-callback start(ScanId :: binary(), EJson :: #{}) -> {ok, St :: term()} | skip.
-callback resume(ScanId :: binary(), EJson :: #{}) -> {ok, St :: term()} | skip.
-callback stop(St :: term()) -> {ok, EJson :: #{}}.
-callback checkpoint(St :: term()) -> {ok, EJson :: #{}}.
-callback db(St :: term(), DbName :: binary()) -> {ok | skip | stop, St1 :: term()}.
-callback ddoc(St :: term(), DbName :: binary(), #doc{}) -> {ok | stop, St1 :: term()}.
-callback shards(St :: term(), [#shard{}]) -> {[#shard{}], St1 :: term()}.
-callback db_opened(St :: term(), Db :: term()) -> {ok, St :: term()}.
-callback doc_id(St :: term(), DocId :: binary(), Db :: term()) -> {ok | skip | stop, St1 :: term()}.
-callback doc(St :: term(), Db :: term(), #doc{}) -> {ok | stop, St1 :: term()}.
-callback db_closing(St :: term(), Db :: term()) -> {ok, St1 :: term()}.
```

A simple plugin, `couch_scanner_plugin_ddoc_features`, is included as a first example implementation. It traverses the design docs on a cluster and reports when it finds features deprecated in Apache CouchDB 4.x (lists, shows, etc).

Plugin modules are enabled by `$plugin_mod = true` entries in the `[couch_scanner_plugins]` section. For example, to enable `couch_scanner_plugin_ddoc_features`:

```
[couch_scanner_plugins]
couch_scanner_plugin_ddoc_features = true
```

Plugins may configure their scheduling using the `after` and `repeat` config values. For example, to start after Unix timestamp 1711249693 and then run every 3 days:

```
[couch_scanner_plugin_ddoc_features]
after = 1711249693
repeat = 3_days
```

The default value for both `after` and `repeat` is `restart`, meaning the plugin runs once after the node starts up.
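To make the callback list above more concrete, here is a minimal sketch of a plugin that only counts what it is shown. The module name `my_scanner_plugin_counts`, the state layout and the EJson keys are hypothetical. It also assumes that `resume/2` receives the map last returned by `checkpoint/1`, and it implements every callback because the description does not say which ones are optional. The `-behaviour` attribute is left out since the name of the behaviour module is not given here.

```erlang
-module(my_scanner_plugin_counts).

% Hypothetical example plugin: counts dbs, design docs and docs.
% Not part of the PR; it only follows the callback shapes listed above.
-export([
    start/2, resume/2, stop/1, checkpoint/1,
    db/2, ddoc/3, shards/2, db_opened/2,
    doc_id/3, doc/3, db_closing/2
]).

% Fresh scan: start all counters at zero.
start(_ScanId, #{}) ->
    {ok, #{dbs => 0, ddocs => 0, docs => 0}}.

% Resumed scan: restore counters from the EJson map
% (assumed to be what checkpoint/1 returned earlier).
resume(_ScanId, #{} = EJson) ->
    St = #{
        dbs => maps:get(<<"dbs">>, EJson, 0),
        ddocs => maps:get(<<"ddocs">>, EJson, 0),
        docs => maps:get(<<"docs">>, EJson, 0)
    },
    {ok, St}.

stop(St) -> {ok, to_ejson(St)}.

checkpoint(St) -> {ok, to_ejson(St)}.

% Returning ok (rather than skip or stop) keeps the scan going.
db(#{dbs := N} = St, _DbName) -> {ok, St#{dbs := N + 1}}.

ddoc(#{ddocs := N} = St, _DbName, _DDoc) -> {ok, St#{ddocs := N + 1}}.

% Pass the shard list through unchanged.
shards(St, Shards) -> {Shards, St}.

db_opened(St, _Db) -> {ok, St}.

doc_id(St, _DocId, _Db) -> {ok, St}.

doc(#{docs := N} = St, _Db, _Doc) -> {ok, St#{docs := N + 1}}.

db_closing(St, _Db) -> {ok, St}.

% Convert the internal state to a JSON-friendly map with binary keys.
to_ejson(#{dbs := Dbs, ddocs := DDocs, docs := Docs}) ->
    #{<<"dbs">> => Dbs, <<"ddocs">> => DDocs, <<"docs">> => Docs}.
```

Following the `$plugin_mod = true` convention described above, such a module would presumably be enabled with a `my_scanner_plugin_counts = true` entry in the `[couch_scanner_plugins]` section.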
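The three ways a plugin runner can exit (normally, with a crash, or with `{shutdown, {reschedule, TSec}}`) can be illustrated with a small standalone sketch. This is not the couch_scanner_server code: the module `scanner_exit_sketch` and its functions are hypothetical, and it uses a plain `spawn_monitor/1` to stand in for however the server actually tracks its runners.

```erlang
-module(scanner_exit_sketch).
-export([run/1, classify/1]).

% Spawn a runner fun, then do nothing except wait for it to exit.
run(Fun) when is_function(Fun, 0) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            classify(Reason)
    end.

% Map the exit reason to what a scheduler would need to do next.
classify(normal) ->
    finished;
classify({shutdown, {reschedule, TSec}}) when is_integer(TSec) ->
    % The runner asked to be run again; TSec is passed through as given.
    {reschedule, TSec};
classify(Reason) ->
    {crashed, Reason}.
```

For example, `scanner_exit_sketch:run(fun() -> exit({shutdown, {reschedule, 1711249693}}) end)` returns `{reschedule, 1711249693}`, while a runner fun that simply returns yields `finished`.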