nickva commented on a change in pull request #409: RFC for CouchDB background workers
URL: https://github.com/apache/couchdb-documentation/pull/409#discussion_r281635288
##########
File path: rfcs/007-background-workers.md
##########
@@ -0,0 +1,252 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Background workers with FoundationDB backend'
+labels: rfc, discussion
+assignees: ''
+
+---
+
+[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )
+
+# Introduction
+
+This document describes a data model and behavior of CouchDB background workers.
+
+## Abstract
+
+CouchDB background workers are used for things like index building and
+replication. We present a generalized model that allows creation, running, and
+monitoring of these jobs. "Jobs" are represented generically such that both
+replication and indexing could take advantage of the same framework. The basic
+idea is that of a global job queue for each job type. New jobs are inserted
+into the jobs table and enqueued for execution.
+
+There are a number of workers that attempt to dequeue pending jobs and run
+them. "Running" is specific to each job type and would be different for
+replication and indexing. Workers are processes which execute jobs. They MAY be
+individual Erlang processes, but could also be implemented in Python, Java or
+any other environment with a FoundationDB client. The only coordination between
+workers happens via the database. Workers can start and stop at any time.
+Workers monitor each other for liveliness and in case some workers abruptly
+terminate, all the jobs of a dead worker are re-enqueued into the global
+pending queue.
+
+## Requirements Language
+
+[NOTE]: # ( Do not alter the section below. Follow its instructions. )
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+---
+
+`Job`: A unit of work, identified by a `JobId` and also having a `JobType`.
+
+`Job table`: A subspace holding the list of jobs indexed by `JobId`.
+
+`Pending queue`: A queue of jobs which are ready to run. Workers may pick jobs
+from this queue and start running them.
+
+`Active jobs`: A subspace holding the list of jobs currently run by a
+particular worker.
+
+`Worker`: A job execution unit. Workers could be individual processes or groups
+of processes running on remote nodes.
+
+`Health key` : A key that the worker periodically updates with a timestamp to
+indicate that they are "alive" and ready to process jobs.
+
+`Versionstamp`: a 12 byte, unique, monotonically (but not sequentially)
+increasing value for each committed transaction.
+
+---
+
+# Detailed Description
+
+## Data Model
+
+The main job table:
+ - `("couch_workers", "jobs", JobType, JobId) = (JobState, WorkerId, Priority, CancelReq, JobInfo, JobOps)`
+
+Pending queue:
+ - `("couch_workers", "pending", JobType, Priority, JobId) = ""`
+
+Active queue:

Review comment:
You're right. Thanks, it's just a table in this case.
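
For readers following the key layout quoted in the hunk, here is a minimal sketch of why the pending subspace behaves like a queue while the jobs subspace is a plain lookup table. It uses in-memory Python dicts and tuples instead of a FoundationDB client. The helper names `add_job` and `dequeue`, the `"pending"` and `"running"` state strings, and the assumption that lower `Priority` values are dequeued first are all illustrative and not taken from the RFC.

```python
# Hypothetical in-memory stand-in for the key layout quoted in the diff.
# Plain Python tuples take the place of FoundationDB tuple-encoded keys.

jobs = {}      # ("couch_workers", "jobs", JobType, JobId) -> job value tuple
pending = {}   # ("couch_workers", "pending", JobType, Priority, JobId) -> ""


def add_job(job_type, job_id, priority, job_info):
    """Insert a job into the jobs table and enqueue it as pending."""
    # Value layout: (JobState, WorkerId, Priority, CancelReq, JobInfo, JobOps).
    # The "pending" state string is an assumption made for this sketch.
    jobs[("couch_workers", "jobs", job_type, job_id)] = (
        "pending", None, priority, False, job_info, None)
    pending[("couch_workers", "pending", job_type, priority, job_id)] = ""


def dequeue(job_type, worker_id):
    """Remove the pending key that sorts first for job_type; None if empty."""
    prefix = ("couch_workers", "pending", job_type)
    # Sorting the keys mimics an ordered range read over the subspace:
    # entries come back ordered by Priority, then by JobId.
    candidates = sorted(k for k in pending if k[:3] == prefix)
    if not candidates:
        return None
    key = candidates[0]
    del pending[key]
    priority, job_id = key[3], key[4]
    job_key = ("couch_workers", "jobs", job_type, job_id)
    _, _, _, cancel_req, job_info, job_ops = jobs[job_key]
    # Record which worker now owns the job (state string is hypothetical).
    jobs[job_key] = ("running", worker_id, priority,
                     cancel_req, job_info, job_ops)
    return job_id


add_job("replication", "job-b", 2, {"source": "a", "target": "b"})
add_job("replication", "job-a", 1, {"source": "c", "target": "d"})
add_job("indexing", "job-c", 0, {"ddoc": "x"})

first = dequeue("replication", "worker-1")
second = dequeue("replication", "worker-1")
third = dequeue("replication", "worker-1")
```

Only the pending keys depend on sort order. The jobs entries are read directly by `(JobType, JobId)`, which is the sense in which a subspace can be "just a table" rather than a queue.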
