davisp commented on a change in pull request #409: RFC for CouchDB background workers
URL: https://github.com/apache/couchdb-documentation/pull/409#discussion_r289975799
 
 

 ##########
 File path: rfcs/007-background-jobs.md
 ##########
 @@ -0,0 +1,350 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Background jobs with FoundationDB'
+labels: rfc, discussion
+assignees: ''
+
+---
+
+[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )
+
+# Introduction
+
+This document describes a data model, implementation, and an API for running
+CouchDB background jobs with FoundationDB.
+
+## Abstract
+
+CouchDB background jobs are used for things like index building, replication
+and couch-peruser processing. We present a generalized model which allows
+creation, running, and monitoring of these jobs.
+
+The document starts with a description of the framework API in Erlang
+pseudo-code, then we show the data model, followed by the implementation
+details.
+
+## Requirements Language
+
+[NOTE]: # ( Do not alter the section below. Follow its instructions. )
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+"SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
+document are to be interpreted as described in
+[RFC 2119](https://www.rfc-editor.org/rfc/rfc2119.txt).
+
+## Terminology
+
+---
+
+`Job`: A unit of work, identified by a `JobId` and also having a `Type`.
+
+`Worker` : A language-specific execution unit that runs the job. Could be an
+Erlang process, a thread, or just a function.
+
+`Job table`: An FDB subspace holding the list of jobs.
+
+`Pending job`: A job that is waiting to run.
+
+`Pending queue` : A queue of pending jobs ordered by priority.
+
+`Running job`: A job which is currently executing. To be considered "running"
+the worker must periodically update the job's state in the global job table.
+
+`Priority`: A job's priority specifies its order in the pending queue. Priority
+can be any term that can be encoded as a key in FoundationDB's tuple layer. The
+exact value of `Priority` is job type specific. It MAY be a rough timestamp, a
+`Sequence`, a list of tags, etc.
+
+`Job re-submission` : Re-submitting a job means putting a previously running
+job back into the pending queue.
+
+`Activity monitor` : Functionality implemented by the framework which checks
+job liveness (activity). If workers don't update their status often enough,
+activity monitor will re-enqueue their jobs as pending. This ensures jobs make
+progress even if some workers terminate unexpectedly.
+
+`JobState`: Describes the current state of the job. The possible values are
+`"running"`, `"pending"`, and `"finished"`. These are the minimal number of
+states needed to describe a job's behavior with respect to this framework. Each
+job type MAY have additional, type specific states, such as `"failed"`,
+`"error"`, `"retrying"`, etc.
+
+`Sequence`: a 13 byte value formed by combining the current `Incarnation` of
+the database and the `Versionstamp` of the transaction. Sequences are
+monotonically increasing even when a database is relocated across FoundationDB
+clusters. See (RFC002) for a full explanation.
+
+---
+
+# Framework API
+
+This section describes the job creation and worker implementation APIs. It doesn't
+describe how the framework is implemented. The intended audience is CouchDB
+developers using this framework to implement background jobs for indexing,
+replication, and couch-peruser.
+
+Both the job creation and the worker implementation APIs use a `JobOpts` map to
+represent a job. It MAY also contain these top level fields:
+
+  * `"priority"` : The value of this field will contain the `Priority` value of
+    the job. `Priority` is job-type specific.
+  * `"data"`: An opaque object (map), from the framework's point of view,
+    containing job-type specific data. It MAY contain an update sequence, or an
+    error message, for example.
+  * `"cancel"` : Boolean field defaulting to `false`. If `true`, it indicates
+    that the user intends to stop a job's execution.
+  * `"resubmit"` : Boolean field defaulting to `false`. If `true`, it indicates
+    that the job should be re-submitted.
+
+### Job Creation API ###
+
+```
+add(Type, JobId, JobOpts) -> ok | {error, Error}
+```
+ - Add a job to be executed by a background worker.
+
+```
+remove(Type, JobId) -> ok | not_found
+```
+ - Remove a job. If it is running, it will be stopped, then it will be removed
+   from the job table.
+
+```
+resubmit(Type, JobId) -> ok | not_found
+```
+ - Indicates that the job should be re-submitted for execution.
+
+```
+get_job(Type, JobId) -> {ok, JobOpts, JobState}
+```
+ - Return `JobOpts` and the `JobState`. `JobState` value MAY be:
+  * `"pending"` : This job is pending.
+  * `"running"` : This job is currently running.
+  * `"finished"` : This job has finished running and is not pending.
+
+### Worker Implementation API
+
+This API is to be used when implementing workers for various job types. The general pattern
+is to call `accept()` from something like a job manager, then for each accepted
+job spawn a worker to execute it, and then resume calling `accept()` to get
+other jobs. When a job is running, the worker MUST periodically call `update()`
+to prevent the activity monitor from re-enqueueing it. When the worker decides to stop
+running a job, they MUST call `finish()` to indicate that the job has finished running.
+
+```
+accept(Type[, MaxPriority]) -> {ok, JobId, WorkerLockId} | not_found
+```
 
 Review comment:
   It's not entirely clear to me how many simultaneous calls to `accept` should be made. There's a mention of a `job manager`, but is that required or just a hint at a specific implementation? Is it possible to just spawn `N` consumers that all call `accept` independently?
   
   Also, does `accept` block? When is `not_found` returned, and what should happen when it is? Do we sleep for a bit and try again, or is it an unrecoverable error, an "I have no idea what `Type` you're talking about" sort of response?
   
   Also, same for `get_job`: it seems like it would be better to return `{ok, Job}` here to make client code more ergonomic, so we're only carrying around one variable instead of the possible 5 that `finish` takes as arguments.
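   
   To make the `N` consumers question concrete, here is a rough sketch of the pattern I have in mind. It is not taken from the RFC: the module name, `start/4`, `consumer_loop/3`, and the `AcceptFun`/`RunFun` arguments are all hypothetical, with `AcceptFun` standing in for `accept(Type)`. It also assumes that `accept` does not block and that `not_found` only means "nothing is pending right now", which is exactly what I'd like the RFC to spell out.

```erlang
-module(accept_consumers).
-export([start/4, consumer_loop/3]).

%% Spawn N independent consumers and return their pids.
%% AcceptFun is a zero-arity fun standing in for the RFC's accept(Type);
%% it is passed in because this sketch does not assume a module name.
start(N, AcceptFun, RunFun, SleepMs) ->
    [spawn(?MODULE, consumer_loop, [AcceptFun, RunFun, SleepMs])
     || _ <- lists:seq(1, N)].

%% Each consumer polls for a job, runs it, and polls again.
consumer_loop(AcceptFun, RunFun, SleepMs) ->
    case AcceptFun() of
        {ok, JobId, WorkerLockId} ->
            %% RunFun is expected to call update() and finish() itself.
            RunFun(JobId, WorkerLockId);
        not_found ->
            %% Assumption: not_found is a transient "queue is empty"
            %% answer, so back off briefly before polling again.
            timer:sleep(SleepMs)
    end,
    consumer_loop(AcceptFun, RunFun, SleepMs).
```

   If `accept` blocks instead, the `timer:sleep/1` branch goes away, and if `not_found` is an unrecoverable error, the consumer should exit rather than loop.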

