[GitHub] [couchdb-documentation] nickva commented on a change in pull request #409: RFC for CouchDB background workers

GitBox Wed, 08 May 2019 11:20:18 -0700

nickva commented on a change in pull request #409: RFC for CouchDB background 
workers
URL: 
https://github.com/apache/couchdb-documentation/pull/409#discussion_r282186289


 ##########
 File path: rfcs/007-background-workers.md
 ##########
 @@ -0,0 +1,299 @@
+---
+name: Formal RFC
+about: Submit a formal Request For Comments for consideration by the team.
+title: 'Background workers with FoundationDB backend'
+labels: rfc, discussion
+assignees: ''
+
+---
+
+[NOTE]: # ( ^^ Provide a general summary of the RFC in the title above. ^^ )
+
+# Introduction
+
+This document describes a data model and behavior of CouchDB background workers.
+
+## Abstract
+
+CouchDB background workers are used for things like index building and
+replication. We present a generalized model that allows creation, running, and
+monitoring of these jobs. "Jobs" are represented generically such that both
+replication and indexing could take advantage of the same framework. The basic
+idea is that of a global job queue for each job type. New jobs are inserted
+into the jobs table and enqueued for execution.
+
+There are a number of workers that attempt to dequeue pending jobs and run
+them. "Running" is specific to each job type and would be different for
+replication and indexing, respectively.
+
+Workers are processes which execute jobs. They MAY be individual Erlang
+processes, but could also be implemented in Python, Java or any other
+environment with a FoundationDB client. The only coordination between workers
+happens via the database. Workers can start and stop at any time. Workers
+monitor each other for liveliness and in case some workers abruptly terminate,
+all the jobs of a dead worker are re-enqueued into the global pending queue.
 
 Review comment:
   It's job type specific. Whether to re-enqueue the job to retry on error 
(with a backoff penalty) would be dependent on the type. This framework would 
only know if the job is running somewhere, is in a pending queue waiting to 
run, or has completed and will not be running anymore. 
   
   The reason for the job being in pending state could be a retry or could be 
because it is a new job. The reason for completion could be a successful 
completion or failure.
   
   I am ok with adding those extra states if they would generally work for 
indexing. They would work for replication (for reference, see 
https://docs.couchdb.org/en/stable/replication/replicator.html#replication-states).
   
   So it could be:
   
     * "pending" : waiting to run in a queue
     * "running" : running (in a worker)
     * "completed" : successfully completed
     * "failed": permanent failure; the job should not be re-inserted into the pending queue, and the user should delete it and re-create it
     * "error" (this also maps to "crashing" in the replicator case): an error has occurred, but the job is pending a retry
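
To make the proposed states concrete, here is a minimal Python sketch of the five states and one possible set of transitions between them. The names `JobState`, `ALLOWED_TRANSITIONS`, `can_transition`, and `is_terminal` are hypothetical and not part of the RFC, and the transition table is my assumption about how the states would connect (for example, that "running" can go back to "pending" when a dead worker's jobs are re-enqueued).

```python
from enum import Enum


class JobState(Enum):
    """The five job states proposed in this comment."""
    PENDING = "pending"      # waiting to run in a queue
    RUNNING = "running"      # running in a worker
    COMPLETED = "completed"  # successfully completed
    FAILED = "failed"        # permanent failure, never re-enqueued
    ERROR = "error"          # an error occurred, waiting for a retry


# Assumed transition table (illustrative only, not defined by the RFC).
ALLOWED_TRANSITIONS = {
    JobState.PENDING: {JobState.RUNNING},
    JobState.RUNNING: {
        JobState.COMPLETED,
        JobState.FAILED,
        JobState.ERROR,
        JobState.PENDING,  # e.g. the worker died and the job was re-enqueued
    },
    JobState.ERROR: {JobState.RUNNING},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}


def can_transition(src: JobState, dst: JobState) -> bool:
    """Return True if moving from src to dst is allowed by the table above."""
    return dst in ALLOWED_TRANSITIONS[src]


def is_terminal(state: JobState) -> bool:
    """A state is terminal if no transition leaves it."""
    return not ALLOWED_TRANSITIONS[state]
```

With this table, "completed" and "failed" are the only terminal states, which matches the idea that a failed job has to be deleted and re-created by the user rather than retried.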
   
   Should job retries and the backoff schedule also be part of the jobs 
framework, or should each worker type define its own policy? (For example, 
replication jobs double the backoff on each failed job start, up to a maximum 
of about 8 hours. What do indexing jobs do?)
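
As a rough illustration of the doubling policy mentioned for replication, a capped exponential backoff could be sketched in Python as below. The function name `backoff_seconds` and the 30-second base delay are assumptions made for the example; only the doubling and the cap of about 8 hours come from the replicator behavior described above.

```python
MAX_BACKOFF_SECONDS = 8 * 60 * 60  # cap of about 8 hours


def backoff_seconds(failed_starts: int,
                    base_seconds: int = 30,  # assumed base delay, illustrative
                    max_seconds: int = MAX_BACKOFF_SECONDS) -> int:
    """Return the delay before the next retry.

    The delay doubles with each consecutive failed job start and is
    capped at max_seconds. A job with no failed starts has no delay.
    """
    if failed_starts <= 0:
        return 0
    return min(base_seconds * 2 ** (failed_starts - 1), max_seconds)
```

If each worker type ends up defining its own policy, a function with this shape could be supplied per job type, with the framework only tracking the count of failed starts.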

