nickva commented on a change in pull request #1972: Introduce Shard Splitting 
To CouchDB
URL: https://github.com/apache/couchdb/pull/1972#discussion_r268806817
 
 

 ##########
 File path: src/mem3/README_reshard.md
 ##########
 @@ -0,0 +1,93 @@
+Developer Oriented Resharding Description
+=========================================
+
+This is a technical description of the resharding logic. The discussion 
focuses on job creation and life-cycle, data definitions, and the state 
transition mechanisms.
+
+
+Job Life-Cycle
+--------------
+
+Job creation happens in the `mem3_reshard_httpd` API handler module. That 
module uses `mem3_reshard_http_util` to do some validation, and eventually 
calls `mem3_reshard:start_split_job/2` on one or more nodes in the cluster, 
depending on where the new jobs should run.
+
+`mem3_reshard:start_split_job/2` is the main Erlang API entry point. After 
some more validation it creates a `#job{}` record and calls the `mem3_reshard` 
job manager. The manager will then add the job to its jobs ets table, save it 
to a `_local` document in the shards db, and, most importantly, start a new 
resharding process. That process will be supervised by the 
`mem3_reshard_job_sup` supervisor.
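+The three manager-side steps above can be sketched as follows. This is a 
minimal self-contained illustration; the helper names (`persist_job/1`, 
`start_job/1`) and the stub bodies are assumptions for the sketch, not the 
actual `mem3_reshard` API.

```erlang
-module(manager_sketch).
-export([add_job/1]).

%% Sketch of what the job manager does when it accepts a new job.
add_job(Job) ->
    Tab = ensure_table(),
    true = ets:insert(Tab, Job),   % 1. track the job in the jobs ets table
    ok = persist_job(Job),         % 2. real code saves a _local doc in the shards db
    {ok, _Pid} = start_job(Job).   % 3. real code starts a supervised job process

ensure_table() ->
    case ets:whereis(jobs) of
        undefined -> ets:new(jobs, [named_table, set, public]);
        _ -> jobs
    end.

%% Stubs standing in for persistence and supervision.
persist_job(_Job) -> ok.
start_job(_Job) -> {ok, spawn(fun() -> ok end)}.
```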
+
+Each job runs in a gen_server implemented in the `mem3_reshard_job` module. 
When splitting a shard, a job goes through a series of steps such as 
`initial_copy`, `build_indices`, `update_shard_map`, etc. Between each step it 
reports progress and checkpoints with the `mem3_reshard` manager. A checkpoint 
involves the `mem3_reshard` manager persisting that job's state to disk in a 
`_local` document in the `_dbs` db. The job then continues until it reaches the 
`completed` state, or until it fails and ends up in the `failed` state.
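+The step/checkpoint loop can be sketched like this. The state list below is 
only the illustrative subset named above (the real job has more states), and 
the job is modeled as a map rather than the actual `#job{}` record.

```erlang
-module(job_steps_sketch).
-export([run/1]).

%% Split states the sketch steps through, in order.
-define(SPLIT_STATES, [initial_copy, build_indices, update_shard_map,
                       completed]).

%% Advance the job through each split state, checkpointing between steps.
run(Job) ->
    lists:foldl(fun(State, J) ->
        J1 = J#{split_state => State},
        ok = checkpoint(J1),  % real code reports to the mem3_reshard manager
        J1
    end, Job, ?SPLIT_STATES).

%% Stub: the real manager persists job state to a _local doc in the _dbs db.
checkpoint(_Job) -> ok.
```

If a job crashes and restarts, replaying from the last persisted `split_state` 
is what makes the checkpoint recovery described below possible.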
+
+If a user stops shard splitting on the whole cluster, all running jobs will 
stop. When shard splitting is resumed, they will try to recover from their last 
checkpoint.
+
+A job can also be individually stopped or resumed. If a job is individually 
stopped, it will stay stopped even if the global shard splitting state is 
`running`. A user has to explicitly set that job to a `running` state for it to 
resume. If a node with running jobs is turned off, the jobs will resume from 
their last checkpoint when the node is rebooted.
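+The interaction between the global and per-job states reduces to a simple 
rule, sketched here as a hypothetical predicate (not a function in the actual 
codebase):

```erlang
-module(resume_sketch).
-export([should_run/2]).

%% A job runs only when the cluster-wide resharding state is `running`
%% AND the job itself has not been individually stopped.
should_run(GlobalState, JobState) ->
    GlobalState =:= running andalso JobState =:= running.
```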
+
+
+Data Definitions
+----------------
+
+This section focuses on record definitions and how data is transformed to and 
from various formats.
+
+Right below the `mem3_reshard:start_split_job/1` API level, a job is converted 
to a `#job{}` record defined in the `mem3_reshard.hrl` header file. That record 
is then used throughout most of the resharding code. The job manager 
`mem3_reshard` stores it in its jobs ets table; when a job process is spawned, 
its single argument is also a `#job{}` record. As a job process executes, it 
periodically reports state back to the `mem3_reshard` manager as an updated 
`#job{}` record.
+
+Some interesting fields from the `#job{}` record:
+
 - `id` Uniquely identifies a job in a cluster. It is derived from the source 
shard name, the node, and a version (currently = 1).
+ - `type` Currently the only type supported is `split` but `merge` or 
`rebalance` might be added in the future.
+ - `job_state` The running state of the job. Indicates if the job is 
`running`, `stopped`, `completed` or `failed`.
+ - `split_state` Once the job is running this indicates how far along it got 
in the splitting process.
+ - `source` Source shard file. If/when merge is implemented this will be a 
list.
 - `target` List of target shard files. This is currently expected to be a 
list of 2 items.
 
 Review comment:
   Thanks!
