Hi Joan,

Thank you for taking a look!

> > * `GET /_shard_splits`
>
> As a result I'm concerned: would we then have duplicate endpoints
> for /_shard_merges? Or would a unified /_reshard endpoint make
> more sense here?
>

Good idea. Let's go with _reshard; it's more general and allows for adding shard merging later.

> I presume that if you've disabled shard manipulation on the
> cluster, the status changes to "disabled" and the value is the
> reason provided by the operator?
>

Currently it's PUT /_reshard/state with body {"state": "running"|"stopped", "reason": ...}. This will be shown at the top level of the GET /_reshard/ response.

> > Get a summary of shard splitting for the whole cluster.
>
> What happens if every node in the cluster is restarted while a shard
> split operation is occurring? Is the job persisted somewhere, i.e. in
> special docs in _dbs, or would this kill the entire operation? I'm
> considering rolling cluster upgrades here.
>

The job checkpoints as it goes through its various steps, and each checkpoint is saved in a _local document in the shard dbs. So if a node is restarted, the job will resume from the last checkpoint it reached.

> > * `PUT /_shard_splits`
>
> Same comment as above about whether this is /_shard_splits or something
> that could expand to shard merging in the future as well.
>
> If you persist the state of the shard splitting operation when disabling,
> this could be used as a prerequisite to a rolling cluster upgrade
> (i.e., an important documentation update).
>

I think after discussing with other participants this became PUT /_shard_splits/state (now PUT /_reshard/state). The disabled state is also persisted on a per-node basis. An interesting thing to think about is that if a node is down when shard splitting is stopped or started, it won't find out about it. So I think we might have to do some kind of querying of neighboring nodes to detect whether a node that just joined has missed a recent change to the global state.
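
As a rough illustration of the checkpoint-and-resume behavior described in the reply about node restarts, here is a minimal Python sketch. It is not the actual implementation: the step names, the `LocalDocStore` class (an in-memory stand-in for a shard's _local documents), the `_local/<job_id>` document id, and the `crash_after` parameter are all hypothetical and exist only to show the idea.

```python
import json

# Hypothetical step names; the real job's steps are not listed in this thread.
STEPS = ["step_1", "step_2", "step_3"]


class LocalDocStore:
    """In-memory stand-in for a shard's _local documents (an assumption)."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        raw = self._docs.get(doc_id)
        return json.loads(raw) if raw is not None else None

    def put(self, doc_id, doc):
        self._docs[doc_id] = json.dumps(doc)


def run_job(store, job_id, steps, log, crash_after=None):
    """Run steps in order, saving a checkpoint after each completed step.

    `log` collects the names of the steps that actually ran.
    `crash_after` simulates a node restart after that many steps
    have run in this invocation.
    """
    doc_id = "_local/" + job_id  # hypothetical checkpoint document id
    checkpoint = store.get(doc_id) or {"completed": 0}
    ran_now = 0
    for index in range(checkpoint["completed"], len(steps)):
        if crash_after is not None and ran_now == crash_after:
            raise RuntimeError("simulated node restart")
        log.append(steps[index])  # the real work for this step would go here
        store.put(doc_id, {"completed": index + 1})
        ran_now += 1
    return log


store = LocalDocStore()
log = []
try:
    run_job(store, "job1", STEPS, log, crash_after=2)  # dies before step_3
except RuntimeError:
    pass
run_job(store, "job1", STEPS, log)  # resumes at step_3, skips finished steps
```

The second call reads the saved checkpoint and runs only the remaining step, so no completed step is repeated after the simulated restart.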

> > * `POST /_shard_splits/jobs`
> >
> > Start a shard splitting job.
> >
> > Request body:
> >
> >     {
> >         "node": "dbc...@db1.sandbox001.cloudant.net",
> >         "shard": "shards/00000000-FFFFFFFF/username/dbname.$timestamp"
> >     }
> >
>
> 1. Agree with earlier comments that having to specify this per-node is
> a nice to have, but really an end user wants to specify a *database*,
> and have the API create the q*n jobs needed. It would then return an
> array of jobs in the format you describe.
>

Ok, I think that's doable if we switch the response to be an array of job_ids. Then we might also have to think about various failure modes, such as what happens if one of the nodes where a copy lives is not up. Should that be a failure, or do we continue splitting just 2 copies?

> 2. Same comment as above; why not add a new field for "type":"split" or
> "merge" to make this expandable in the future?
>

That makes sense, I can add a type field if we have _reshard as the top level endpoint.

> -Joan
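
To make the database-level request discussed above more concrete, here is a small Python sketch of expanding one request into q*n per-copy jobs, each carrying a "type" field, and returning their ids. Everything in it is an assumption for illustration: the `plan_jobs` function, the job id format, the shard and node names, and the `require_all` flag, which models the open question of whether a down node should fail the request or be skipped.

```python
def plan_jobs(shard_copies, job_type="split", live_nodes=None, require_all=True):
    """Expand a database-level request into one job per shard copy.

    shard_copies: dict mapping a shard name to the nodes holding a copy.
    live_nodes:   set of nodes known to be up, or None to skip the check.
    require_all:  if True, a down node fails the whole request; if False,
                  jobs are planned only for the copies that are reachable.
    """
    # "merge" is only a possible future type mentioned in the thread.
    if job_type not in ("split", "merge"):
        raise ValueError(f"unknown job type: {job_type}")
    jobs = []
    for shard in sorted(shard_copies):
        for node in shard_copies[shard]:
            if live_nodes is not None and node not in live_nodes:
                if require_all:
                    raise RuntimeError(f"{node} is down, cannot plan {shard}")
                continue  # lenient policy: carry on with the remaining copies
            jobs.append({
                "id": f"job-{len(jobs) + 1:03d}",  # hypothetical id format
                "type": job_type,
                "node": node,
                "shard": shard,
            })
    return jobs


# Hypothetical database with q=2 shard ranges and n=3 copies of each.
EXAMPLE = {
    "shards/00000000-7fffffff/db.1": ["n1", "n2", "n3"],
    "shards/80000000-ffffffff/db.1": ["n1", "n2", "n3"],
}
job_ids = [job["id"] for job in plan_jobs(EXAMPLE)]
```

With all nodes up this yields six job ids. Passing `live_nodes={"n1", "n2"}` with `require_all=False` plans only the four jobs for reachable copies, while the default strict mode raises instead.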