Hi Nick, sorry for the late reply here. I'm very eager to see this
land and set the stage for the same thing to occur in the reverse
direction. Everything below is written with that future potential in
mind, along with a couple of operational considerations.

> From: "Nick Vatamaniuc" <vatam...@apache.org>


> * `GET /_shard_splits`

This makes me wonder: would we then end up with a duplicate set of
endpoints under /_shard_merges? Or would a unified /_reshard endpoint
make more sense here?
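
To make the question concrete, here's a sketch of what job bodies under a
unified endpoint might look like. The "type" field, the "shards" array for
merges, and all the node/shard names are my assumptions, not part of your
proposal:

```python
# Hypothetical request bodies for a unified /_reshard/jobs endpoint.
# "type" selects the operation; a split targets one shard, while a
# merge (assumed shape) would name the set of shards to combine.
split_job = {
    "type": "split",
    "node": "node1@db1.example.com",
    "shard": "shards/00000000-7FFFFFFF/username/dbname.1551893552",
}

merge_job = {
    "type": "merge",
    "node": "node1@db1.example.com",
    "shards": [
        "shards/00000000-7FFFFFFF/username/dbname.1551893552",
        "shards/80000000-FFFFFFFF/username/dbname.1551893552",
    ],
}
```

The point being that the split-only body is forward compatible with this
shape if the "type" field is there from day one.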

I presume that if you've disabled shard manipulation on the
cluster, the status changes to "disabled" and the value is the
reason provided by the operator?

> Get a summary of shard splitting for the whole cluster.

What happens if every node in the cluster is restarted while a shard
split operation is occurring? Is the job persisted somewhere, e.g. in
special docs in _dbs, or would this kill the entire operation? I'm
considering rolling cluster upgrades here.


> * `PUT /_shard_splits`

Same comment as above about whether this is /_shard_splits or something
that could expand to shard merging in the future as well.

If you persist the state of the shard-splitting operation when disabling,
then disabling could serve as a prerequisite step for a rolling cluster
upgrade (which would be an important documentation update).
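
As a sketch of the pre-upgrade check this would enable; the status shape
here ("state" plus a running-job count) is my assumption about what GET
/_shard_splits might return, not a confirmed response format:

```python
# Hypothetical pre-upgrade gate: only proceed with a rolling upgrade
# once resharding is disabled and no split jobs remain in flight.
def safe_for_rolling_upgrade(status):
    """Return True iff resharding is disabled and nothing is running."""
    return (
        status.get("state") == "disabled"
        and status.get("running", 0) == 0
    )

# Assumed response shape from GET /_shard_splits:
status = {"state": "disabled", "reason": "upgrade window", "running": 0}
```

A tool driving a rolling upgrade could poll until this returns True
before restarting the first node.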


> * `POST /_shard_splits/jobs`
> 
> Start a shard splitting job.
> 
> Request body:
> 
> {
>     "node": "dbc...@db1.sandbox001.cloudant.net",
>     "shard": "shards/00000000-FFFFFFFF/username/dbname.$timestamp"
> }


1. I agree with the earlier comments that being able to specify this
per-node is nice to have, but really an end user wants to specify a
*database* and have the API create the q*n jobs needed. It would then
return an array of jobs in the format you describe.

2. Same comment as above; why not add a new field for "type":"split" or
"merge" to make this expandable in the future?
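
Combining points 1 and 2, the server-side expansion could be as simple as
the sketch below. This is not the proposed API; the function name, the
"type" field, and the range/node values are all hypothetical, and the
shard-path format follows your example:

```python
# Sketch: expand one database-level request into the q*n per-shard
# split jobs, one per (shard range, node) pair.
def expand_db_to_jobs(dbname, shard_ranges, nodes, timestamp):
    """Return one split-job body per (shard range, node) pair."""
    return [
        {
            "type": "split",  # assumed field, leaving room for "merge"
            "node": node,
            "shard": "shards/%s/%s.%s" % (rng, dbname, timestamp),
        }
        for rng in shard_ranges
        for node in nodes
    ]

jobs = expand_db_to_jobs(
    "username/dbname",
    ["00000000-7FFFFFFF", "80000000-FFFFFFFF"],          # q = 2
    ["node1@db1.example.com", "node2@db2.example.com"],  # n = 2
    "1551893552",
)
# q * n = 4 job bodies, in the same format as a single-shard request
```

The response to the database-level POST would then just be this array.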

-Joan
