Thank you, Jan, and everyone else who took a look!

On Mon, Mar 25, 2019 at 5:56 AM Jan Lehnardt <j...@apache.org> wrote:
> Hi Nick,
>
> despite https://twitter.com/janl/status/1108741392975777795, I left one
> typo comment ;)
>
> Great job everyone!
>
> Jan
> —
>
> > On 21. Mar 2019, at 15:44, Jan Lehnardt <j...@apache.org> wrote:
> >
> > Hi Nick,
> >
> > On first glance, this looks all great and like an exemplary PR that is
> > easy to follow. And bonus props for the nice docs. I'll have more time
> > for a thorough review over the weekend.
> >
> > Cheers
> > Jan
> > —
> >
> >> On 18. Mar 2019, at 19:42, Nick Vatamaniuc <vatam...@gmail.com> wrote:
> >>
> >> Hello everyone,
> >>
> >> Thank you all (Joan, Jan, Mike, Ilya, Adam) who contributed to the API
> >> discussion. There is now a PR open:
> >> https://github.com/apache/couchdb/pull/1972 . If you get a chance, I
> >> would appreciate any reviews, feedback or comments.
> >>
> >> The PR message explains how the commits are organized and references
> >> the RFC. Basically it starts with preparatory work, ensuring all the
> >> existing components know how to deal with split shards. Then, some
> >> lower level bits are implemented, like bulk copy, internal replicator
> >> updates, etc., followed by the individual job implementation and the
> >> job manager which stitches everything together. At the end is the HTTP
> >> API implementation along with a suite of unit and Elixir integration
> >> tests.
> >>
> >> There is also a README_reshard.md file in src/mem3 that tries to
> >> provide a more in-depth technical description of how everything fits
> >> together:
> >> https://github.com/apache/couchdb/pull/1972/files#diff-5ac7b51ec4e03e068bf271f34ecf88df
> >> (notice this URL might change after a rebase).
> >>
> >> Also special thanks to Paul (job module implementation, get_ring
> >> function, a lot of architectural and implementation advice), Eric
> >> (finding many bugs, fixes for the bugs, and writing bulk copy and
> >> change feed tests), and Jay (testing and a thorough code review).
> >>
> >> Cheers,
> >> -Nick
> >>
> >>> On Sun, Feb 17, 2019 at 2:32 AM Jan Lehnardt <m...@jan.io> wrote:
> >>>
> >>> Heya Nick,
> >>>
> >>> Nicely done. I think even though the majority of the discussion had
> >>> already happened here, the RFC nicely pulled together the various
> >>> discussion threads into a coherent whole.
> >>>
> >>> I would imagine the discussion on GH would be similarly fruitful.
> >>>
> >>> I gave it my +1, and as I said at the outset: I'm very excited about
> >>> this feature!
> >>>
> >>> Best
> >>> Jan
> >>> —
> >>>
> >>>> On 15. Feb 2019, at 23:45, Nick Vatamaniuc <vatam...@gmail.com> wrote:
> >>>>
> >>>> Decided to kick the tires on the new RFC proposal issue type and
> >>>> created one for shard splitting:
> >>>>
> >>>> https://github.com/apache/couchdb/issues/1920
> >>>>
> >>>> Let's see how it goes. Since it's the first one, let me know if I
> >>>> missed anything obvious.
> >>>>
> >>>> Also I'd like to thank everyone who contributed to the discussion.
> >>>> The API is looking more solid and is much improved from where it
> >>>> started.
> >>>>
> >>>> Cheers,
> >>>> -Nick
> >>>>
> >>>>> On Wed, Feb 13, 2019 at 12:03 PM Nick Vatamaniuc <vatam...@gmail.com> wrote:
> >>>>>
> >>>>>> On Wed, Feb 13, 2019 at 11:52 AM Jan Lehnardt <j...@apache.org> wrote:
> >>>>>>
> >>>>>>> On 13. Feb 2019, at 17:12, Nick Vatamaniuc <vatam...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi Jan,
> >>>>>>>
> >>>>>>> Thanks for taking a look!
> >>>>>>>
> >>>>>>>> On Wed, Feb 13, 2019 at 6:28 AM Jan Lehnardt <j...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>> Nick, this is great, I have a few tiny nits left, apologies I only
> >>>>>>>> now got to it.
> >>>>>>>>
> >>>>>>>>> On 12. Feb 2019, at 18:08, Nick Vatamaniuc <vatam...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Shard Splitting API Proposal
> >>>>>>>>>
> >>>>>>>>> I'd like to thank everyone who contributed to the API discussion.
> >>>>>>>>> As a result we have a much better and more consistent API than
> >>>>>>>>> what we started with.
> >>>>>>>>>
> >>>>>>>>> Before continuing I wanted to summarize to see what we ended up
> >>>>>>>>> with. The main changes since the initial proposal were switching
> >>>>>>>>> to using /_reshard as the main endpoint and having a detailed
> >>>>>>>>> state transition history for jobs.
> >>>>>>>>>
> >>>>>>>>> * GET /_reshard
> >>>>>>>>>
> >>>>>>>>> Top level summary. Besides the new _reshard endpoint, there is
> >>>>>>>>> now a `reason` and the stats are more detailed.
> >>>>>>>>>
> >>>>>>>>> Returns
> >>>>>>>>>
> >>>>>>>>> {
> >>>>>>>>>     "completed": 3,
> >>>>>>>>>     "failed": 4,
> >>>>>>>>>     "running": 0,
> >>>>>>>>>     "state": "stopped",
> >>>>>>>>>     "state_reason": "Manual rebalancing",
> >>>>>>>>>     "stopped": 0,
> >>>>>>>>>     "total": 7
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> * PUT /_reshard/state
> >>>>>>>>>
> >>>>>>>>> Start or stop global rebalancing.
> >>>>>>>>>
> >>>>>>>>> Body
> >>>>>>>>>
> >>>>>>>>> {
> >>>>>>>>>     "state": "stopped",
> >>>>>>>>>     "reason": "Manual rebalancing"
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> Returns
> >>>>>>>>>
> >>>>>>>>> {
> >>>>>>>>>     "ok": true
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> * GET /_reshard/state
> >>>>>>>>>
> >>>>>>>>> Return global resharding state and reason.
> >>>>>>>>>
> >>>>>>>>> {
> >>>>>>>>>     "reason": "Manual rebalancing",
> >>>>>>>>>     "state": "stopped"
> >>>>>>>>> }
> >>>>>>>>
> >>>>>>>> More a note than a change request, but `state` is a very generic
> >>>>>>>> term that often confuses folks when they are new to something. If
> >>>>>>>> the set of possible states is `started` and `stopped`, how about
> >>>>>>>> making this endpoint a boolean?
> >>>>>>>>
> >>>>>>>> /_reshard/enabled
> >>>>>>>>
> >>>>>>>> {
> >>>>>>>>     "enabled": true|false,
> >>>>>>>>     "reason": "Manual rebalancing"
> >>>>>>>> }
> >>>>>>>>
> >>>>>>> I thought of that as well. However, _reshard/state seemed consistent
> >>>>>>> with _reshard/jobs/$jobid/state. Setting "state":"stopped" on
> >>>>>>> _reshard/state will lead to all individual running jobs' states
> >>>>>>> becoming "stopped" as well. And "running" will make jobs that are
> >>>>>>> not individually stopped become "running" again. In other words,
> >>>>>>> since it directly toggles the jobs' state (with a job's own stopped
> >>>>>>> state being able to override it), I like that it had the same
> >>>>>>> arguments.
> >>>>>>
> >>>>>> Got it, makes perfect sense.
> >>>>>>
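For illustration, here is roughly how the cluster-level toggle Nick describes would interact with an individual job under the proposed endpoints. This is only a sketch based on the proposal above; the $jobid placeholder and the exact response bodies are illustrative, not confirmed output.

    PUT /_reshard/state
    {"state": "stopped", "reason": "Manual rebalancing"}
    Returns {"ok": true}

    GET /_reshard/jobs/$jobid/state
    Returns {"state": "stopped", "reason": "Manual rebalancing"}

    PUT /_reshard/state
    {"state": "running"}
    Returns {"ok": true}

    GET /_reshard/jobs/$jobid/state
    Returns {"state": "running"}
    (a job that was stopped individually via its own state endpoint would stay "stopped")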
> >>>>>>>>> * GET /_reshard/jobs
> >>>>>>>>>
> >>>>>>>>> Get the state of all the resharding jobs on the cluster. Now we
> >>>>>>>>> have a detailed state transition history which looks similar to
> >>>>>>>>> what _scheduler/jobs has.
> >>>>>>>>>
> >>>>>>>>> {
> >>>>>>>>>     "jobs": [
> >>>>>>>>>         {
> >>>>>>>>>             "history": [
> >>>>>>>>>                 {
> >>>>>>>>>                     "detail": null,
> >>>>>>>>>                     "timestamp": "2019-02-06T22:28:06Z",
> >>>>>>>>>                     "type": "new"
> >>>>>>>>>                 },
> >>>>>>>>>                 ...
> >>>>>>>>>                 {
> >>>>>>>>>                     "detail": null,
> >>>>>>>>>                     "timestamp": "2019-02-06T22:28:10Z",
> >>>>>>>>>                     "type": "completed"
> >>>>>>>>>                 }
> >>>>>>>>>             ],
> >>>>>>>>>             "id": "001-0a308ef9f7bd24bd4887d6e619682a6d3bb3d0fd94625866c5216ec1167b4e23",
> >>>>>>>>>             "job_state": "completed",
> >>>>>>>>>             "node": "node1@127.0.0.1",
> >>>>>>>>>             "source": "shards/00000000-ffffffff/db1.1549492084",
> >>>>>>>>>             "split_state": "completed",
> >>>>>>>>>             "start_time": "2019-02-06T22:28:06Z",
> >>>>>>>>>             "state_info": {},
> >>>>>>>>>             "targets": [
> >>>>>>>>>                 "shards/00000000-7fffffff/db1.1549492084",
> >>>>>>>>>                 "shards/80000000-ffffffff/db1.1549492084"
> >>>>>>>>>             ],
> >>>>>>>>
> >>>>>>>> Since we went from /_split to /_reshard to prepare for merging
> >>>>>>>> shards, we should reconsider source (singular) and targets
> >>>>>>>> (plural). Either a merge job (in the future) uses sources (plural)
> >>>>>>>> and target (singular) and the job schema is intentionally
> >>>>>>>> different, or we unify things to, maybe, singular source/target,
> >>>>>>>> which would have the nice property of being analogous to our
> >>>>>>>> replication job schema. The type definition then is source:String
> >>>>>>>> and target:Array(2) for split jobs and source:Array(2)
> >>>>>>>> target:String for (future) merge jobs.
> >>>>>>>>
> >>>>>>> Joan suggested adding a "type" field to both the job creation POST
> >>>>>>> body and the response when we inspect the job(s) state. So
> >>>>>>> "type":"split" would toggle the schema. It could be "merge" in the
> >>>>>>> future, or even something like "rebalance" where it would merge
> >>>>>>> some and split others perhaps :-) and since we have a type it would
> >>>>>>> be easier to differentiate between the merge and split jobs. But if
> >>>>>>> there is a consensus from others about switching targets to target,
> >>>>>>> that's easy to do as well.
> >>>>>>
> >>>>>> Ah, I'm less concerned here about not being able to tell whether
> >>>>>> it's a split or a merge, and more about having an indiscriminate
> >>>>>> plural form (sourceS/targetS) depending on the type. It's just an
> >>>>>> easy thing to get wrong.
> >>>>>>
> >>>>>> In addition, we already have source/target in CouchDB replication,
> >>>>>> which people already use successfully, so making a similar thing
> >>>>>> that behaves slightly differently doesn't sit quite right with me.
> >>>>>>
> >>>>>> I understand that I'm arguing to remove an 's' for very nitpicky
> >>>>>> reasons, but these are the kind of nitpick discussions we've done a
> >>>>>> lot of in the early days, which resulted in a by and large decent
> >>>>>> API that has served us well, and it's something I'd like to see
> >>>>>> taken forward. Apologies if this all sounds very strict ;)
> >>>>>>
> >>>>> Thanks for the longer explanation. I understand now and agree, let's
> >>>>> make it target. No worries about sounding nitpicky; we should be
> >>>>> nitpicky about APIs!
> >>>>>
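To make the agreed convention concrete, here is a sketch of the two job shapes with a "type" field and singular source/target, following Jan's type definitions above. The merge variant is hypothetical, since merge jobs do not exist yet, and fields other than type, source and target are omitted for brevity:

    Split job (source:String, target:Array(2)):

    {
        "type": "split",
        "source": "shards/00000000-ffffffff/db1.1549492084",
        "target": [
            "shards/00000000-7fffffff/db1.1549492084",
            "shards/80000000-ffffffff/db1.1549492084"
        ]
    }

    A future merge job would flip the shapes (source:Array(2), target:String):

    {
        "type": "merge",
        "source": [
            "shards/00000000-7fffffff/db1.1549492084",
            "shards/80000000-ffffffff/db1.1549492084"
        ],
        "target": "shards/00000000-ffffffff/db1.1549492084"
    }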
> >>>>>>>> And just another question, sorry if I missed this elsewhere:
> >>>>>>>> would we ever consider adding a split/merge ratio different from
> >>>>>>>> 1:2, say 1:4, or will folks have to run 1:2, 1:2, 1:2 to get to
> >>>>>>>> the same result? I'm fine with either, and if 1:2 fixed makes
> >>>>>>>> things simpler, I'm all for it ;)
> >>>>>>>>
> >>>>>>> Good point. It's actually already implemented that way :-) Right
> >>>>>>> below the API surface it has a split=2 parameter and it just
> >>>>>>> creates the targets based on that. It could be 2, 3, 4, ... 10,
> >>>>>>> etc. However, I was thinking of keeping it hard-coded at 2 to keep
> >>>>>>> the behavior simpler at first, and opening that parameter up to be
> >>>>>>> user facing in a later release based on user feedback.
> >>>>>>
> >>>>>> Ace, again, fully on board with shipping 1:2 first and maybe
> >>>>>> offering other options later.
> >>>>>>
> >>>>>> Best
> >>>>>> Jan
> >>>>>> —
> >>>>>>
> >>>>>>> Cheers,
> >>>>>>> -Nick
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
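As a concrete illustration of the split factor discussed above: with the internal split=2 parameter, the source shard from the earlier example is divided into two equal hash ranges, and a hypothetical split=4 (not user facing in the first release) would divide it into four. The split=4 ranges below are simply the quarter points of the 32-bit hash range and are shown for illustration only:

    split=2:  shards/00000000-ffffffff/db1.1549492084
              -> shards/00000000-7fffffff/db1.1549492084
              -> shards/80000000-ffffffff/db1.1549492084

    split=4:  shards/00000000-ffffffff/db1.1549492084
              -> shards/00000000-3fffffff/db1.1549492084
              -> shards/40000000-7fffffff/db1.1549492084
              -> shards/80000000-bfffffff/db1.1549492084
              -> shards/c0000000-ffffffff/db1.1549492084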