Hi Nick, despite https://twitter.com/janl/status/1108741392975777795 left one typo comment ;)
Great job everyone! Jan — > On 21. Mar 2019, at 15:44, Jan Lehnardt <j...@apache.org> wrote: > > Hi Nick, > > On first glance, this looks all great and like an exemplary PR that is easy > to follow. And bonus props for the nice docs. I'll have more time for a > through review over the weekend. > > Cheers > Jan > — > >> On 18. Mar 2019, at 19:42, Nick Vatamaniuc <vatam...@gmail.com> wrote: >> >> Hello everyone, >> >> Thank you all (Joan, Jan, Mike, Ilya, Adam) who contributed to the API >> discussion. There is now a PR open >> https://github.com/apache/couchdb/pull/1972 . If you get a chance, I would >> appreciate any reviews, feedback or comments. >> >> The PR message explains how the commits are organized and references the >> RFC. Basically it starts with preparatory work, ensuring all the existing >> components know how to deal with split shards. Then, some lower level bits >> are implemented, like bulk copy, internal replicator updates, etc., >> followed by the individual job implementation and the job manager which >> stitches everything together. In the end is the HTTP API implementation >> along with a suite of unit and Elixir integration tests. >> >> There is also a README_reshard.md file in src/mem3 that tries to provide a >> more in-depth technical description of how everything fits together. >> https://github.com/apache/couchdb/pull/1972/files#diff-5ac7b51ec4e03e068bf271f34ecf88df >> (notice >> this URL might changer after a rebase). >> >> Also special thanks to Paul (job module implementation, get_ring function, >> a lot of architectural and implementation advice), Eric (finding many bugs, >> fixes for the bugs, and writing bulk copy and change feed tests), and Jay >> (testing and a thorough code review). >> >> Cheers, >> -Nick >> >>> On Sun, Feb 17, 2019 at 2:32 AM Jan Lehnardt <m...@jan.io> wrote: >>> >>> Heya Nick, >>> >>> Nicely done. I think even though the majority of the discussion had >>> already happened here, the RFC nicely pulled together the various >>> discussion threads into a coherent whole. >>> >>> I would imagine the discussion on GH would be similarly fruitful. >>> >>> I gave it my +1, and as I said on the outset: I'm very excited about this >>> feature! >>> >>> Best >>> Jan >>> — >>> >>>> On 15. Feb 2019, at 23:45, Nick Vatamaniuc <vatam...@gmail.com> wrote: >>>> >>>> Decided to kick the tires on the new RFC proposal issue type and created >>>> one for shard splitting: >>>> >>>> https://github.com/apache/couchdb/issues/1920 >>>> >>>> Let's see how it goes. Being it's the first one let me know if I missed >>>> anything obvious. >>>> >>>> Also I'd like to thank everyone who contributed to the discussion. The >>> API >>>> is looking more solid and is much improved from where it started. >>>> >>>> Cheers, >>>> -Nick >>>> >>>> >>>> >>>>> On Wed, Feb 13, 2019 at 12:03 PM Nick Vatamaniuc <vatam...@gmail.com> >>> wrote: >>>>> >>>>> >>>>> >>>>>> On Wed, Feb 13, 2019 at 11:52 AM Jan Lehnardt <j...@apache.org> wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On 13. Feb 2019, at 17:12, Nick Vatamaniuc <vatam...@gmail.com> >>> wrote: >>>>>>> >>>>>>> Hi Jan, >>>>>>> >>>>>>> Thanks for taking a look! >>>>>>> >>>>>>>> On Wed, Feb 13, 2019 at 6:28 AM Jan Lehnardt <j...@apache.org> wrote: >>>>>>>> >>>>>>>> Nick, this is great, I have a few tiny nits left, apologies I only >>> now >>>>>> got >>>>>>>> to it. >>>>>>>> >>>>>>>>> On 12. Feb 2019, at 18:08, Nick Vatamaniuc <vatam...@gmail.com> >>>>>> wrote: >>>>>>>>> >>>>>>>>> Shard Splitting API Proposal >>>>>>>>> >>>>>>>>> I'd like thank everyone who contributed to the API discussion. As a >>>>>>>> result >>>>>>>>> we have a much better and consistent API that what we started with. >>>>>>>>> >>>>>>>>> Before continuing I wanted to summarize to see what we ended up >>> with. >>>>>> The >>>>>>>>> main changes since the initial proposal were switching to using >>>>>> /_reshard >>>>>>>>> as the main endpoint and having a detailed state transition history >>>>>> for >>>>>>>>> jobs. >>>>>>>>> >>>>>>>>> * GET /_reshard >>>>>>>>> >>>>>>>>> Top level summary. Besides the new _reshard endpoint, there `reason` >>>>>> and >>>>>>>>> the stats are more detailed. >>>>>>>>> >>>>>>>>> Returns >>>>>>>>> >>>>>>>>> { >>>>>>>>> "completed": 3, >>>>>>>>> "failed": 4, >>>>>>>>> "running": 0, >>>>>>>>> "state": "stopped", >>>>>>>>> "state_reason": "Manual rebalancing", >>>>>>>>> "stopped": 0, >>>>>>>>> "total": 7 >>>>>>>>> } >>>>>>>>> >>>>>>>>> * PUT /_reshard/state >>>>>>>>> >>>>>>>>> Start or stop global rebalacing. >>>>>>>>> >>>>>>>>> Body >>>>>>>>> >>>>>>>>> { >>>>>>>>> "state": "stopped", >>>>>>>>> "reason": "Manual rebalancing" >>>>>>>>> } >>>>>>>>> >>>>>>>>> Returns >>>>>>>>> >>>>>>>>> { >>>>>>>>> "ok": true >>>>>>>>> } >>>>>>>>> >>>>>>>>> * GET /_reshard/state >>>>>>>>> >>>>>>>>> Return global resharding state and reason. >>>>>>>>> >>>>>>>>> { >>>>>>>>> "reason": "Manual rebalancing", >>>>>>>>> "state": “stopped” >>>>>>>>> } >>>>>>>> >>>>>>>> More a note than a change request, but `state` is a very generic term >>>>>> that >>>>>>>> often confuses folks when they are new to something. If the set of >>>>>> possible >>>>>>>> states is `started` and `stopped`, how about making this endpoint a >>>>>> boolean? >>>>>>>> >>>>>>>> /_reshard/enabled >>>>>>>> >>>>>>>> { >>>>>>>> "enabled": true|false, >>>>>>>> "reason": "Manual rebalancing" >>>>>>>> } >>>>>>>> >>>>>>>> >>>>>>> I thought of that as well. However _reshard/state seemed consistent >>> with >>>>>>> _reshard/jobs/$jobid/state. Setting "state":"stopped" _reshard/state >>>>>> will >>>>>>> lead to all individual running job state to become "stopped" as well. >>>>>> And >>>>>>> "running" will make jobs that are not individually stopped also become >>>>>>> "running". In other words since it directly toggle job's state (with a >>>>>> job >>>>>>> being to override stopped state) I like that it had the same arguments >>>>>> >>>>>> Got it, makes perfect sense. >>>>>> >>>>>>> and": true|false >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> * GET /_reshard/jobs >>>>>>>>> >>>>>>>>> Get the state of all the resharding jobs on the cluster. Now we >>> have a >>>>>>>>> detailed >>>>>>>>> state transition history which looks similar what _scheduler/jobs >>>>>> have. >>>>>>>>> >>>>>>>>> { >>>>>>>>> "jobs": [ >>>>>>>>> { >>>>>>>>> "history": [ >>>>>>>>> { >>>>>>>>> "detail": null, >>>>>>>>> "timestamp": "2019-02-06T22:28:06Z", >>>>>>>>> "type": "new" >>>>>>>>> }, >>>>>>>>> ... >>>>>>>>> { >>>>>>>>> "detail": null, >>>>>>>>> "timestamp": "2019-02-06T22:28:10Z", >>>>>>>>> "type": "completed" >>>>>>>>> } >>>>>>>>> ], >>>>>>>>> "id": >>>>>>>>> >>>>>> "001-0a308ef9f7bd24bd4887d6e619682a6d3bb3d0fd94625866c5216ec1167b4e23", >>>>>>>>> "job_state": "completed", >>>>>>>>> "node": "node1@127.0.0.1", >>>>>>>>> "source": "shards/00000000-ffffffff/db1.1549492084", >>>>>>>>> "split_state": "completed", >>>>>>>>> "start_time": "2019-02-06T22:28:06Z", >>>>>>>>> "state_info": {}, >>>>>>>>> "targets": [ >>>>>>>>> "shards/00000000-7fffffff/db1.1549492084", >>>>>>>>> "shards/80000000-ffffffff/db1.1549492084" >>>>>>>>> ], >>>>>>>> >>>>>>>> Since we went from /_split to /_reshard to prepare for merging >>> shards, >>>>>> we >>>>>>>> should reconsider source (singular) and targets (plural). Either a >>>>>> merge >>>>>>>> job (in the future) uses sources (plural) and target (singular) and >>>>>> the job >>>>>>>> schema is intentionally different, or we unify things to, maybe >>>>>> singular: >>>>>>>> source/target which would have the nice property of being analogous >>> to >>>>>> our >>>>>>>> replication job schema. The type definition then is source:String and >>>>>>>> target:Array(2) for split jobs and source:Array(2) target:String for >>>>>>>> (future) merge jobs. >>>>>>>> >>>>>>>> >>>>>>> Joan suggested adding a "type" field to both job creation POST body >>> and >>>>>>> also returning it when we inspect the job(s) state. So the >>>>>> "type":"split" >>>>>>> would toggle the schema. It could be "merge" in the future, or even >>>>>>> something like "rebalance" where it would merge some and split others >>>>>>> perhaps :-) and since we have a type it would be easier to >>> differentiate >>>>>>> between the merge and split jobs. But if there is a consensus from >>>>>> others >>>>>>> about switching targets to target that's easily as well. >>>>>> >>>>>> Ah, I’m less concerned here about not being able to tell whether it’s a >>>>>> split or a merge, and more about that having an indiscriminate plural >>>>>> form (sourceS/targetS) depending on the type. It’s just an easy thing >>> to >>>>>> get wrong. >>>>>> >>>>>> In addition, we already have source/target in CouchDB replication, >>>>>> which people already use successfully, so making a similar thing that >>>>>> behaves slightly differently doesn’t sit quite right with me. >>>>>> >>>>>> I understand that I’m arguing to remove an ’s’ for very nitpicky >>>>>> but these are the kind of nitpick discussions we’ve done a lot in >>>>>> the early days which resulted in a by and large decent API that >>>>>> has served as well, and it’s something I’d like to see taken forward. >>>>>> Apologies if this all sounds very strict ;) >>>>>> >>>>>> >>>>> Thanks for the longer explanation. I understand now and agree, let's >>> make >>>>> it target. No worries about sounding nitpicky we should be nitpicky >>> about >>>>> APIs! >>>>> >>>>> >>>>>>> >>>>>>>> >>>>>>>> And just another question, sorry if I missed this elsewhere, would we >>>>>> ever >>>>>>>> consider adding to split/merge ratio different from 1:2, say 1:4, or >>>>>> will >>>>>>>> folks have to run 1:2, 1:2, 1:2 to get to the same result? I’m fine >>>>>> with >>>>>>>> either and if 1:2 fixed makes things simpler, I’m all for it ;) >>>>>>>> >>>>>>>> >>>>>>> Good point. Actually it's already implemented that way already :-) >>> Right >>>>>>> below the API surface it has a split=2 parameter and it just creates >>> the >>>>>>> targets based on that. It could be 2, 3, 4, ... 10 etc. However I was >>>>>>> thinking of keeping it hard coded at 2 at first to keep the behavior >>>>>>> simpler at first and open that parameter to be user facing in a later >>>>>>> release based on user feedback. >>>>>> >>>>>> Ace, again, fully on board with shipping 1:2 first and maybe offering >>>>>> other >>>>>> options later. >>>>>> >>>>>> Best >>>>>> Jan >>>>>> — >>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> -Nick >>>>>> >>>>>> >>> >>> > -- Professional Support for Apache CouchDB: https://neighbourhood.ie/couchdb-support/