Re: IO queueing defaults

2019-12-18 Thread Adam Kocoloski
How time flies … this is still one of the last remaining issues for 3.0. I’ve 
written up a modified version of this idea that I think captures the same basic 
spirit as a PR against the couchdb-documentation repo:

https://github.com/apache/couchdb-documentation/pull/464

I tried to document IOQ1 but its configuration is frankly not very coherent. In 
the above PR I’ve instead proposed an alternative configuration that I think is 
both relatively simple for users and also achievable with a conservative patch 
against the version of IOQ present in our last release. It ignores the more 
complex code from Cloudant; I don’t think that’s the right fit for CouchDB at 
this time. The approach I’m proposing should allow incrementally more control 
over IO prioritization than we had in 2.x and also enable us to ship a 
relatively fast default configuration.

Adam

> On Oct 9, 2019, at 4:34 PM, Adam Kocoloski wrote:
> 
> After thinking this through a bit more I propose to do the following:
> 
> 1) Fully document IOQ1 as the workload management / IO queueing capability in 
> CouchDB
> 2) Enable IOQ1 by default
> 3) Add a global bypass switch so users with big, fast servers can quickly 
> configure CouchDB to make the most of that hardware
> 
> IOQ2 will still be included in the codebase but not publicly documented. 
> Interested parties can continue to refine and simplify it and we can consider 
> cutting over to it in a future 3.x build.
> 
> I think this is a conservative “do no harm” approach that will result in a 
> similar performance profile to 2.x out of the box while delivering a couple 
> of extra knobs to refine the workload management, or bypass it altogether in 
> the name of performance.
> 
> Adam
> 
>> On Sep 16, 2019, at 11:58 AM, Adam Kocoloski wrote:
>> 
>> Maybe it makes sense to look at the 2.x -> 3.x progression of each of these 
>> individually:
>> 
>> ## Compaction
>> 
>> Smoosh replaces an earlier compaction daemon. It can certainly be configured 
>> to use more resources than the old one. Changing the default configuration 
>> to a single channel with no parallelism would I think put it more in line 
>> with 2.x. https://github.com/apache/couchdb-smoosh/pull/3 restores the 
>> ability to scope compaction to certain hours of the day which is the other 
>> big gap.
>> 
>> ## View Builds
>> 
>> Does 2.x have a built-in background view updater? I didn’t think so. Ken 
>> could cause a lot of IO to show up unexpectedly, for sure. The daemon 
>> doesn’t have a global on/off switch at the moment.
>> 
>> ## IO Queueing
>> 
>> 2.x has an undocumented IOQ implementation. If I’m reading the code 
>> correctly it de-prioritizes compaction IO and otherwise dumps everything 
>> (including view updates) into a single queue. The architecture is otherwise 
>> similar to what I called IOQ1 in my original email. It does not appear 
>> possible to bypass the queueing system in this version. Tracing back to the 
>> original COUCHDB-1775 issue in JIRA one finds
>> 
>>> Note: For demonstration purposes at the moment, the code is likely too slow 
>>> for production use.
>> 
>> And yet, as far as I can tell this is substantively the same code that’s 
>> been in production for the entire 2.x line …
>> 
>> —
>> 
>> Knowing that our users have lived with the IOQ1 performance ceiling for all 
>> of 2.x does change my perspective on the options. I agree that we shouldn’t 
>> bypass the whole thing at this juncture, especially not if we’re making it 
>> easy to crank up more background jobs. At the same time I’m really reluctant 
>> to introduce a whole bunch of knobs and dials. I’m not sure where to go from 
>> here but maybe others will find the background details above to be helpful.
>> 
>> Adam
>> 
>>> On Sep 14, 2019, at 3:10 PM, Joan Touzet wrote:
>>> 
>>> On 2019-09-12 6:00 a.m., Will Holley wrote:
>>>> I defer to those with more operational experience of ken and smoosh but
>>>> wouldn't those new subsystems radically impact performance if IOQ is
>>>> completely bypassed (assuming ken/smoosh are enabled by default)?
>>> 
>>> A very good point. I'd be uncomfortable with a ken+smoosh+IOQ1 combination 
>>> without safeguards of some sort - a modified version of 2 I guess.
>>> 
>>> Disabling those daemons by default is a regression from 2.x so I don't 
>>> consider that a realistic option, either.
>>> 
>>> We want CouchDB 3.x to be "the best home-grown clustered CouchDB 
>>> available," and completely disabling IOQ sounds like not that.
>>> 
>>> I guess my preferences in order are 1, 2, 3.
>>> 
>>> -Joan
>>> 
>>>> On Wed, 11 Sep 2019 at 22:04, Adam Kocoloski wrote:
>>>>> A few months ago a bunch of code landed on master around IO QoS and
>>>>> prioritization. I think we need to have a conversation about the defaults
>>>>> for that system and what we want to allow users to enable.
>>>>> 
>>>>> First topic - there are actually two different generations of the IOQ
>>>>> system: IOQ and IOQ2. Only one can be active at a 

Re: Request: Committers, delete your old branches on apache/couchdb!

2019-12-18 Thread Ilya Khlopotov
Hi Paul,

Not sure if this will help, but I have the following git alias:

```
[alias]
    rmmerged = !sh -c \"git branch --merged | grep -v '^* master$' | grep -v '^  master$' | xargs git branch -d\"
```

Also, we might want to change the repository setting so that merged branches 
are deleted automatically.

BR,
iilyak

On 2019/12/18 15:43:28, Paul Davis wrote:
> I noticed that there are a lot of branches pointing at commits that
> have been merged to master. I'll hack up a quick script today that
> will go through all branches and delete anything that's been merged.
> I'll write out the specific branch/sha combinations and report them
> here in case anyone really needs a branch I end up deleting.
> 
> On Tue, Dec 17, 2019 at 6:35 PM Joan Touzet wrote:
> >
> > Bump. Please do this.
> >
> > On 2019-12-13 12:15 p.m., Joan Touzet wrote:
> > > Hi again committers,
> > >
> > > Our proto-Jenkins setup is currently using my API token to scan GitHub
> > > for branches. GitHub has notoriously low values for how many API calls
> > > you can make per hour.
> > >
> > > Unfortunately, because multibranch pipeline jobs scan *all* branches in
> > > a repo for changes, this means that the total number of jobs we can run
> > > in an hour is pretty small.
> > >
> > > I'm asking everyone to please go and delete any obsolete or unused
> > > branches on the apache/couchdb repo that you don't need. I'm too nervous
> > > to mass delete other people's branches, but if I see anything that got
> > > merged and is more than a few years old, it'll probably get wiped.
> > >
> > > Thanks,
> > > Joan "needs sleep" Touzet
> > >
> 


Re: Request: Committers, delete your old branches on apache/couchdb!

2019-12-18 Thread Paul Davis
I noticed that there are a lot of branches pointing at commits that
have been merged to master. I'll hack up a quick script today that
will go through all branches and delete anything that's been merged.
I'll write out the specific branch/sha combinations and report them
here in case anyone really needs a branch I end up deleting.
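
For anyone following along, a dry run of that kind of report could look
roughly like the sketch below. This is only an illustration, not the script
that was actually used: the function name list_merged_branches is made up,
and it assumes the remote is called "origin" and the mainline branch is
"master". It prints branch/sha pairs and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print "branch sha" for every remote-tracking branch
# whose tip is already reachable from origin/master.
# Assumptions: remote is named "origin", mainline is "master".
list_merged_branches() {
    git for-each-ref --merged=origin/master \
        --format='%(refname:short) %(objectname)' refs/remotes/origin |
        # Drop the mainline itself and the symbolic origin/HEAD entry
        # (some git versions shorten origin/HEAD to just "origin").
        grep -v -e '^origin/master ' -e '^origin/HEAD ' -e '^origin '
}

# Usage, from inside a clone (refresh remote-tracking refs first):
#   git fetch --prune origin
#   list_merged_branches > merged-branches.txt
```

Keeping the full sha next to each name means a deleted branch can be
recreated later with `git push origin <sha>:refs/heads/<name>`. The deletion
itself would be a separate, deliberate step (`git push origin --delete
<name>`) once the list has been reviewed.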

On Tue, Dec 17, 2019 at 6:35 PM Joan Touzet wrote:
>
> Bump. Please do this.
>
> On 2019-12-13 12:15 p.m., Joan Touzet wrote:
> > Hi again committers,
> >
> > Our proto-Jenkins setup is currently using my API token to scan GitHub
> > for branches. GitHub has notoriously low values for how many API calls
> > you can make per hour.
> >
> > Unfortunately, because multibranch pipeline jobs scan *all* branches in
> > a repo for changes, this means that the total number of jobs we can run
> > in an hour is pretty small.
> >
> > I'm asking everyone to please go and delete any obsolete or unused
> > branches on the apache/couchdb repo that you don't need. I'm too nervous
> > to mass delete other people's branches, but if I see anything that got
> > merged and is more than a few years old, it'll probably get wiped.
> >
> > Thanks,
> > Joan "needs sleep" Touzet
> >