Re: to batch or not to batch?
On Mon, Feb 19, 2018 at 04:02:56PM +0100, Kalev Lember wrote:
> On 02/19/2018 03:40 PM, Randy Barlow wrote:
> > On 02/19/2018 06:34 AM, Vít Ondruch wrote:
> >> But anyway, why don't we have "updates-testing", "updates-batched" and "updates" repositories? "updates-batched" could be enabled by default while "updates" could be enabled manually if one wishes.
> >
> > When we talked about this in the past I believe resources were the concern, both on the Fedora infrastructure side and on the mirroring side.
>
> I think a new, additional repo with batched contents should solve a lot of problems here: we'd have both sides of the fence happy (regular users: batched updates, developers: continuous stream of updates), and it would also make it possible to actually QA the batched set before it hits the mirrors.

Those are good points. I'd also add that since this removes the reason why developers were overriding batched to push to stable, hopefully we would see more packages go to batched.

> Right now there's no way for QA to only extract batched updates without getting all the rest of updates-testing; if we had that we could actually have people test the batched set of updates before they are pushed out to stable.

That answers another doubt that people had.

Zbyszek

> P.S. Those who were at the Atomic Workstation call that ended a few minutes ago, I believe this is what we need to solve the split updates problem that we talked about.

___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Re: to batch or not to batch?
On 02/19/2018 03:40 PM, Randy Barlow wrote:
> On 02/19/2018 06:34 AM, Vít Ondruch wrote:
>> But anyway, why don't we have "updates-testing", "updates-batched" and "updates" repositories? "updates-batched" could be enabled by default while "updates" could be enabled manually if one wishes.
>
> When we talked about this in the past I believe resources were the concern, both on the Fedora infrastructure side and on the mirroring side.

I think a new, additional repo with batched contents should solve a lot of problems here: we'd have both sides of the fence happy (regular users: batched updates, developers: continuous stream of updates), and it would also make it possible to actually QA the batched set before it hits the mirrors.

Right now there's no way for QA to only extract batched updates without getting all the rest of updates-testing; if we had that we could actually have people test the batched set of updates before they are pushed out to stable.

P.S. Those who were at the Atomic Workstation call that ended a few minutes ago, I believe this is what we need to solve the split updates problem that we talked about.

-- Kalev
Re: to batch or not to batch?
On 02/19/2018 06:34 AM, Vít Ondruch wrote:
> But anyway, why don't we have "updates-testing", "updates-batched" and "updates" repositories? "updates-batched" could be enabled by default while "updates" could be enabled manually if one wishes.

When we talked about this in the past I believe resources were the concern, both on the Fedora infrastructure side and on the mirroring side.
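[Editorial sketch of the proposal quoted above: an additional batched repo could be defined alongside the existing ones with a plain .repo file. The repo id and metalink URL below are assumptions; no "updates-batched" repo actually exists.]

```ini
# Hypothetical /etc/yum.repos.d/fedora-updates-batched.repo sketch.
# The repo id and the metalink URL are illustrative only.
[updates-batched]
name=Fedora $releasever - $basearch - Batched Updates
metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-batched-f$releasever&arch=$basearch
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
```

With such a file in place, "updates-batched" would be enabled by default and the continuous "updates" repo could be opted into by developers, matching the split Vít describes.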
Re: to batch or not to batch?
I am using Rawhide, so I personally don't care. But anyway, why don't we have "updates-testing", "updates-batched" and "updates" repositories? "updates-batched" could be enabled by default while "updates" could be enabled manually if one wishes.

Vít

Dne 17.2.2018 v 23:15 Zbigniew Jędrzejewski-Szmek napsal(a):
> Bodhi currently provides "batched updates" [1] which lump updates of packages that are not marked urgent into a single batch, released once per week. This means that after an update has graduated from testing, it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates to stable, overriding this default, and make the update available the next day.
>
> Batching is liked by some maintainers, but hated by others. Unfortunately, the positive effects of batching are strongly decreased when many packages are not batched. Thus, we should settle on a single policy — either batch as much as possible, or turn batching off. Having the middle ground of some batching is not very effective and still annoys people who don't like batching.
>
> To summarize the ups (+) and downs (-):
>
> + batching reduces the number of times repository metadata is updated. Each metadata update results in dnf downloading about 20-40 MB, which is expensive and/or slow for users with low bandwidth.
>
> + a constant stream of metadata updates also puts strain on our mirrors.
>
> + a constant stream of updates feels overwhelming to users, and a predictable once-per-week batch is perceived as easier. In particular corporate users might adapt to this and use it to schedule an update of all machines at fixed times.
> + a batch of updates may be tested as one, and, at least in principle, if users then install this batch as one, QA that was done on the batch matches the user systems more closely, compared to QA testing package updates one by one as they come in, and users updating them at a slightly different schedule.
>
> - batching delays updates of packages between 0 and 7 days after they have reached karma and makes it hard for people to immediately install updates when they graduate from testing.
>
> - some users (or maybe it's just maintainers?) actually prefer a constant stream of small updates, and find it easier to read changelogs and pinpoint regressions, etc. a few packages at a time.
>
> - batching (when done on the "server" side) interferes with clients applying their own batching policy. This has two aspects:
>   clients might want to pick a different day of the week or an altogether different schedule,
>   clients might want to pick a different policy of updates, e.g. to allow any updates for specific packages to go through, etc.
>
> In particular gnome-software implements its own style of batching, where it will suggest an update only once per week, unless there are security updates.
>
> Unfortunately there isn't much data on the effects of batching. Kevin posted some [2], as did the other Kevin [3] ;), but we certainly could use more detailed stats.
>
> One of the positive aspects of batching, reduction in metadata downloads, might be obsoleted by improving download efficiency through delta downloads. A proof-of-concept has been implemented [4].
>
> The second positive aspect of batching, doing updates in batches at a fixed schedule, may just as well be implemented on the client side, although that does not recreate the testing on the whole batch, since now every client is doing it at a different time. It's not clear though if this additional testing is actually useful.
> There's an open FESCo ticket to "adjust/drop/document" batching [5]. That discussion has not been effective, because this issue has many aspects, and depending on priorities, the view on batching is likely to be different. FESCo is trying to gather more data and get a better understanding of what maintainers consider more important.
>
> Did I miss something on the plus or minus side? Or some good statistics?
> Does batching make Fedora seem more approachable to end-users? (this is a question in particular for Matthew Miller who pushed for batching.)
> Do the benefits of batching outweigh the downsides?
> Should we keep batching as an interim measure until delta downloads are implemented?
> Should dnf offer smart batched updates like gnome-software?
> Should we encourage maintainers to allow their updates to be batched?
>
> [1] https://github.com/fedora-infra/bodhi/issues/1157, https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/UDXVXLT7JXCY6N7NRACN4GBS3KA6D4M6/
> [2] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/B6MMH3L36A2YXQ45Y4DUGMR4XIG7QKE5/
> [3] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/F36YMWKDXBHAQWQOLDSYLYTMDF4WYHE6/
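[One way to picture the client-side alternative raised in the post above, doing updates at a fixed, client-chosen schedule, is a plain systemd timer. The unit names below are illustrative, not an existing Fedora feature.]

```ini
# /etc/systemd/system/weekly-update.timer -- illustrative unit name
[Unit]
Description=Apply package updates once a week

[Timer]
# Each client is free to pick its own day/time here, which is exactly
# the flexibility that server-side batching takes away.
OnCalendar=Mon 06:00
Persistent=true

[Install]
WantedBy=timers.target
```

```ini
# /etc/systemd/system/weekly-update.service
[Unit]
Description=Apply all pending package updates

[Service]
Type=oneshot
ExecStart=/usr/bin/dnf -y upgrade
```

Enabled with "systemctl enable --now weekly-update.timer". As the post notes, this recreates the fixed schedule but not the QA benefit of everyone testing the same batch, since each client flushes at a different moment.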
Re: to batch or not to batch?
On Sun, Feb 18, 2018 at 11:40:47AM +0100, Fabio Valentini wrote:
> On Sat, Feb 17, 2018 at 11:15 PM, Zbigniew Jędrzejewski-Szmek wrote:
> > Bodhi currently provides "batched updates" [1] which lump updates of packages that are not marked urgent into a single batch, released once per week. This means that after an update has graduated from testing, it may be delayed up to a week before it becomes available to users.
> >
> > Batching is now the default, but maintainers can push their updates to stable, overriding this default, and make the update available the next day.
> >
> > Batching is liked by some maintainers, but hated by others. Unfortunately, the positive effects of batching are strongly decreased when many packages are not batched. Thus, we should settle on a single policy — either batch as much as possible, or turn batching off. Having the middle ground of some batching is not very effective and still annoys people who don't like batching.
>
> (snip)
>
> > To summarize the ups (+) and downs (-):
> >
> > + batching reduces the number of times repository metadata is updated. Each metadata update results in dnf downloading about 20-40 MB, which is expensive and/or slow for users with low bandwidth.
>
> This savings effect is negligible, because metadata has to be updated even if only 1 urgent security update is pushed to stable.

[FTR, it's any urgent update, security or not.] Yes, but we don't have urgent updates every day. Even if we have them every other day, that'd still be a 50% reduction in metadata downloads.

> > + a constant stream of metadata updates also puts strain on our mirrors.
> >
> > + a constant stream of updates feels overwhelming to users, and a predictable once-per-week batch is perceived as easier. In particular corporate users might adapt to this and use it to schedule an update of all machines at fixed times.
> I'd rather want to see a small batch of updates more frequently than a large batch that I won't care to read through.

Yes, but I think you are in the minority. I'm pretty sure most users don't bother reading descriptions. Of course it's hard to gauge this, but I'd expect that with the frequency of updates in Fedora only very dedicated admins can look at every package and every changelog. Most people install the whole set and investigate only if something goes wrong.

> > + a batch of updates may be tested as one, and, at least in principle, if users then install this batch as one, QA that was done on the batch matches the user systems more closely, compared to QA testing package updates one by one as they come in, and users updating them at a slightly different schedule.
>
> Well, is any such testing of the "batched state" being done, and if it is, does it influence which packages get pushed to stable?

Sorry, I don't think we have any data on this. Maybe adamw and other QA people can pitch in?

> > - batching delays updates of packages between 0 and 7 days after they have reached karma and makes it hard for people to immediately install updates when they graduate from testing.
>
> This delay can be circumvented by maintainers by pushing directly to stable instead of batched (thereby rendering the batched state obsolete, however).

I meant that it is hard for *end-users*. Essentially, end users lose control of the timing, even though individual maintainers can still control the timing of their updates.

> > - some users (or maybe it's just maintainers?) actually prefer a constant stream of small updates, and find it easier to read changelogs and pinpoint regressions, etc. a few packages at a time.
>
> I certainly belong to this group.

> > - batching (when done on the "server" side) interferes with clients applying their own batching policy.
> > This has two aspects:
> >   clients might want to pick a different day of the week or an altogether different schedule,
> >   clients might want to pick a different policy of updates, e.g. to allow any updates for specific packages to go through, etc.
> >
> > In particular gnome-software implements its own style of batching, where it will suggest an update only once per week, unless there are security updates.
>
> Which further delays the distribution of stable updates by up to a week (depending on the schedule of gnome-software, I didn't check that). That makes a total of up to 3 weeks (!).
>
> > Unfortunately there isn't much data on the effects of batching. Kevin posted some [2], as did the other Kevin [3] ;), but we certainly could use more detailed stats.
> >
> > One of the positive aspects of batching, reduction in metadata downloads, might be obsoleted by improving download efficiency through delta downloads. A proof-of-concept has been implemented [4].
>
> A simpler approach might be to just flush all batched updates to stable if there is at least one update (possibly an urgent security update) anyway.
Re: to batch or not to batch?
> Batching is now the default, but maintainers can push their updates to stable, overriding this default, and make the update available the next day.

I think that since "batch override" doesn't push the package immediately, but rather schedules it for the next day, I agree with Fabio that it might be a good idea to flush the "batch queue" when a package is explicitly pushed to stable by someone. This won't increase the number of metadata expirations - so there isn't really any drawback to end users - while allowing updates to reach users faster.

> + batching reduces the number of times repository metadata is updated. Each metadata update results in dnf downloading about 20-40 MB, which is expensive and/or slow for users with low bandwidth.

As someone with a rather small data cap, I'd say that heavy metadata downloads during "dnf update" are acceptable, since I can just choose to run "dnf update" only once a week or so. But it always irks me a bit when I want to install a new package and dnf starts downloading the repository metadata again. Bandwidth issues aside, it's just incredibly annoying having to wait for a 40MiB download to complete before I can fetch a single 600KiB package.

> - batching delays updates of packages between 0 and 7 days after they have reached karma and makes it hard for people to immediately install updates when they graduate from testing.

I agree with Jerry here: many packages don't get any karma while in testing. The only time my packages received testing karma was when I was introducing new packages; it didn't happen for updates. So having the package sit in limbo for another week after going through a week of "maybe someone'll take a look at this" is a bit discouraging.

> One of the positive aspects of batching, reduction in metadata downloads, might be obsoleted by improving download efficiency through delta downloads. A proof-of-concept has been implemented [4].
This could be a rather interesting feature, as it'd resolve some of the issues I wrote about two paragraphs above.

By the way, does drpm handling depend on repo / mirror settings? I ask because I'm under the impression that lately hardly any package update on my system is done via delta-RPMs; it's about 1-in-100 or so. Is this more a matter of me needing to tweak my dnf config, or can this depend on the package mirrors?

A.I.
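[Regarding the delta-RPM question above: dnf's drpm behaviour is controlled by a couple of dnf.conf options, but whether a delta is actually used also depends on the mirror carrying drpms and on the installed package matching the delta's base version, so a low hit rate is not necessarily a local misconfiguration. A sketch of the relevant settings, with values I believe match the documented defaults:]

```ini
# /etc/dnf/dnf.conf -- [main] options relevant to delta-RPM handling
[main]
# Use delta-RPMs when the repository provides them.
deltarpm=true
# Only use a delta if it is at most this percentage of the full
# package size; otherwise download the full RPM.
deltarpm_percentage=75
```

Even with these set, a delta only helps when the mirror ships a drpm whose base version matches what is installed locally, which becomes less likely the longer a machine goes between updates.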
Re: to batch or not to batch?
On 02/17/2018 11:15 PM, Zbigniew Jędrzejewski-Szmek wrote:
> Bodhi currently provides "batched updates" [1] which lump updates of packages that are not marked urgent into a single batch, released once per week. This means that after an update has graduated from testing, it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates to stable, overriding this default, and make the update available the next day.
>
> Batching is liked by some maintainers, but hated by others.

As a maintainer, I hate them, because the primary effect "batched" has is to further increase the update delay from 1 week up to 2 weeks or more (like in recent times). That said, I consider "batched" to be superfluous bureaucracy.

Ralf
Re: to batch or not to batch?
> Did I miss something on the plus or minus side?

+ Without batched updates, running “dnf update” gives a variable reward, like clicking refresh on Facebook or playing a fruit machine. Accumulating updates into a batch gives a calmer, more ordered experience.

-- Peter Oliver
Re: to batch or not to batch?
On Sat, Feb 17, 2018 at 11:15 PM, Zbigniew Jędrzejewski-Szmek wrote:
> Bodhi currently provides "batched updates" [1] which lump updates of packages that are not marked urgent into a single batch, released once per week. This means that after an update has graduated from testing, it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates to stable, overriding this default, and make the update available the next day.
>
> Batching is liked by some maintainers, but hated by others. Unfortunately, the positive effects of batching are strongly decreased when many packages are not batched. Thus, we should settle on a single policy — either batch as much as possible, or turn batching off. Having the middle ground of some batching is not very effective and still annoys people who don't like batching.

(snip)

> To summarize the ups (+) and downs (-):
>
> + batching reduces the number of times repository metadata is updated. Each metadata update results in dnf downloading about 20-40 MB, which is expensive and/or slow for users with low bandwidth.

This savings effect is negligible, because metadata has to be updated even if only 1 urgent security update is pushed to stable.

> + a constant stream of metadata updates also puts strain on our mirrors.
>
> + a constant stream of updates feels overwhelming to users, and a predictable once-per-week batch is perceived as easier. In particular corporate users might adapt to this and use it to schedule an update of all machines at fixed times.

I'd rather want to see a small batch of updates more frequently than a large batch that I won't care to read through.
> + a batch of updates may be tested as one, and, at least in principle, if users then install this batch as one, QA that was done on the batch matches the user systems more closely, compared to QA testing package updates one by one as they come in, and users updating them at a slightly different schedule.

Well, is any such testing of the "batched state" being done, and if it is, does it influence which packages get pushed to stable?

> - batching delays updates of packages between 0 and 7 days after they have reached karma and makes it hard for people to immediately install updates when they graduate from testing.

This delay can be circumvented by maintainers by pushing directly to stable instead of batched (thereby rendering the batched state obsolete, however).

> - some users (or maybe it's just maintainers?) actually prefer a constant stream of small updates, and find it easier to read changelogs and pinpoint regressions, etc. a few packages at a time.

I certainly belong to this group.

> - batching (when done on the "server" side) interferes with clients applying their own batching policy. This has two aspects:
>   clients might want to pick a different day of the week or an altogether different schedule,
>   clients might want to pick a different policy of updates, e.g. to allow any updates for specific packages to go through, etc.
>
> In particular gnome-software implements its own style of batching, where it will suggest an update only once per week, unless there are security updates.

Which further delays the distribution of stable updates by up to a week (depending on the schedule of gnome-software, I didn't check that). That makes a total of up to 3 weeks (!).

> Unfortunately there isn't much data on the effects of batching. Kevin posted some [2], as did the other Kevin [3] ;), but we certainly could use more detailed stats.
> One of the positive aspects of batching, reduction in metadata downloads, might be obsoleted by improving download efficiency through delta downloads. A proof-of-concept has been implemented [4].

A simpler approach might be to just flush all batched updates to stable if there is at least one update (possibly an urgent security update) anyway. That way, the metadata doesn't have to be downloaded for just one update, and all packages reach stable sooner.

> The second positive aspect of batching, doing updates in batches at a fixed schedule, may just as well be implemented on the client side, although that does not recreate the testing on the whole batch, since now every client is doing it at a different time. It's not clear though if this additional testing is actually useful.

Well, the whole testing/installing batches of updates sounds a lot like what Atomic Workstation is doing (which I really like). However, forcing the same kind of process onto the current way of doing things (with individual updates and packages) doesn't seem to make anybody happy right now ...

> There's an open FESCo ticket to "adjust/drop/document" batching [5]. That discussion has not been effective, because this issue has many aspects, and depending on priorities, the view on batching is likely to be different.
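[Fabio's flush-on-push idea above can be sketched in a few lines of Python. This is purely illustrative logic, not Bodhi's actual code; the Update and PushScheduler names are made up for the example.]

```python
# Illustrative sketch of "flush the whole batch whenever anything is
# pushed to stable": one compose (metadata regeneration) carries the
# urgent update AND everything waiting in the batch queue.

from dataclasses import dataclass, field

@dataclass
class Update:
    name: str
    urgent: bool = False

@dataclass
class PushScheduler:
    batched: list = field(default_factory=list)  # updates waiting for the weekly batch
    stable: list = field(default_factory=list)   # updates already in the stable repo
    composes: int = 0                            # repo-metadata regenerations

    def request_batched(self, update):
        # Non-urgent updates just wait in the batch queue.
        self.batched.append(update)

    def push_stable(self, update):
        # An explicit stable push flushes the whole batch in the same
        # compose, so metadata is regenerated only once.
        self.stable.append(update)
        self.stable.extend(self.batched)
        self.batched.clear()
        self.composes += 1

sched = PushScheduler()
sched.request_batched(Update("foo"))
sched.request_batched(Update("bar"))
sched.push_stable(Update("openssl", urgent=True))
print(sched.composes)       # 1 -- a single metadata regeneration
print(len(sched.stable))    # 3 -- all three updates reach stable together
```

The point of the sketch is the accounting: the urgent push would have forced a metadata download anyway, so piggybacking the batch on it delivers everything sooner at no extra metadata cost.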
Re: to batch or not to batch?
On Sat, Feb 17, 2018 at 3:15 PM, Zbigniew Jędrzejewski-Szmek wrote:
> Did I miss something on the plus or minus side? Or some good statistics?

I thought that was a pretty good summary of the ups and downs. Thank you for that. I have no statistics. But you also asked for anecdotal evidence, and I've got that in spades. :-)

> Does batching make Fedora seem more approachable to end-users?

Not this end-user. A little over 2 decades ago (good grief, am I really that old?) I installed Linux for the first time. It was Slackware. But then, in grad school, I was doing some cutting edge work that required cutting edge tools, and Slackware didn't have new enough versions of the tools I needed. I went looking for a new distribution, and everybody said that if I wanted the latest stuff, I should install RedHat, so I did. That was RedHat 4.2, and I updated to every release of RedHat thereafter, and followed up by migrating to Fedora when that transition took place. Every version of Fedora since then has run on at least one machine under my care, and often multiple machines. I came to RedHat/Fedora in the first place because it had the latest shiny stuff. That's the draw of Fedora for me.

> Do the benefits of batching outweigh the downsides?

Not for me, they don't. I'm willing to believe that they do for some users, but I frankly don't know which users that would be. Perhaps those with limited bandwidth?

> Should we keep batching as an interim measure until delta downloads are implemented?

I'm indifferent on this one. I find batching to be more annoying than otherwise, but if it helps with server load and mirroring then I guess I can't complain too loudly.

> Should dnf offer smart batched updates like gnome-software?

That seems reasonable, as long as there is a way to override it.

> Should we encourage maintainers to allow their updates to be batched?

The answer to this question must depend on the answers to the previous questions. I'll give you my personal experience again.
I've been letting my updates go out with the batched updates. I don't like it, though. The packages I maintain tend to get karma very, very rarely. Almost always, the only testing those packages get prior to going out to the wide world is the testing I do before running fedpkg build. So they already have to pointlessly sit in testing for 7 days, getting no actual testing, and then they have to wait up to another week to go out. If I'm fixing actual bugs, then all that delay accomplished was to increase the probability that yet another user will run afoul of the bug that I already fixed.

Bottom line: I don't like batching as either an end-user or as a package maintainer, but I'm willing to put up with it if that is what the project as a whole needs.

Regards,
-- Jerry James
http://www.jamezone.org/