Re: to batch or not to batch?

2018-02-20 Thread Zbigniew Jędrzejewski-Szmek
On Mon, Feb 19, 2018 at 04:02:56PM +0100, Kalev Lember wrote:
> On 02/19/2018 03:40 PM, Randy Barlow wrote:
> > On 02/19/2018 06:34 AM, Vít Ondruch wrote:
> >> But anyway, why don't we
> >> have "updates-testing", "updates-batched" and "updates" repositories?
> >> "updates-batched" could be enabled by default while "updates" could be
> >> enabled manually if one wishes.
> > 
> > When we talked about this in the past I believe resources were the
> > concern, both on the Fedora infrastructure side and on the mirroring side.
> 
> I think a new, additional repo with batched contents should solve a lot
> of problems here: we'd have both sides of the fence happy (regular users:
> batched updates, developers: continuous stream of updates), and it would
> also make it possible to actually QA the batched set before it hits the
> mirrors.
Those are good points. I'd also add that since this removes the reason
why developers were overriding batched to push to stable, hopefully we
would see more packages go to batched.

> Right now there's no way for QA to only extract batched updates without
> getting all the rest of updates-testing; if we had that we could
> actually have people test the batched set of updates before they are
> pushed out to stable.
That answers another doubt that people had.

Zbyszek

> P.S. Those who were at the Atomic Workstation call that ended a few
> minutes ago, I believe this is what we need to solve the split updates
> problem that we talked about.


Re: to batch or not to batch?

2018-02-19 Thread Kalev Lember
On 02/19/2018 03:40 PM, Randy Barlow wrote:
> On 02/19/2018 06:34 AM, Vít Ondruch wrote:
>> But anyway, why don't we
>> have "updates-testing", "updates-batched" and "updates" repositories?
>> "updates-batched" could be enabled by default while "updates" could be
>> enabled manually if one wishes.
> 
> When we talked about this in the past I believe resources were the
> concern, both on the Fedora infrastructure side and on the mirroring side.

I think a new, additional repo with batched contents should solve a lot
of problems here: we'd have both sides of the fence happy (regular users:
batched updates, developers: continuous stream of updates), and it would
also make it possible to actually QA the batched set before it hits the
mirrors.

Right now there's no way for QA to only extract batched updates without
getting all the rest of updates-testing; if we had that we could
actually have people test the batched set of updates before they are
pushed out to stable.

P.S. Those who were at the Atomic Workstation call that ended a few
minutes ago, I believe this is what we need to solve the split updates
problem that we talked about.

-- 
Kalev


Re: to batch or not to batch?

2018-02-19 Thread Randy Barlow
On 02/19/2018 06:34 AM, Vít Ondruch wrote:
> But anyway, why don't we
> have "updates-testing", "updates-batched" and "updates" repositories?
> "updates-batched" could be enabled by default while "updates" could be
> enabled manually if one wishes.

When we talked about this in the past I believe resources were the
concern, both on the Fedora infrastructure side and on the mirroring side.





Re: to batch or not to batch?

2018-02-19 Thread Vít Ondruch
I am using Rawhide, so I personally don't care. But anyway, why don't we
have "updates-testing", "updates-batched" and "updates" repositories?
"updates-batched" could be enabled by default while "updates" could be
enabled manually if one wishes.
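
To illustrate, a .repo layout for that could look roughly like the sketch
below. This is only a sketch: the [updates-batched] repo id and its metalink
parameter are hypothetical, while the [updates] entry mirrors the existing
fedora-updates.repo.

    # Hypothetical /etc/yum.repos.d/fedora-updates.repo layout; the
    # "updates-batched" id and metalink parameter do not exist today.
    [updates-batched]
    name=Fedora $releasever - $basearch - Batched Updates
    metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-batched-f$releasever&arch=$basearch
    enabled=1
    gpgcheck=1

    [updates]
    name=Fedora $releasever - $basearch - Updates
    metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-f$releasever&arch=$basearch
    enabled=0
    gpgcheck=1

Developers who want the continuous stream would then just flip enabled=1 on
[updates], or pass --enablerepo=updates to dnf for a one-off.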


Vít


On 17.02.2018 at 23:15, Zbigniew Jędrzejewski-Szmek wrote:
> Bodhi currently provides "batched updates" [1] which lump updates of
> packages that are not marked urgent into a single batch, released once
> per week. This means that after an update has graduated from testing,
> it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates
> to stable, overriding this default, and make the update available the
> next day.
>
> Batching is liked by some maintainers, but hated by others.
> Unfortunately, the positive effects of batching are strongly
> decreased when many packages are not batched. Thus, we should settle
> on a single policy — either batch as much as possible, or turn
> batching off. Having the middle ground of some batching is not very
> effective and still annoys people who don't like batching.
>
> To summarize the ups (+) and downs (-):
>
> + batching reduces the number of times repository metadata is updated.
>   Each metadata update results in dnf downloading about 20-40 mb,
>   which is expensive and/or slow for users with low bandwidth.
>
> + a constant stream of metadata updates also puts strain on our mirrors.
>
> + a constant stream of updates feels overwhelming to users, and a
>   predictable once-per-week batch is perceived as easier. In
>   particular corporate users might adapt to this and use it to
>   schedule an update of all machines at fixed times.
>
> + a batch of updates may be tested as one, and, at least in principle,
>   if users then install this batch as one, QA that was done on the
>   batch matches the user systems more closely, compared to QA testing
>   package updates one by one as they come in, and users updating them
>   at a slightly different schedule.
>
> - batching delays updates of packages between 0 and 7 days after
>   they have reached karma and makes it hard for people to immediately
>   install updates when they graduate from testing.
>
> - some users (or maybe it's just maintainers?) actually prefer a
>   constant stream of small updates, and find it easier to read
>   changelogs and pinpoint regressions, etc. a few packages at a time.
>
> - batching (when done on the "server" side) interferes with clients
>   applying their own batching policy. This has two aspects:
>   clients might want to pick a different day of the week or an
>   altogether different schedule,
>   clients might want to pick a different policy of updates, e.g. to
>   allow any updates for specific packages to go through, etc.
>
>   In particular gnome-software implements its own style of batching, where
>   it will suggest an update only once per week, unless there are security
>   updates.
>
> Unfortunately there isn't much data on the effects of batching.
> Kevin posted some [2], as did the other Kevin [3] ;), but we certainly
> could use more detailed stats.
>
> One of the positive aspects of batching — reduction in metadata downloads,
> might be obsoleted by improving download efficiency through delta downloads.
> A proof-of-concept has been implemented [4].
>
> The second positive aspect of batching — doing updates in batches at a
> fixed schedule, may just as well be implemented on the client side,
> although that does not recreate the testing on the whole batch, since
> now every client is doing it at a different time. It's not clear though
> if this additional testing is actually useful.
>
> There's an open FESCo ticket to "adjust/drop/document" batching [5].
> That discussion has not been effective, because this issue has many
> aspects, and depending on priorities, the view on batching is likely to
> be different. FESCo is trying to gather more data and get a better
> understanding of what maintainers consider more important.
>
> Did I miss something on the plus or minus side? Or some good statistics?
> Does batching make Fedora seem more approachable to end-users?
> (this is a question in particular for Matthew Miller, who pushed for batching.)
> Do the benefits of batching outweigh the downsides?
> Should we keep batching as an interim measure until delta downloads are 
> implemented?
> Should dnf offer smart batched updates like gnome-software?
> Should we encourage maintainers to allow their updates to be batched?
>
> [1] https://github.com/fedora-infra/bodhi/issues/1157,
>     https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/UDXVXLT7JXCY6N7NRACN4GBS3KA6D4M6/
> [2] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/B6MMH3L36A2YXQ45Y4DUGMR4XIG7QKE5/
> [3] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/F36YMWKDXBHAQWQOLDSYLYTMDF4WYHE6/

Re: to batch or not to batch?

2018-02-18 Thread Zbigniew Jędrzejewski-Szmek
On Sun, Feb 18, 2018 at 11:40:47AM +0100, Fabio Valentini wrote:
> On Sat, Feb 17, 2018 at 11:15 PM, Zbigniew Jędrzejewski-Szmek
>  wrote:
> > Bodhi currently provides "batched updates" [1] which lump updates of
> > packages that are not marked urgent into a single batch, released once
> > per week. This means that after an update has graduated from testing,
> > it may be delayed up to a week before it becomes available to users.
> >
> > Batching is now the default, but maintainers can push their updates
> > to stable, overriding this default, and make the update available the
> > next day.
> >
> > Batching is liked by some maintainers, but hated by others.
> > Unfortunately, the positive effects of batching are strongly
> > decreased when many packages are not batched. Thus, we should settle
> > on a single policy — either batch as much as possible, or turn
> > batching off. Having the middle ground of some batching is not very
> > effective and still annoys people who don't like batching.
> 
> (snip)
> 
> > To summarize the ups (+) and downs (-):
> >
> > + batching reduces the number of times repository metadata is updated.
> >   Each metadata update results in dnf downloading about 20-40 mb,
> >   which is expensive and/or slow for users with low bandwidth.
> 
> This savings effect is negligible, because metadata has to be updated
> even if only 1 urgent security update is pushed to stable.

[FTR, it's any urgent update, security or not.]
Yes, but we don't have urgent updates every day. Even if we have them
every other day, that'd still be a 50% reduction in metadata downloads.

> > + a constant stream of metadata updates also puts strain on our mirrors.
> >
> > + a constant stream of updates feels overwhelming to users, and a
> >   predictable once-per-week batch is perceived as easier. In
> >   particular corporate users might adapt to this and use it to
> >   schedule an update of all machines at fixed times.
> 
> I'd rather see a small batch of updates more frequently than a
> large batch that I won't care to read through.

Yes, but I think you are in the minority. I'm pretty sure most users
don't bother reading descriptions. Of course it's hard to gauge this,
but I'd expect that with the frequency of updates in Fedora only very
dedicated admins can look at every package and every changelog. Most
people install the whole set and investigate only if something goes
wrong.

> > + a batch of updates may be tested as one, and, at least in principle,
> >   if users then install this batch as one, QA that was done on the
> >   batch matches the user systems more closely, compared to QA testing
> >   package updates one by one as they come in, and users updating them
> >   at a slightly different schedule.
> 
> Well, is any such testing of the "batched state" being done, and if it
> is, does it influence which packages get pushed to stable?

Sorry, I don't think we have any data on this. Maybe adamw and other
QA people can pitch in?
 
> > - batching delays updates of packages between 0 and 7 days after
> >   they have reached karma and makes it hard for people to immediately
> >   install updates when they graduate from testing.
> 
> This delay can be circumvented by maintainers by pushing directly to
> stable instead of batched (thereby rendering the batched state
> obsolete, however).

I meant that it is hard for *end-users*. Essentially, end users lose
control of the timing, even though individual maintainers can still
control the timing of their updates.

> > - some users (or maybe it's just maintainers?) actually prefer a
> >   constant stream of small updates, and find it easier to read
> >   changelogs and pinpoint regressions, etc. a few packages at a time.
> 
> I certainly belong to this group.
> 
> > - batching (when done on the "server" side) interferes with clients
> >   applying their own batching policy. This has two aspects:
> >   clients might want to pick a different day of the week or an
> >   altogether different schedule,
> >   clients might want to pick a different policy of updates, e.g. to
> >   allow any updates for specific packages to go through, etc.
> >
> >   In particular gnome-software implements its own style of batching, where
> >   it will suggest an update only once per week, unless there are security
> >   updates.
> 
> Which further delays the distribution of stable updates by up to a
> week (depending on the schedule of gnome-software, I didn't check
> that). That makes a total of up to 3 weeks (!).
> 
> > Unfortunately there isn't much data on the effects of batching.
> > Kevin posted some [2], as did the other Kevin [3] ;), but we certainly
> > could use more detailed stats.
> >
> > One of the positive aspects of batching — reduction in metadata downloads,
> > might be obsoleted by improving download efficiency through delta downloads.
> > A proof-of-concept has been implemented [4].
> 
> A simpler approach might be to just flush all 

Re: to batch or not to batch?

2018-02-18 Thread Artur Iwicki
> Batching is now the default, but maintainers can push their updates
> to stable, overriding this default, and make the update available the
> next day.
Since "batch override" doesn't push the package immediately, but rather 
schedules it for the next day, I agree with Fabio that it might be a good 
idea to flush the "batch queue" whenever a package is explicitly pushed to 
stable by someone. This wouldn't increase the number of metadata expirations 
- so there isn't really any drawback for end users - while allowing updates 
to reach users faster.
 
> + batching reduces the number of times repository metadata is updated.
>   Each metadata update results in dnf downloading about 20-40 mb,
>   which is expensive and/or slow for users with low bandwidth.
As someone with a rather small data cap, I'd say that heavy metadata downloads 
during "dnf update" are acceptable - since I can just choose to run "dnf 
update" only once a week or so. But it always irks me a bit when I want to 
install a new package and dnf starts downloading the repository metadata again. 
Bandwidth issues aside, it's just incredibly annoying having to wait for a 
40MiB download to complete before I can fetch a single 600KiB package.

> - batching delays updates of packages between 0 and 7 days after
>   they have reached karma and makes it hard for people to immediately
>   install updates when they graduate from testing.
I agree with Jerry here - many packages don't get any karma while testing. The 
only time my packages received testing karma was when I was introducing new 
packages; it didn't happen for updates. So having the package sit in limbo for 
another week after going through a week of "maybe someone'll take a look at 
this" is a bit discouraging.

> One of the positive aspects of batching — reduction in metadata downloads,
> might be obsoleted by improving download efficiency through delta downloads.
> A proof-of-concept has been implemented [4].
This could be a rather interesting feature, as it'd resolve some of the issues 
I wrote about two paragraphs above.

By the way - does drpm handling depend on repo / mirror settings? I ask because 
I'm under the impression that lately hardly any package update on my system is 
done via delta-RPMs; it's about 1-in-100 or so. Is this more a matter of me 
needing to tweak dnf config, or can this depend on the package mirrors?
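
For reference, the dnf-side knobs live in dnf.conf(5); a minimal sketch is 
below (the option names are the documented ones, and the values are just the 
usual defaults rather than a recommendation). Whether the repos actually carry 
the deltas is a separate question.

    # /etc/dnf/dnf.conf - client-side delta RPM settings
    [main]
    # fetch delta RPMs when available instead of full packages
    deltarpm=true
    # only use a delta if it is no larger than this percentage of the full RPM
    deltarpm_percentage=75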

A.I.


Re: to batch or not to batch?

2018-02-18 Thread Ralf Corsepius

On 02/17/2018 11:15 PM, Zbigniew Jędrzejewski-Szmek wrote:

> Bodhi currently provides "batched updates" [1] which lump updates of
> packages that are not marked urgent into a single batch, released once
> per week. This means that after an update has graduated from testing,
> it may be delayed up to a week before it becomes available to users.
> 
> Batching is now the default, but maintainers can push their updates
> to stable, overriding this default, and make the update available the
> next day.
> 
> Batching is liked by some maintainers, but hated by others.


As a maintainer, I hate it, because the primary effect of "batched" is 
to further increase the update delay from 1 week to 2 weeks or more 
(as in recent times).


That said, I consider "batched" to be superfluous bureaucracy.

Ralf




Re: to batch or not to batch?

2018-02-18 Thread Peter Oliver
> Did I miss something on the plus or minus side? 

+ Without batched updates, running “dnf update” gives a variable reward, like 
clicking refresh on Facebook or playing a fruit machine.  Accumulating updates 
into a batch gives a calmer, more ordered experience.

-- 
Peter Oliver


Re: to batch or not to batch?

2018-02-18 Thread Fabio Valentini
On Sat, Feb 17, 2018 at 11:15 PM, Zbigniew Jędrzejewski-Szmek
 wrote:
> Bodhi currently provides "batched updates" [1] which lump updates of
> packages that are not marked urgent into a single batch, released once
> per week. This means that after an update has graduated from testing,
> it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates
> to stable, overriding this default, and make the update available the
> next day.
>
> Batching is liked by some maintainers, but hated by others.
> Unfortunately, the positive effects of batching are strongly
> decreased when many packages are not batched. Thus, we should settle
> on a single policy — either batch as much as possible, or turn
> batching off. Having the middle ground of some batching is not very
> effective and still annoys people who don't like batching.

(snip)

> To summarize the ups (+) and downs (-):
>
> + batching reduces the number of times repository metadata is updated.
>   Each metadata update results in dnf downloading about 20-40 mb,
>   which is expensive and/or slow for users with low bandwidth.

This savings effect is negligible, because metadata has to be updated
even if only 1 urgent security update is pushed to stable.

> + a constant stream of metadata updates also puts strain on our mirrors.
>
> + a constant stream of updates feels overwhelming to users, and a
>   predictable once-per-week batch is perceived as easier. In
>   particular corporate users might adapt to this and use it to
>   schedule an update of all machines at fixed times.

I'd rather see a small batch of updates more frequently than a
large batch that I won't care to read through.

> + a batch of updates may be tested as one, and, at least in principle,
>   if users then install this batch as one, QA that was done on the
>   batch matches the user systems more closely, compared to QA testing
>   package updates one by one as they come in, and users updating them
>   at a slightly different schedule.

Well, is any such testing of the "batched state" being done, and if it
is, does it influence which packages get pushed to stable?

> - batching delays updates of packages between 0 and 7 days after
>   they have reached karma and makes it hard for people to immediately
>   install updates when they graduate from testing.

This delay can be circumvented by maintainers by pushing directly to
stable instead of batched (thereby rendering the batched state
obsolete, however).

> - some users (or maybe it's just maintainers?) actually prefer a
>   constant stream of small updates, and find it easier to read
>   changelogs and pinpoint regressions, etc. a few packages at a time.

I certainly belong to this group.

> - batching (when done on the "server" side) interferes with clients
>   applying their own batching policy. This has two aspects:
>   clients might want to pick a different day of the week or an
>   altogether different schedule,
>   clients might want to pick a different policy of updates, e.g. to
>   allow any updates for specific packages to go through, etc.
>
>   In particular gnome-software implements its own style of batching, where
>   it will suggest an update only once per week, unless there are security
>   updates.

Which further delays the distribution of stable updates by up to a
week (depending on the schedule of gnome-software, I didn't check
that). That makes a total of up to 3 weeks (!).

> Unfortunately there isn't much data on the effects of batching.
> Kevin posted some [2], as did the other Kevin [3] ;), but we certainly
> could use more detailed stats.
>
> One of the positive aspects of batching — reduction in metadata downloads,
> might be obsoleted by improving download efficiency through delta downloads.
> A proof-of-concept has been implemented [4].

A simpler approach might be to just flush all batched updates to
stable if there is at least one update (possibly an urgent security
update) anyway. That way, the metadata doesn't have to be downloaded for
just one update, and all packages reach stable sooner.
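
In pseudo-Python, the proposed policy would be something like the sketch 
below. This is just an illustration of the idea, not Bodhi's actual code or 
API; the Update type and function name are made up.

    from collections import namedtuple

    # Minimal stand-in for an update in the push queue; "request" is either
    # "stable" (urgent, must go out now) or "batched" (waiting for the batch).
    Update = namedtuple("Update", "title request")

    def select_updates_for_push(pending):
        """If anything must go out now, flush the whole batched queue with it."""
        urgent = [u for u in pending if u.request == "stable"]
        batched = [u for u in pending if u.request == "batched"]
        # The metadata gets regenerated for the urgent push anyway, so pushing
        # the batched queue at the same time causes no extra metadata churn.
        return urgent + batched if urgent else []

    # Example: one urgent security fix drags the two batched updates along.
    queue = [Update("kernel-4.15.4-300", "stable"),
             Update("foo-1.2-1", "batched"),
             Update("bar-3.4-2", "batched")]
    print([u.title for u in select_updates_for_push(queue)])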

> The second positive aspect of batching — doing updates in batches at a
> fixed schedule, may just as well be implemented on the client side,
> although that does not recreate the testing on the whole batch, since
> now every client is doing it at a different time. It's not clear though
> if this additional testing is actually useful.

Well, the whole testing/installing batches of updates sounds a lot
like what Atomic Workstation is doing (which I really like).
However, forcing the same kind of process onto the current way of
doing things (with individual updates and packages) doesn't seem to
make anybody happy right now ...

> There's an open FESCo ticket to "adjust/drop/document" batching [5].
> That discussion has not been effective, because this issue has many
> aspects, and depending on priorities, the view on batching is 

Re: to batch or not to batch?

2018-02-17 Thread Jerry James
On Sat, Feb 17, 2018 at 3:15 PM, Zbigniew Jędrzejewski-Szmek
 wrote:
> Did I miss something on the plus or minus side? Or some good statistics?

I thought that was a pretty good summary of the ups and downs.  Thank
you for that.  I have no statistics.  But you also asked for anecdotal
evidence, and I've got that in spades. :-)

> Does batching make Fedora seem more approachable to end-users?

Not this end-user.  A little over 2 decades ago (good grief, am I
really that old?) I installed Linux for the first time.  It was
Slackware.  But then, in grad school, I was doing some cutting edge
work that required cutting edge tools, and Slackware didn't have new
enough versions of the tools I needed.  I went looking for a new
distribution, and everybody said that if I wanted the latest stuff, I
should install RedHat, so I did.  That was RedHat 4.2, and I updated
to every release of RedHat thereafter, and followed up by migrating to
Fedora when that transition took place.  Every version of Fedora since
then has run on at least one machine under my care, and often multiple
machines.

I came to RedHat/Fedora in the first place because it had the latest
shiny stuff.  That's the draw of Fedora for me.

> Do the benefits of batching outweigh the downsides?

Not for me, they don't.  I'm willing to believe that they do for some
users, but I frankly don't know which users that would be.  Perhaps
those with limited bandwidth?

> Should we keep batching as an interim measure until delta downloads are 
> implemented?

I'm indifferent on this one.  I find batching to be more annoying than
otherwise, but if it helps with server load and mirroring then I guess
I can't complain too loudly.

> Should dnf offer smart batched updates like gnome-software?

That seems reasonable, as long as there is a way to override it.
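
For what it's worth, something close to client-side weekly batching can 
already be approximated with dnf-automatic plus a systemd timer override; a 
rough sketch follows (the Monday 06:00 schedule is arbitrary, and whether 
this counts as "smart" is debatable).

    # /etc/dnf/automatic.conf - let dnf-automatic install updates on its own
    [commands]
    apply_updates = yes

    # systemctl edit dnf-automatic.timer - pick your own "batch day";
    # the empty OnCalendar= clears the packaged schedule before setting a new one.
    [Timer]
    OnCalendar=
    OnCalendar=Mon *-*-* 06:00

    # then enable it with: systemctl enable --now dnf-automatic.timer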

> Should we encourage maintainers to allow their updates to be batched?

The answer to this question must depend on the answers to the previous
questions.  I'll give you my personal experience again.  I've been
letting my updates go out with the batched updates.  I don't like it,
though.  The packages I maintain get karma very, very rarely.  Almost
always, the only testing those packages get prior to going out to the
wide world is the testing I do before running fedpkg build.  So they
already have to pointlessly sit in testing for 7 days, getting no
actual testing, and then they have to wait up to another week to go
out.  If I'm fixing actual bugs, then all that delay accomplished was to
increase the probability that yet another user will run afoul of the
bug that I already fixed.

Bottom line: I don't like batching as either an end-user or as a
package maintainer, but I'm willing to put up with it if that is what
the project as a whole needs.

Regards,
-- 
Jerry James
http://www.jamezone.org/