On 2 October 2013 08:51, Peter Stuge <[email protected]> wrote:

> I agree, but I think the problem is basically that many people
> consider it impossible for "newer" to ever be shitty.
>
> Even if they are intimately familiar with the details of a package
> upstream they may still not be capable of determining what is shitty
> in the context of a distribution.
>
> I guess most stabilization requesters as well as actual stabilizers
> are even less familiar with upstream details, so determining what is
> shitty and not is really hard.. :\
>


The other really annoying thing you have to consider is that most people
out there are running either all (~ARCH) or all (ARCH) keywording, not a mix
of the two.

^ This has the fun side effect that a package which is currently (~ARCH) but
has (ARCH) dependents, and is pending stabilization, has likely had nobody
test it at all except for arch testers.

So if you're relying on the presence of filed bugs to give some sort of
coverage metric, you're going to be out of luck from time to time. For
instance, that fun bug where stabilizing a version of libraw broke the
already-stable things depending on it.

It's ironic really, that's essentially a hidden bug that exists as a result
of having two tiers of testing.

https://bugs.gentoo.org/show_bug.cgi?id=458516

I've long been wishing there was a FEATURE I could turn on that would just
report installs to a database somewhere, showing
success/fail/success+tests/fail+tests, with dependencies, useflags, and
other relevant context, so you'd at least have a dataset of *success*
rates, and you could do heavier testing on things where there were fail
results, or an absence of results.
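To make that concrete, here's a minimal sketch of what one such report could
look like, assuming Python 3 and a purely hypothetical collection endpoint;
none of the field names, values, or the URL correspond to any existing
Portage FEATURE or API, they're just illustrative:

#!/usr/bin/env python3
# Sketch: build an install report and POST it as JSON to a hypothetical
# collection server. All field names and the URL are illustrative only.
import json
import urllib.request

report = {
    "cpv": "media-libs/libraw-0.15.4",      # package/version that was built
    "result": "fail",                       # success | fail | success+tests | fail+tests
    "use": ["jpeg", "lcms", "-openmp"],     # USE flags in effect
    "features": ["test"],                   # relevant FEATURES, e.g. whether tests ran
    "deps": ["media-gfx/ufraw-0.19.2"],     # installed versions of direct dependencies
    "arch": "amd64",
    "keywords": "~amd64",                   # stable vs testing keywording on this box
}

req = urllib.request.Request(
    "https://reports.example.org/submit",   # hypothetical endpoint
    data=json.dumps(report).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)

The interesting part isn't the transport, it's that the payload carries
enough context (USE flags, dependency versions, keywording) to group
failures by combination rather than just by package.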

CPAN has a comparable service that leverages end users reporting test runs
on code as they install it, and some end users, given this power, go out
and set up mass automated testing boxes. I find it useful on a daily basis,
because a user is far more likely to let an automated success/fail report
be sent to a server, and far *less* likely to set time aside to go through
the rigmarole of filing a bug report.

Some users are also inclined to just try a few versions either side, or to
twiddle USE flags or disable problematic FEATURES until they find a version
that works for them, and you may never see a bug report for any of that.

An automated "X combination failed" report at the very least gives you a
datapoint where a failure occurred.

I'm not saying we should do any automated decision making *based* on
those reports, but it would be far more useful to have a list of known
failure cases up front than to ask a tester to get lucky and find them by
looking for them.

( After all, bugs often arise when you're not looking for them, as opposed
to when you are, and some bugs, when looked for, vanish )

-- 
Kent
