Re: Thoughts / Ramblings

Barbie Thu, 04 Sep 2008 03:37:10 -0700

On Wed, Sep 03, 2008 at 10:39:07PM -0400, David Golden wrote:
> On Wed, Sep 3, 2008 at 7:18 PM, Barbie <[EMAIL PROTECTED]> wrote:
> 
> > Another irritation for authors is to receive reports for distributions
> > that have since been fixed. This has been much more of a problem in
> > recent times as many of the testers have recently built new environments
> > and are testing all of CPAN. I don't see the value of sending authors
> > reports for distributions that have since got a PASS report. This is
> 
> If you mean don't test 1.01 when 1.02 has a PASS, then I agree.


Exactly. This was how CPAN-YACSmoke was designed to work from the start.
It can test earlier versions, but by default it tries to find the most
recent distribution that will PASS, reporting as appropriate for the
later releases.
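
To make that walk-back concrete, here is a rough sketch in Python (not
the actual CPAN-YACSmoke code, which is Perl). The function name and the
run_tests callback are hypothetical, and it assumes the releases are
already ordered newest first.

```python
from typing import Callable, Iterable, List, Tuple


def smoke_back_to_pass(
    releases: Iterable[str],
    run_tests: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Test releases newest-first and stop at the first PASS.

    `releases` is assumed to be ordered newest to oldest.
    `run_tests` is a hypothetical callback returning a grade string
    such as "PASS" or "FAIL" for one release.
    """
    reports = []
    for release in releases:
        grade = run_tests(release)
        # Every release tested on the way down gets a report.
        reports.append((release, grade))
        if grade == "PASS":
            # Anything older than the first PASS is left untested.
            break
    return reports
```

With releases 1.03 (FAIL), 1.02 (PASS) and 1.01, this would report on
1.03 and 1.02 only and never touch 1.01.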

> My
> approach with CPAN::Reporter::Smoker is to test only the latest
> release -- even if "latest" was 5+ years ago.  I think we should
> discourage testing older distributions entirely.

If it's on CPAN, then perhaps it's there for a reason. In some cases it
is there simply because the author never deletes their back catalogue.
In those cases the additional flood of FAILs has helped to clean up CPAN
a bit, but that is an edge use case ;)

I would rather we keep to the process of testing from the most recent
version back to the first version that PASSes. It's nice to have that
historical data from a statistical analysis perspective, but for authors
and users it serves no real benefit.

> Note that this is potentially something that we could "enforce" by
> checking if a distribution is no longer the most current and
> discarding reports that refer to older releases.

This is what I was getting at with the way CPANPLUS looks at
the YAML file.

> > We have all pretty much agreed that having a central alerting system,
> > that an author can set preferences against, is a good thing.
> 
> I'd rather see a smart central remailer that watches the incoming
> reports and emails an author about the *first* FAIL for a
> distribution-platform-perl tuple.  The rest can go into the stats, but
> we can spare the author more than one FAIL report per platform per
> release.

I've had a thought about that on the drive in this morning. At the
moment we do have a central place where all the reports are parsed, the
CPAN Testers database. This is parsed currently once a day, but my plan
was to try and move this to perhaps 4 times a day once all the current
work was done. Integrating a mailer at this point would probably be a
good prototype for the full preferences system.
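
As a rough illustration of the "first FAIL per tuple" filter such a
mailer would need, something like the following Python sketch would do.
The record layout (dist, version, platform, perl, grade keys) is purely
an assumption for the example, not the real database schema.

```python
from typing import Dict, Iterable, Iterator


def first_fails(reports: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield only the first FAIL seen for each release/platform/perl tuple.

    Each report is a dict with hypothetical keys: "dist", "version",
    "platform", "perl" and "grade". Reports are assumed to arrive in
    the order they were received.
    """
    seen = set()
    for report in reports:
        if report["grade"] != "FAIL":
            continue  # non-FAIL grades go to the stats only
        key = (
            report["dist"],
            report["version"],
            report["platform"],
            report["perl"],
        )
        if key in seen:
            continue  # the author has already been told about this one
        seen.add(key)
        yield report
```

Everything would still go into the stats; only the reports yielded here
would trigger a mail.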

I already have tools that prepare mails for authors who upload badly
formatted distributions, and for testers to highlight reports that are
badly formatted; they're just not fully automated yet. It shouldn't be
difficult to adapt them to send authors mails for FAILs.

However, I think I can go even better than that, if my brainwave this
morning isn't faulty. Running the mailer after the database gets updated
means that rather than sending the author a mail for every report, with
the contents of the report, we can send them an aggregated mail of all
the FAILs for all their distributions since the last update. But instead
of including the report content, we simply include a link to the current
NNTP copy. This can later be updated to allow them to retrieve the
report from the CT2.0 storage.

The upshot is that instead of a flood they get 1 mail, with everything
in. They then have the option to review it or junk it.
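
A minimal sketch of that aggregation step, in Python for illustration
only. The record keys, the use of an increasing report id to decide what
is new "since the last update", and the REPORT_URL placeholder are all
assumptions for the example, not the real schema or link format.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Placeholder only: the real link format for the NNTP copy is not shown here.
REPORT_URL = "http://example.invalid/reports/{id}"


def build_digests(reports: Iterable[dict], last_id: int) -> Dict[str, str]:
    """Build one digest mail body per author from FAILs newer than last_id.

    Each report is a dict with hypothetical keys "id", "author",
    "dist" and "grade". Report ids are assumed to increase over time.
    """
    # author -> distribution -> list of report ids
    fails = defaultdict(lambda: defaultdict(list))
    for report in reports:
        if report["grade"] == "FAIL" and report["id"] > last_id:
            fails[report["author"]][report["dist"]].append(report["id"])

    digests = {}
    for author, dists in fails.items():
        lines = [f"FAIL reports since the last update for {author}:"]
        for dist in sorted(dists):
            lines.append(f"  {dist}")
            for report_id in sorted(dists[dist]):
                # Link to the report rather than including its content.
                lines.append("    " + REPORT_URL.format(id=report_id))
        digests[author] = "\n".join(lines)
    return digests
```

Each author with at least one new FAIL gets exactly one body back, with
one link per report grouped by distribution; authors with nothing new
get no mail at all.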

I'll see what I can put together over the next few days.

This also ties in with something else that I have been planning. The
current CPAN Testers articles database is fast approaching 20GB, which
is rather large for the VM that the sites run off at the moment. As such
I have been looking at rehosting it all. I'd rather not say too much, as
nothing has been agreed yet, but potentially a 2x100GB RAID high
performance server may become available. The intention would be to host
all the CT stuff on it. Including the CT2.0 HTTP APIs :)

> However, I could add the CC suppression logic directly into
> Test::Reporter

Seems sensible, and the YAML did seem to work well when there were only
20 or so testers. With the current daily update, authors may well still
get a flood, which is why I want to look at doing a more frequent update,
and another reason why a new high performance server will be very
welcome :)

> I still think we should kill all CC'ing at the Test::Reporter level
> and set up a central remailer.  It's crude compared to what we want
> for 2.0, but it'll be easier to improve and fix over time than having
> to get everyone to upgrade their clients yet again.

It'll be a prototype ;)

> I understand how you can be drained -- that's a lot of work without
> (yet) many kudos.

The thing for me isn't the kudos. But I do know that others appreciate
it. For me I just want to get it done. I'm happy to receive constructive
criticism, and fresh ideas and inspiration, that's what drives me.
Perhaps that's why I take unconstructive criticism so badly :(

> But I really do think that improving the reporting
> and stats will be seen as a positive and if we can quickly follow up
> with some triage to the false alarms, then I think we'll get some
> credit for our responsiveness and zeal.

Maybe. I'll see how I feel over the weekend.

Cheers,
Barbie.
-- 
Birmingham Perl Mongers <http://birmingham.pm.org>
Memoirs Of A Roadie <http://barbie.missbarbell.co.uk>
