On Wed, May 13, 2009 at 09:25:17AM -0400, David Golden wrote:

> My null hypothesis at the moment is
> 
>     ./history/$perlversion-$archname/history.db
>         PASS.db
>         FAIL.db
>         UNKNOWN.db
>         NA.db
> 
> Where each db is a sorted list of distfile name and associated GUIDS
> of that grade (including multiples if that is allowed):
> 
>     DAGOLDEN/File-Marker-0.13.tar.gz {GUID} {GUID} {GUID}
> 
> That would make checking for a duplicate report very fast -- binary
> search in the right grade file.

For binary search on variable-width records, you need to:
  seek to the approximate middle of the range;
  rewind or fast-forward to an actual record boundary;
  read, wash, rinse, repeat
or else have fixed-width fields.  But the gods like to punish people who
arbitrarily restrict their data thus.
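
That seek-and-realign dance is workable, mind.  A minimal sketch against
the plain-text layout quoted above ("distfile GUID GUID ..." per sorted
line); the function name and the final linear scan over the narrowed
window are my own choices, not anything from the proposal:

```perl
use strict;
use warnings;

# Binary search a sorted text file of variable-width records.
# Returns the matching line, or undef if the distfile isn't present.
sub find_record {
    my ($file, $key) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    my $size = -s $fh;
    my ($lo, $hi) = (0, $size);    # $lo always sits on a record boundary
    while (1) {
        my $mid = int(($lo + $hi) / 2);
        last if $mid <= $lo;
        seek $fh, $mid, 0;
        <$fh>;                     # discard partial line: realign to a boundary
        my $pos  = tell $fh;
        my $line = ($pos < $hi) ? <$fh> : undef;
        if (defined $line) {
            my ($k) = split ' ', $line;
            # Target starts strictly after this record: move $lo up.
            if ($k lt $key) { $lo = $pos + length $line; next; }
        }
        $hi = $mid;                # target can't start at or after $mid's record
    }
    # Linear scan of the (now tiny) remaining window.
    seek $fh, $lo, 0;
    while (defined(my $line = <$fh>)) {
        my ($k) = split ' ', $line;
        return $line if $k eq $key;
        last         if $k gt $key;
    }
    return undef;
}
```

The trick is that $lo only ever advances to a known record boundary, so
the closing scan is always correct even when the halving lands mid-record.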

Perhaps it makes sense to use a binary format like GDBM_File.  People
who need plain-text data for their shell scripts can trivially dump that
back out, and GDBM_File has always been in core.

GDBM_File doesn't, of course, let you store an array of GUIDs, but a
space-separated list would probably do the job just fine.  If you really
need structured data, DBM::Deep is the way to go, at the expense of
adding a non-core module to the dependencies.  Of course, you could
still rename it to CPAN::Testers::DBM::Deep like I did with
Number::Phone to avoid "polluting" testers' machines with an unnecessary
module.
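
For the GDBM_File route, something like this would do -- a sketch only,
assuming GDBM_File builds on your machine; the file name, helper names
and the skip-duplicates policy are mine:

```perl
use strict;
use warnings;
use GDBM_File;

# Tie a hash to an on-disk GDBM file: one key per distfile, the value a
# space-separated list of GUIDs.  'PASS.db' is illustrative.
tie my %pass, 'GDBM_File', 'PASS.db', &GDBM_WRCREAT, 0640
    or die "tie PASS.db: $!";

# Record a GUID for a distfile, skipping duplicates.
sub add_guid {
    my ($db, $dist, $guid) = @_;
    my @guids = split ' ', ($db->{$dist} // '');
    return if grep { $_ eq $guid } @guids;
    $db->{$dist} = join ' ', @guids, $guid;
}

# Fetch the GUIDs (if any) recorded for a distfile.
sub guids_for {
    my ($db, $dist) = @_;
    return split ' ', ($db->{$dist} // '');
}

add_guid(\%pass, 'DAGOLDEN/File-Marker-0.13.tar.gz', 'example-guid-1');
untie %pass;
```

Lookup is then a single hashed fetch, no sorting or seeking required,
and dumping back to plain text is a one-line while loop over the tied
hash.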

> Since getting smoker speedups depends on not retesting distributions
> with a known result

You still need to test (to find conflicts with other recently installed
modules) and install common dependencies every time if you test against
a reasonably clean perl install.  The only thing you can reliably skip
is generating and sending the report.

>                     optimizing for search seems to make sense for me.
> Writing a new result is slow due to the sort, but that's the
> tradeoff.

That has the disadvantage of really hammering the network if the logs
are kept on NFS.  Mine are, on some of the machines I use, and until I
moved the perl I was testing with (and so on) into /tmp the sysadmins
got rather annoyed at me.  I can't really move the logs into /tmp (and
back to $HOME at the end of a session), because then they wouldn't be
shared between instances running on different machines but sharing the
same $HOME.

> >> I think it makes sense to allow the CT client config file to have
> >> "sections" for automated testing clients, but that change may take a while
> >> to happen (if it happens at all).
> > Not sure what you mean by this.
> In YAMLish-yadda-yadda terms:
>     global:
>         profile: myprofile.json
>         ...
>     CPAN::Reporter::Smoker:
>         status_file: ~/smoking.txt
>         timeout: 3600
>         ...
>     POE::Component::BinGOs::Skynet::Smoker:
>         queue_module: ...

Ah, OK.  That makes perfect sense.

-- 
David Cantrell | Bourgeois reactionary pig

Awww, people say the sweetest things:

18:40 <@danshell> DrHyde: you sick fuck
