On Jan 18, 2008 10:57 AM, Barbie <[EMAIL PROTECTED]> wrote:
> YACSmoke does indeed use SDBM_File, which is what Rob chose to use as
> he was familiar with it. I would be happy to look at a common DB for
> both YACSmoke and C::R, as I'd like to aim towards common dependencies.
>
> If DBM::Deep is the most appropriate, then possibly
> CPAN::Testers::DBM::Deep is needed :)

I suspect it's overkill, plus it means bugfixing a dead development
branch if we find stuff.

Overall, linear search through a file isn't too bad:

* "not tested" requires an exhaustive search anyway -- reading all
lines of the file; and this is the typical case
* some common dependencies will get searched repeatedly -- this could
probably be handled with a cache
* every test requires adding a record -- appending to a file is vastly
faster than adding to a sorted file or index
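
To make those three points concrete, here is a rough sketch in Python (purely illustrative; the real thing would be Perl). The `FlatLog` class and the one-record-per-line "grade dist" format are assumptions made for this example, not the format CPAN::Reporter actually writes.

```python
import os


class FlatLog:
    """Append-only flat-file log with linear search and a small cache.

    Hypothetical record format: one line per test, "<grade> <dist>".
    """

    def __init__(self, path):
        self.path = path
        self._cache = {}  # dist name -> list of grades already found

    def append(self, dist, grade):
        # Adding a record is a plain append: no sorting, no index upkeep.
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(f"{grade} {dist}\n")
        self._cache.pop(dist, None)  # drop any stale cached answer

    def lookup(self, dist):
        # Repeated lookups (e.g. common dependencies) are served from cache.
        if dist in self._cache:
            return list(self._cache[dist])
        grades = []
        try:
            with open(self.path, encoding="utf-8") as fh:
                # "Not tested" means reading every line before giving up.
                for line in fh:
                    grade, _, name = line.rstrip("\n").partition(" ")
                    if name == dist:
                        grades.append(grade)
        except FileNotFoundError:
            pass  # no log yet: nothing has been tested
        self._cache[dist] = grades
        return list(grades)
```

An empty list from `lookup` corresponds to the "not tested" case; the cache is invalidated per distribution on `append`, so a fresh result is picked up on the next lookup.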

My current thinking -- given concerns about path length, inodes, etc.
-- is to use the flat-file approach I use today, but to partition the
flat files by perl version and then by author, the same way that CPAN
lays out author directories:

 logs/5.10.0/D/DA/DAGOLDEN.txt
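
As a sketch of how such a path could be derived (Python for illustration only; `log_path` and the `root` default are hypothetical names, not an existing API):

```python
from pathlib import PurePosixPath


def log_path(perl_version, author, root="logs"):
    """Build a CPAN-style partitioned log path for one author.

    Layout: <root>/<perl version>/<A>/<AB>/<AUTHORID>.txt
    """
    author = author.upper()
    if len(author) < 2:
        raise ValueError("author ID needs at least two characters")
    return PurePosixPath(
        root, perl_version, author[0], author[:2], f"{author}.txt"
    )
```

With these assumptions, `str(log_path("5.10.0", "DAGOLDEN"))` gives `logs/5.10.0/D/DA/DAGOLDEN.txt`, matching the example path above.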

At that level of granularity, the file will be a fraction of the size
of the full, merged log file -- hundreds of lines for the most
prolific authors (# dists x # releases tested x # platforms) rather
than thousands or tens of thousands of lines.  So search times will be
cut dramatically for only a small increase in path-length.

So at the risk of premature optimization, that seems a reasonable
trade-off: partitioning buys speed, while plain flat files keep
portability and robustness with minimal demand on the file system.
The downside would be searches across distributions -- e.g. all
distributions that had tests discarded.  But since there was also a
request for a merged flat file to see where tests might have stopped,
that merged file is probably the thing to search for infrequent
queries like discarded distributions.

I suppose the proper thing to do would be to write this all up as
CPAN::Testers::Log -- and have CPAN::Reporter use that as a
dependency.  Then CPAN::YACSmoke could switch over at some point.
(I'm thinking of calling it "Log" rather than "DB" so that "DB" can be
used for the central repository.)

David
