David Golden wrote:
On Jan 18, 2008 8:09 AM, David Cantrell <[EMAIL PROTECTED]> wrote:
I would like a plain-text log alongside the indexed one - it's easier to
look at from shell scripts so I can do things like compare the log to
the list of distributions I'm testing and see where a smoker broke.

Out of curiosity, do you clear out your log file periodically?  If
you're really up to ~80K reports, I've got to think that your log
files are getting huge.

My biggest one is 2MB, recording all the reports I've sent from Linux. I don't clear them out. If something is a common prerequisite but always fails on 5.6.2, like WWW::Mechanize does, then I really shouldn't clear that out of my logs and hassle Andy again about a version that I've already tested. He already knows about the test failure.

If they do ever get too big, then I suppose I could write a little script to strip out anything that's been superseded on the CPAN.
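
Something like the following is roughly what I have in mind. It is only a sketch: the log format (one report per line, with the distribution name and version as the first whitespace-separated field) and the `latest` lookup of current CPAN versions are both assumptions made up for illustration, not what any of the reporting tools actually write.

```python
def split_dist(distname):
    """Split 'Foo-Bar-1.23' into ('Foo-Bar', '1.23') at the last hyphen."""
    name, _, version = distname.rpartition("-")
    return name, version


def strip_superseded(log_lines, latest):
    """Drop log entries for versions that are no longer the latest.

    log_lines: iterable of log lines (hypothetical format, see above).
    latest:    dict mapping distribution name to its current version.
    Entries for distributions missing from `latest` are kept, since we
    cannot tell whether they have been superseded.
    """
    kept = []
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, version = split_dist(fields[0])
        if latest.get(name, version) == version:
            kept.append(line)
    return kept


# Made-up sample data, purely for illustration.
sample_log = [
    "Foo-Bar-1.22 FAIL 5.6.2",
    "Foo-Bar-1.23 FAIL 5.6.2",
    "Baz-0.01 PASS 5.8.8",
]
sample_latest = {"Foo-Bar": "1.23"}
pruned = strip_superseded(sample_log, sample_latest)
```

With that sample data the Foo-Bar-1.22 line goes and the other two stay, so a failure recorded against the current version is never thrown away.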

It might also be useful to have OS (and version) and hostname - the
former to cope with OS upgrades on a machine, which would make sending
the reports again a legitimate thing to do, the latter for the case
where a home directory is mounted over NFS and shared between several
smoke boxes.

The OS/version are part of the "unique" characteristics of a report
already so those have to go in.  Hostname seems a bit more like
overkill.  I mean, if you test Foo-Bar-1.23 on one machine, do you
really want to be testing it again on the same perl/arch/os but just a
different hostname?

I suppose not.  Having the OS/version is probably sufficient.

Filename length limits. Case-sensitivity.  Consumption of vast numbers
of inodes.  That last one is a killer.  If we have 30,000 test reports
in the database, each with some combination of:
  author/dist/version/perl/epoch/grade/platform/hostname
then that's [tappity-tap] 240,000 inodes.

Inodes.  Right.  Ick.  I'm not sure I buy the math, but inode
consumption could be relevant -- particularly given the number of
reports being submitted by the leaders.

Thinking about it, I don't buy the maths either :-) You could keep the number down by carefully ordering the components in the path to restrict the number of directories created - keep that which varies the least at the beginning, like architecture, perl version, grade, and that which varies the most - distribution and epoch - at the end.

An inode is consumed for every file and every directory.
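
To put rough numbers on the ordering argument, here is a small sketch that counts one inode per distinct directory and one per file for a given ordering of path components. The component names and the toy data set (one architecture, two perls, a hundred distributions) are assumptions for illustration only, and the root directory is not counted.

```python
def count_inodes(reports, order):
    """Count distinct path prefixes (directories plus leaf files).

    reports: list of dicts, one per report, keyed by component name.
    order:   tuple of component names, outermost directory first.
    """
    seen = set()
    for report in reports:
        path = tuple(report[key] for key in order)
        # Every prefix of the path is a directory, except the full
        # path, which is the report file itself. Each costs one inode.
        for depth in range(1, len(path) + 1):
            seen.add(path[:depth])
    return len(seen)


# Toy data: 1 arch x 2 perls x 100 dists = 200 reports.
reports = [
    {"arch": "linux", "perl": perl, "dist": "Dist-%d" % i}
    for perl in ("5.6.2", "5.8.8")
    for i in range(100)
]

least_varying_first = count_inodes(reports, ("arch", "perl", "dist"))
most_varying_first = count_inodes(reports, ("dist", "perl", "arch"))
print(least_varying_first, most_varying_first)  # 203 500
```

So for the same 200 reports, putting the slow-changing components first costs 203 inodes against 500 the other way round.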

I think CPAN::YACSmoke uses SDBM_File, but
from MJD's presentation on lightweight databases, it looks like it
might have issues as the number of keys gets into the thousands.
   http://perl.plover.com/yak/lightweight-db/materials/slides/slide077.html

I don't know if DBM::Deep has similar issues.

It seems to cope with Number::Phone::UK::DetailedLocations OK, which has about a quarter of a million records in a __DATA__ section.

--
David Cantrell | Minister for Arbitrary Justice

Computer Science is about lofty design goals and careful algorithmic
optimisation.  Sysadminning is about cleaning up the resulting mess.
