Daryl C. W. O'Shea writes:
> I think I've got this fixed when not running in --cs_paths_only mode.  I 
> couldn't break it or cause it to hang/loop in a couple quick tests.

yay ;)

> >> What's causing the messages to disappear during the mass-check run?
> > 
> > probably the corpus being updated via rsync.  it's a very big corpus.
> 
> To avoid this in a probably nearly identical setup I "tag" the corpus by 
> making a linked duplicate of it for that particular mass-check run and 
> then delete the linked copy when the server exits.  "cp -al" is your friend.

This may be a good option.  I'd prefer if mass-check was just resilient,
though. ;)
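
Roughly what I mean by "resilient", as a sketch in Python rather than anything taken from mass-check itself: a message that has vanished between listing and reading is skipped and noted, not treated as fatal. `read_surviving` is a made-up name for illustration only.

```python
def read_surviving(paths):
    """Read every message that still exists; collect the ones that vanished.

    Hypothetical helper, not part of mass-check.
    """
    found, missing = [], []
    for path in paths:
        try:
            with open(path, "rb") as fh:
                found.append((path, fh.read()))
        except FileNotFoundError:
            # The file was removed (e.g. by an rsync update) after it was listed.
            missing.append(path)
    return found, missing
```

The caller can then log `missing` and carry on with the rest of the run.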

The zone's nightly-mc corpora (the uploaded ones) are this big (in KB):

  2       /export/home/bbmass/rawcor/doc
  19760   /export/home/bbmass/rawcor/fredt
  6764040 /export/home/bbmass/rawcor/jm (mostly spam, since May 2007)
  209393  /export/home/bbmass/rawcor/zmi

so that's pretty big.  In terms of disk space, a cp -al copy probably
wouldn't take much, since hard links share the file data; but it'd take a
fair bit of time, especially on the zone, which has serious I/O
bottleneck problems.
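
To make the space/time tradeoff concrete, here is a minimal Python sketch of the same idea as "cp -al": rebuild the directory tree and hard-link each file into it. It assumes source and destination are on the same filesystem and that the destination is not inside the source; `link_snapshot` is a hypothetical name, not an existing tool.

```python
import os

def link_snapshot(src, dst):
    """Mirror the tree under src into dst, hard-linking every regular file.

    Hypothetical helper. Returns the number of files linked.
    """
    linked = 0
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # No file data is copied: both names point at the same inode.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))
            linked += 1
    return linked
```

No message bodies are duplicated, but every directory and file still has to be visited once, which is where the time goes on an I/O-bound box.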

> As an aside, if bandwidth is free, the whole mass-check will run quite a 
> bit faster if you rsync the corpus to each of the slaves.  Of course 
> that assumes you've got the disk space and i/o to spare (i/o you may 
> already have if /tmp isn't a ramdisk).

yeah, rsyncing about 7GB of corpora, nightly, would definitely be slow ;)

--j.
