The portions of my site I want to index are mailing list archives: about 128
months' worth, growing at a rate of ten per month.  I've written a few scripts
to handle digging and merging the site in month-size blocks.

Digging goes fine.  I can dig everything, and dump each successive month's
worth of dug results into a unique folder in a staging area.  Then the merge
scripts kick in, attempting to serially merge the contents of every directory
in the staging area with the master index.  This works for a while, and then
it claims to run out of disk space.  I get these results from htmerge:

/: write failed, file system is full
/usr/bin/sort: write error: No space left on device
htmerge: Word sort failed

Once I've seen one of these errors, subsequent merge attempts will fail as
well, until I delete them and start over -- but as near as I can tell, disk
space is not the problem.  I can create files cheerfully on the volume in
question; df shows it to be at 59% capacity with a few hundred MB free.  
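
One way to compare free space on the index volume with the places where
sort may be writing its temporary files is sketched below.  It assumes
that sort puts scratch files in $TMPDIR or /tmp, which is a guess about
this particular setup and not something htmerge's output confirms; the
root file system is included because the first error line names "/".
The function names are made up for illustration.

```python
import os
import shutil
import tempfile


def candidate_temp_dirs():
    """List directories where sort or similar tools may write scratch files.

    Assumption: the usual suspects are $TMPDIR, /tmp, /var/tmp and the
    root file system itself. Adjust for the actual server layout.
    """
    candidates = [os.environ.get("TMPDIR"), tempfile.gettempdir(),
                  "/tmp", "/var/tmp", "/"]
    unique = []
    for path in candidates:
        if path and path not in unique:
            unique.append(path)
    return unique


def free_space_report(paths):
    """Return a dict mapping each existing path to its free bytes."""
    report = {}
    for path in paths:
        if os.path.exists(path):
            report[path] = shutil.disk_usage(path).free
    return report


if __name__ == "__main__":
    for path, free in free_space_report(candidate_temp_dirs()).items():
        print(f"{path}: {free / 2**20:.0f} MB free")
```

Adding the staging area and master index directories to the list would
show whether they sit on a different file system from the temporary
directory.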

My ISP, which hosts the FreeBSD server I'm running on, is stonewalling me, but
the only answer I was reasonably hoping to get from them is "yes, there's a
problem with that disk," and if it were that simple I think I'd have gotten an
answer.  Their only other advice has been to use multiple small databases,
which I suppose means they're thinking of some other search engine.

Do phantom out-of-disk-space errors sound familiar to anyone?  Is there a
characteristic of htmerge which might cause it to think it was running out of
disk space (such as requiring a fantastic amount of temporary disk space)? 
I've had the problem kick in at index sizes from 40MB to about 90MB; I expect
the full index when complete will be about 200-250MB.  I'm using htdig 3.1.2.

Thanks!

  -nat

-- 
Nat Irons                                          [EMAIL PROTECTED]
Apple Computer, WWDR &c                               408-974-0915
PGP fingerprint: 873D 7978 23FC 37FE 10D5 349A F57F 0FAA F4D4 B19A
"The land we belong to, belongs to the bank."  -  Garrison Keillor

