My test run with 3.1.6 has finally ended with a signal 6 exit from htdig
and a "file full" error from the sort routine. The 1.5 GB of disk space
was not enough for the sort step.

However, I did learn some things besides the shortage of disk space.
Htdig becomes heavily CPU-bound when the file size gets large. I suspect
that some of this is due to walking lengthy chains (IMHO).

When htdig begins a new site, the network lights are busy almost
continuously and CPU utilization is under 10%. As the site is processed,
the network stays busy and CPU utilization slowly rises. Once the
internal data plus code for a given site reaches about 80 MB, network
activity drops to short (1-3 sec) bursts separated by long stretches
(10 sec or more) of 90-100% CPU utilization.

Also of interest: the size of the code plus data increases steadily
while a site is processed, then shrinks just a bit at the end of the
site, so it gets quite large over time. Here are the final numbers for
this htdig test run before the sort:

Swap  Elap   CPU    db.doc  db.word  SIZE  RES
 MB   Time   Time     MB      MB      MB    MB
270   61:30  38.4H   379     341     270    83
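
As a rough check on how CPU-bound the run was overall, the CPU time can be divided by the elapsed time. This is only a sketch: it assumes the "Elap Time" column is in hours:minutes, which the table does not state, and the helper name parse_hhmm is made up for the example.

```python
def parse_hhmm(text):
    """Convert an 'HH:MM' string to hours as a float."""
    hours, minutes = text.split(":")
    return int(hours) + int(minutes) / 60.0

# Assumption: "61:30" in the Elap Time column means 61 hours 30 minutes.
elapsed_hours = parse_hhmm("61:30")
cpu_hours = 38.4  # the "38.4H" figure from the CPU Time column

cpu_fraction = cpu_hours / elapsed_hours
print(f"CPU busy for about {cpu_fraction:.0%} of the elapsed time")
```

Under that assumption the CPU was busy for roughly 62% of the whole run, averaged over the low-CPU start of each site and the 90-100% stretches later on.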

I'll add some disk space and memory if possible and test the process
again when I can.

If my deduction about the long chains is correct, this is a major cause
of long run times on large sites. At this point I don't know exactly
what code is executing during those CPU-bound stretches.
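
To make the chain-walking guess concrete, here is a toy model of how lookups through ever-longer chains would turn into CPU time. It is not htdig's code and says nothing about which data structure htdig actually uses. ChainedTable, steps_for and the bucket count of 64 are all invented for the illustration.

```python
class ChainedTable:
    """Toy hash table with a fixed bucket count and list-based chains."""

    def __init__(self, n_buckets):
        self.buckets = [[] for _ in range(n_buckets)]
        self.steps = 0  # entries examined while walking chains

    def insert(self, key):
        chain = self.buckets[hash(key) % len(self.buckets)]
        # Walk the whole chain to check for a duplicate before appending.
        for existing in chain:
            self.steps += 1
            if existing == key:
                return
        chain.append(key)


def steps_for(n_keys, n_buckets=64):
    """Insert n_keys distinct integer keys and return the total chain steps."""
    table = ChainedTable(n_buckets)
    for i in range(n_keys):
        table.insert(i)
    return table.steps


if __name__ == "__main__":
    for n in (640, 6400):
        print(n, "keys:", steps_for(n), "chain steps")
```

Because the bucket count never grows, ten times as many keys costs far more than ten times as many chain steps in this model. That is the general shape of the behavior described above, with the network idle while the CPU walks chains, but it remains a hypothesis until the running code is actually profiled.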

BillN
Bill Nicholls
http://www.billswrite.com
