This is a provisional report, but it looks very much as though a single web
page consisting of a long list of numbers can keep htdig 3.1.5 burning
CPU for many hours. My configuration file includes:
allow_numbers: yes
Indexing the page
http://www.maths.soton.ac.uk/postgraduate/students/Moxham/48.txt
appears to have taken over 24 hours of CPU time on a powerful SGI box, and htdig
is now number-crunching the next similar file,
http://www.maths.soton.ac.uk/postgraduate/students/Moxham/spp2.txt
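
A possible stopgap, untested and offered only as a sketch, would be to keep
htdig away from these two files with the exclude_urls attribute, which
rejects any URL containing one of the listed patterns. The first two
patterns below are what I take to be the stock default and are repeated
because setting the attribute replaces the default; treat the whole line as
an assumption to check against your own configuration:

    # Hypothetical workaround: skip the two number-list pages
    exclude_urls: /cgi-bin/ .cgi /Moxham/48.txt /Moxham/spp2.txt

This only sidesteps the symptom for the pages named here; it says nothing
about why number-heavy pages are so slow to index.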
I am reporting this ASAP as it may account for some reports that htdig takes
days to complete, when for me it normally indexes ~60,000 documents from
scratch in less than 5 hours.
--
David Adams
Computing Services
Southampton University