On 27 Apr, Peter L. Peres wrote:
> 
> Hi, 
> 
> the machine finished ! The loop was in the java api docs. There were no
> other loops. There is no bug in htdig wrt. this problem (looping). 
> 
> Here are some stats from the end:
> 
> 27425.60user 10781.29system 43:11:27elapsed 24%CPU (0avgtext+0avgdata 0maxrent)k
> 0inputs+0outputs (18429778major+3453038minor)pagefaults 2501532swaps
> 
> htdig ran with a niceness 18 for the last 25% of the indexing. Load was
> 0.8 or so during this time. docdb is about 200MB. My input was about
> 220MB. 
> 
> The loop problem was in the tree:
> 
> /usr/doc/packages/javadoc/docs/api/
> 
> which has more than 500 entries. 
> 
> System: i486/100MHz/24MB RAM 4.3+2.8 GB EIDE disks (not UDMA), headless
> (ethernet only) Suse 6.2 Linux (w. modified html documentation system - by
> me). As you can see the machine was swapping like crazy. I think I'd need
> a machine with 256MB RAM to avoid serious swapping. Not likely anytime
> soon. 
> 
> thank you all for the ideas,
> 
>       Peter

I think I've come across this sort of problem when trying to index a
series of documents that have a lot of internal references
(<A HREF="#target">). htdig tries to follow each of these links and ends
up going in ever decreasing circles until....

My solution was to add something like html# to the exclude_urls list.
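
As a rough sketch, the line in htdig.conf would look something like the
one below. The first two patterns are only what I recall a stock config
shipping with, so treat them as an assumption and keep whatever your own
file already lists; the html# on the end is the addition. As far as I
know, any URL containing one of these strings is skipped.

    exclude_urls:   /cgi-bin/ .cgi html#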

Cheers
-- 
David Robley                        | WEBMASTER & Mail List Admin
RESEARCH CENTRE FOR INJURY STUDIES  | http://www.nisu.flinders.edu.au/
AusEinet                            | http://auseinet.flinders.edu.au/
            Flinders University, ADELAIDE, SOUTH AUSTRALIA

