Again, my mistake.
After tracing the code, it seems that htdig is all right.
The pages it was indexing are HTML versions of the java
API documentation and these have *a lot* of <A NAME=...>
tags in them.
So htdig needs *a lot* of time to go through lists etc.,
notably around line 346 of htcommon/DocumentRef.cc:
addlist(DOC_ANCHORS, s, docAnchors);
Which brings me to a question:
Do these tags really perform a useful function
(for use in a search engine, that is)?
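
For anyone who wants to check whether their own pages fall into the same
case, here is a rough sketch that counts the <A NAME=...> tags in a page.
It is not htdig code: countNamedAnchors and hasNameAttribute are names I
made up for illustration, and the matching is deliberately naive (a
"name=" inside a quoted attribute value would be miscounted).

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of htdig): true if the lowercased attribute
// text of a tag contains a "name" attribute followed by '='.
bool hasNameAttribute(const std::string &attrs)
{
    std::size_t pos = 0;
    while ((pos = attrs.find("name", pos)) != std::string::npos) {
        // "name" must start a word, so "classname=" does not match.
        bool startsWord =
            pos > 0 && std::isspace(static_cast<unsigned char>(attrs[pos - 1]));
        // Skip optional whitespace between "name" and '='.
        std::size_t after = pos + 4;
        while (after < attrs.size() &&
               std::isspace(static_cast<unsigned char>(attrs[after])))
            ++after;
        if (startsWord && after < attrs.size() && attrs[after] == '=')
            return true;
        pos += 4;
    }
    return false;
}

// Hypothetical helper (not part of htdig): counts <A ... NAME=...> tags,
// case-insensitively, in a block of HTML.
std::size_t countNamedAnchors(const std::string &html)
{
    std::string lower(html);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = lower.find("<a", pos)) != std::string::npos) {
        std::size_t afterTag = pos + 2;
        // Require whitespace after "<a" so <abbr>, <address> etc. are skipped.
        if (afterTag < lower.size() &&
            std::isspace(static_cast<unsigned char>(lower[afterTag]))) {
            std::size_t end = lower.find('>', afterTag);
            if (end == std::string::npos)
                break; // unterminated tag, stop scanning
            if (hasNameAttribute(lower.substr(afterTag, end - afterTag)))
                ++count;
            pos = end + 1;
        } else {
            pos = afterTag;
        }
    }
    return count;
}
```

Running this over a few of the generated API pages and comparing the
counts with the indexing time per page should show whether the anchors
are really what makes the difference.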
> Hmph. Sounds like there are some bugs to squash in the connection
> code. Can you find the connection for that particular document in the
> server log? Was the server heavily loaded at that point?
>
> Gabriele and I are in the middle of a higher-level rewrite
> (HtHTTP/Transport), but perhaps we want to revisit all the networking
> code. Loic's suggestion on a test suite would help, but I'd be at a
> bit of a loss for the base cases. Would we need to write/copy a TCP
> sniffer, or am I missing something?
>
> Any suggestions? Should we break the networking code out into a
> separate shared library (htnet)?
--jesse
--------------------------------------------------------------------
J. op den Brouw Johanna Westerdijkplein 75
Haagse Hogeschool 2521 EN DEN HAAG
Sector Techniek Netherlands
Afdeling Elektrotechniek +31 70 4458936
-------------------- [EMAIL PROTECTED] --------------------
Linux - because reboots are for hardware changes