Gilles Lorphelin wrote:
>
> Hi
>
> I just tried ht://Dig.
> It compiles fine and runs fine against the site htdig.sdsu.edu.
>
> But it gets stuck while parsing my sites:
> www.cyclone.pf
> www.mana.pf
> www.surf.pf
> www.imagin.pf
>
> It seems to work with some other sites:
> www.yahoo.com
> www.whitehouse.gov
>
> But not with:
> www.france98.com
> www.fnac.fr
>
> Has anyone experienced the same trouble?
>
> Does anyone know why I can't index my sites?
> Is it due to incorrect HTML?
> Is it due to my Apache server?
>
> Please help me!

As a test, I ran htdig with the following config file:

database_dir: /tmp/db
start_url: http://www.surf.pf/ http://www.cyclone.pf/ http://www.imagin.pf/
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
max_head_length: 10000
max_doc_size: 5000000
search_algorithm: exact:1

It finished the index without problems:

htdig: Run complete
htdig: 3 servers seen:
htdig: www.cyclone.pf:80 18 documents
htdig: www.imagin.pf:80 7 documents
htdig: www.surf.pf:80 50 documents

So I don't know why you can't index those same sites.
(I used ht://Dig 3.0.8b2)
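
In case it helps when comparing configs: the `limit_urls_to: ${start_url}` line in the test config reuses the value of `start_url`, so the crawl stays on the three listed sites. The Python below is only a rough sketch of that idea, not ht://Dig's own parser. The helper names (`parse_config`, `expand`, `url_allowed`) are made up for illustration, and the matching rule (case-insensitive substring match for both `limit_urls_to` and `exclude_urls`) is an assumption about how the patterns are applied.

```python
import re

# The relevant lines of the test config (start_url kept on a single line).
CONFIG = """\
database_dir: /tmp/db
start_url: http://www.surf.pf/ http://www.cyclone.pf/ http://www.imagin.pf/
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
"""


def parse_config(text):
    """Parse 'name: value' lines into a dict (simplified: skips blanks and # comments)."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        name, _, value = line.partition(":")
        attrs[name.strip()] = value.strip()
    return attrs


def expand(value, attrs):
    """Replace each ${name} with that attribute's value (empty string if undefined)."""
    return re.sub(r"\$\{(\w+)\}", lambda m: attrs.get(m.group(1), ""), value)


def url_allowed(url, attrs):
    """Assumed rule: URL must contain a limit pattern and no exclude pattern."""
    url = url.lower()
    limits = expand(attrs.get("limit_urls_to", ""), attrs).lower().split()
    excludes = expand(attrs.get("exclude_urls", ""), attrs).lower().split()
    return any(p in url for p in limits) and not any(p in url for p in excludes)


if __name__ == "__main__":
    attrs = parse_config(CONFIG)
    print(expand(attrs["limit_urls_to"], attrs))
    for candidate in ("http://www.surf.pf/index.html",
                      "http://www.surf.pf/cgi-bin/form.cgi",
                      "http://www.yahoo.com/"):
        print(candidate, url_allowed(candidate, attrs))
```

Running something like this against your own config is a quick way to check that your start URLs are not being filtered out by your own `limit_urls_to` or `exclude_urls` settings before the dig even begins.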
--
Andrew Scherpbier <[EMAIL PROTECTED]>
Contigo Software <http://www.contigo.com/>
----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.