Hi,
Here are the details.
Versions: Linux 2.0.32
libg++-2.7.2.8
htdig3.0.8b2
Here is my config file:
database_dir: /disk1/htdig/db
start_url: http://www.cyclone.pf/
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
max_head_length: 10000
search_algorithm: exact:1
Here is the output of "htdig -i -v -v -v -v -v -s":
New server: www.cyclone.pf, 80
Retrieval command for http://www.cyclone.pf/robots.txt: GET /robots.txt HTTP/1.0
User-Agent: htdig/3.0.8b2 ([EMAIL PROTECTED])
Host: www.cyclone.pf
Header line: HTTP/1.1 302 Moved Temporarily
Header line: Date: Tue, 26 May 1998 18:40:53 GMT
Header line: Server: Apache/1.2.4
Header line: Location: http://www.mana-online.pf/error.html
Header line: Connection: close
Header line: Content-Type: text/html
Header line:
returnStatus = 3
pick: www.cyclone.pf:80, # servers = 1
0:0:0:http://www.cyclone.pf/: Retrieval command for http://www.cyclone.pf/: GET / HTTP/1.0
User-Agent: htdig/3.0.8b2 ([EMAIL PROTECTED])
Host: www.cyclone.pf
Header line: HTTP/1.1 200 OK
Header line: Date: Tue, 26 May 1998 18:40:53 GMT
Header line: Server: Apache/1.2.4
Header line: Last-Modified: Thu, 15 Jan 1998 19:02:26 GMT
Translated Thu, 15 Jan 1998 19:02:26 GMT to Thu, 15 Jan 1998 19:02:26 (98)
And converted to Thu, 15 Jan 1998 19:02:26
Header line: ETag: "1002-2c1-34be5d42"
Header line: Content-Length: 705
Header line: Accept-Ranges: bytes
Header line: Connection: close
Header line: Content-Type: text/html
Header line:
returnStatus = 0
Read 705 from document
Read a total of 705 bytes
Tag: HTML>, matched -1
Tag: HEAD>, matched -1
Tag: META NAME="creator" CONTENT="[EMAIL PROTECTED]">, matched 20
And it gets stuck here.
Any ideas?
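
To narrow down where the parser stops, it may help to list the tags of the page in order and look at whatever follows the META "creator" tag, since that is the last tag htdig reports above. The following is only a minimal sketch in Python, not part of htdig: TagLister, list_tags, and the SAMPLE page are hypothetical names and data made up for illustration. The real page would have to be saved from http://www.cyclone.pf/ and passed in instead.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collect start tags in document order, together with their attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        self.tags.append((tag, dict(attrs)))


def list_tags(html_text):
    """Return a list of (tag, attributes) pairs for every start tag found."""
    parser = TagLister()
    parser.feed(html_text)
    parser.close()
    return parser.tags


# Hypothetical stand-in for the real page; only the META "creator" tag
# mirrors the log, the rest is invented for the example.
SAMPLE = (
    '<HTML><HEAD><META NAME="creator" CONTENT="someone">'
    "<TITLE>t</TITLE></HEAD></HTML>"
)

if __name__ == "__main__":
    for tag, attrs in list_tags(SAMPLE):
        print(tag, attrs)
```

The tag printed right after the "meta" entry would be the first candidate to look at, assuming htdig stops on the tag that follows the last one it logged.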
Mario Baetz wrote:
>
> Hi,
>
> Could you give more details about your problems? Otherwise
> one has to index all these sites to find out what's going wrong.
>
> Mario
>
> Gilles Lorphelin wrote:
>
> > Hi
> >
> > I just tried ht://Dig.
> > It compiles fine and runs fine with the site htdig.sdsu.edu
> >
> > But it gets stuck while parsing my sites:
> > www.cyclone.pf
> > www.mana.pf
> > www.surf.pf
> > www.imagin.pf
> >
> > And it seems to work with some sites:
> > www.yahoo.com
> > www.whitehouse.gov
> >
> > But not with:
> > www.france98.com
> > www.fnac.fr
> >
> > Has anyone experienced the same trouble?
> >
> > Does anyone know why I can't index my sites?
> > Is it due to incorrect HTML?
> > Is it due to my Apache server?
> >
--
Gilles Lorphelin
Telecoms Mgr. - ISOC Member
Phone : +689 508 888
MANA S.A. (www.mana.pf) Fax : +689 508 889
IAP/ISP - Tahiti & her Islands E-mail: [EMAIL PROTECTED]