On Mon, 12 Apr 1999, Ying Zhang wrote:
>Hi
>
>I recently installed ht://Dig (3.1.1) on my site, but it won't index the
>site. When I try to index other sites, it does work.
>
>I think it may have something to do with the pages on my site -- they are
>generated dynamically with PHP whereas the sites that do work are static
>html pages. To confirm this, I created a directory with some pages inside
>that are plain html files and they do get indexed. Has anyone had similar
>problems?
>
>This is what I get running htdig -vvv:
>
>Warning: unknown locale!
You should set a proper locale value in your htdig.conf.
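
If you are unsure which locale names your system accepts, a quick check
outside of ht://Dig can help before you edit htdig.conf. The following is
only a rough sketch in Python; it assumes the warning comes from the C
library rejecting the configured name, and the helper name
locale_is_known is made up for this example.

```python
import locale


def locale_is_known(name: str) -> bool:
    """Return True if the C library accepts `name` as a locale.

    Hypothetical helper for illustration; not part of ht://Dig.
    """
    # Remember the current setting so the check has no side effects.
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    # "C" is always available; "en_US" is only an example candidate.
    for candidate in ("C", "en_US"):
        print(candidate, locale_is_known(candidate))
```

A name that is reported as known here should be a reasonable candidate
for the locale value in htdig.conf.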
> 1:0:http://dcfonline.sfu.ca/
>New server: dcfonline.sfu.ca, 80
>Retrieval command for http://dcfonline.sfu.ca/robots.txt: GET /robots.txt
ht://Dig follows the robots exclusion standard. Create a robots.txt
file in your server root directory (it is also required by most www
search engines, which won't index your site correctly if this file
is not present).
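
A minimal permissive robots.txt needs only two lines. As a sketch, you
can check how a standards-following robot would read such a file with
Python's urllib.robotparser. The agent name "htdig" is taken from the
User-Agent header in your log; the helper name allowed is made up for
this example, and this is not how ht://Dig itself parses the file.

```python
from urllib.robotparser import RobotFileParser

# A permissive robots.txt: an empty Disallow line excludes nothing.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed(ROBOTS_TXT, "htdig", "http://dcfonline.sfu.ca/"))
```

Changing the second line to "Disallow: /" would exclude the whole site
instead.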
>HTTP/1.0
>User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
>Host: dcfonline.sfu.ca
>
>Header line: HTTP/1.1 404 Not Found
>Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
>Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
>Header line: Connection: close
>Header line: Content-Type: text/html
>Header line:
>returnStatus = 1
> pushed
>pick: dcfonline.sfu.ca, # servers = 1
>0:0:0:http://dcfonline.sfu.ca/: Retrieval command for
>http://dcfonline.sfu.ca/: GET / HTTP/1.0
>User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
>Host: dcfonline.sfu.ca
>
>Header line: HTTP/1.1 200 OK
>Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
>Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
>Header line: Connection: close
>Header line: Content-Type: text/html
>Header line:
>returnStatus = 0
>Read 4081 from document
>Read a total of 4081 bytes
> size = 4081
>pick: dcfonline.sfu.ca, # servers = 1
>
>Kind of strange, because looking at the source of that page through a
>browser it looks perfectly normal.. Hope someone can help.
I bet you also do not have a "robots" meta tag in your document,
which ht://Dig would use if no robots.txt is available.
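
To see whether your PHP output actually contains such a tag, you can
scan the generated HTML. The following is a rough sketch in Python using
the standard html.parser module; the class and function names and the
sample page are made up for illustration, and it says nothing about how
ht://Dig reads the tag internally.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when the attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                item = item.strip().lower()
                if item:
                    self.directives.append(item)


def robots_directives(html: str) -> list:
    """Return the robots directives found in `html`, in document order."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page for illustration only.
PAGE = (
    '<html><head><meta name="robots" content="index, follow"></head>'
    "<body>Hello</body></html>"
)

if __name__ == "__main__":
    print(robots_directives(PAGE))
```

An empty result means the page carries no robots meta tag at all.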
>
>Ying
hth,
Torsten
--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14 Tel: +49-4101-403605
D-25474 Ellerbek Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED] Internet: http://www.inwise.de