Hello,

My index process does not follow some of the links on a web page.

The relevant configuration lines are, for instance:
MaxHops 1000
MaxDocsPerServer 3500

Server http://www.webpage.com/as/



When I start index, it only displays:
...
Loading configuration from /usr/local/aspseek/etc/aspseek.conf
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/as/robots.txt
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/as/index.html
Ended thread: 0. Start: 1039083134.396. End: 1039083134.663-1039083134.669. Duration: 0.267. URL: http://www.webpage.com/as/
Ended thread: 1. Start: 0.000. End: 0.000- 0.000. Duration: 0.000. URL:
Saving real-time database ... done.
...

Here are the links in the index file:
<li><a href="2001/v34/n1/index.html">Vol. 34, no 1 (2001)</a></li>
<li><a href="2000/v33/n2/index.html">Vol. 33, no 2 (2000)</a></li>
<li><a href="2000/v33/n1/index.html">Vol. 33, no 1 (2000)</a></li>
<li><a href="1999/v32/n2/index.html">Vol. 32, no 2 (1999)</a></li>
<li><a href="1999/v32/n1/index.html">Vol. 32, no 1 (1999)</a></li>

These links should be followed and the corresponding web pages should be indexed, shouldn't they?
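
As a sanity check, the relative links can be resolved by hand. The following is a minimal Python sketch, not ASPseek code. It assumes the page was fetched as http://www.webpage.com/as/index.html (as the log suggests) and that the Server line behaves like a plain URL prefix filter; the second point is an assumption and has not been checked against the ASPseek source.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://www.webpage.com/as/index.html"  # page fetched by index (from the log)
SERVER_PREFIX = "http://www.webpage.com/as/"       # value of the Server line

# Two of the links from the index file quoted in this message.
PAGE = """
<li><a href="2001/v34/n1/index.html">Vol. 34, no 1 (2001)</a></li>
<li><a href="2000/v33/n2/index.html">Vol. 33, no 2 (2000)</a></li>
"""


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_links(html, base_url):
    """Return the absolute URL of every link in html, relative to base_url."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.hrefs]


resolved = resolve_links(PAGE, BASE_URL)
for url in resolved:
    # Assumption: a URL is eligible when it starts with the Server prefix.
    print(url, url.startswith(SERVER_PREFIX))
```

Under these assumptions every resolved URL starts with the Server prefix, so the links themselves do not look like the reason they are skipped.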

The robots.txt file is as follows:
#### Generated by a proxy - DeleGate/7.9.10 by [EMAIL PROTECTED]
User-agent: *
Disallow: /-_-
Disallow: /=@=
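
To rule out robots.txt as the cause, the rules can be tested outside ASPseek. The sketch below uses Python's standard `urllib.robotparser`; it only shows how a standards-following parser reads these rules, and ASPseek's own robots.txt handling may differ. The second test URL is hypothetical and is only there to show a blocked path.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt quoted in this message (comment line omitted).
ROBOTS_TXT = """\
User-agent: *
Disallow: /-_-
Disallow: /=@=
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if the given robots.txt text lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# One of the links that index does not follow.
print(is_allowed("http://www.webpage.com/as/2001/v34/n1/index.html"))
# Hypothetical URL under a disallowed prefix, for comparison.
print(is_allowed("http://www.webpage.com/-_-/anything"))
```

With this parser the first URL is allowed and only paths beginning with `/-_-` or `/=@=` are blocked, so these rules alone should not stop the volume pages from being indexed.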

I also tested another page of the same web site:
MaxHops 1000
MaxDocsPerServer 3500
Server http://www.webpage.com/circuit/

The index file is almost the same. Here are the links in that index file:

<li><a href="2001/v11/n3/index.html">Vol. 11, no 3 (2001)</a></li>
<li><a href="2000/v11/n2/index.html">Vol. 11, no 2 (2000)</a></li>
<li><a href="2000/v11/n1/index.html">Vol. 11, no 1 (2000)</a></li>
<li><a href="1999/v10/n2/index.html">Vol. 10, no 2 (1999)</a></li>
<li><a href="1999/v10/n1/index.html">Vol. 10, no 1 (1999)</a></li>

The robots.txt file is the same.

But here index follows and indexes all the links, so this one works:

Loading configuration from /usr/local/aspseek/etc/aspseek.conf
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/circuit/robots.txt
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/circuit/
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/circuit/2001/v11/n3/index.html
( 0 1 1 1 0 0 0 2) Adding URL: http://www.webpage.com/circuit//2000/v11/n2/index.html
( 0 1 1 1 0 0 0 2) Adding URL: http://www.web.ca/revues/revues.html
No "Server" command for URL http://www.web.ca/revues/revues.html - deleted.
( 0 1 1 0 0 0 0 2) Adding URL: http://www.webpage.com/circuit/2000/v11/n1/index.html

Is something wrong?
What is the difference between the two cases?

Has anyone had the same problem before?
I tried to find something similar in the mailing list archive but couldn't find anything :(

I hope this is clear.
Thanks a lot,

Luc.
