> NT> yes, I run ./indexer -C to clear it and then I start indexer again
> NT> and I end up with the above result..
> 
> It's strange. I have no idea.
> 
Incidentally, I just downloaded 3.2.3 to try it, and index pages are
still showing up in search results.

I rebuilt the database from scratch for the new version, ran
indexer for about 10 seconds, and already have these results:
  * http://www.austinchronicle.com/issues/dispatch/2002-01-04/

  * http://www.austinchronicle.com/issues/dispatch/2002-01-11/

which I would expect my HrefOnly commands to exclude.
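
As a quick sanity check, here is a minimal Python sketch (not part of the indexer) that tests the two URLs above against the first HrefOnly pattern from the config. It assumes Python's `re` module behaves like indexer's own regex engine for a pattern this simple, which may not hold for every pattern in the file.

```python
import re

# URLs reported in the search results above.
urls = [
    "http://www.austinchronicle.com/issues/dispatch/2002-01-04/",
    "http://www.austinchronicle.com/issues/dispatch/2002-01-11/",
]

# First HrefOnly pattern from indexer.conf: any URL ending in a slash.
# Assumption: Python's `re` stands in for indexer's regex engine here.
trailing_slash = re.compile(r"\/$")

# Record whether each URL is matched by the pattern.
matches = {url: bool(trailing_slash.search(url)) for url in urls}

for url, hit in matches.items():
    print(url, "->", "matches HrefOnly" if hit else "no match")
```

Both URLs end in a slash, so the pattern itself does match them. If that carries over to indexer, the regex is probably not what lets these pages into the results.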

Here is my complete indexer.conf file:


# indexer.conf
# edited by lindsey simon for The Austin Chronicle

DBAddr  XXXXXX  

# Yeah, this rulez
DBMode crc-multi

Robots no

LocalCharset iso-8859-1
CharSet iso-8859-1

# Ispell   
#Affix en en.aff
#Spell en en.dict

StopwordFile stopwords/en.big.sl

#ReadTimeOut 99999999 
#MaxNetErrors 9999999999999

# MinWordLength 3 

# WEIGHTS
# Standard HTML sections: body, title, description, keywords

Section body                    1
Section title                   2
Section description             3
Section keywords                4


# Document's URL parts

Section url:file                5
Section url:path                6
Section url:host                7
Section url:proto               8


# Our site
AuthBasic guest:guest
Server  http://www.austinchronicle.com/issues/
Alias http://www.austinchronicle.com/issues/ http://daisy/issues/ 

#UseRemoteContentType no
AddType text/plain *

# Do not index the index pages
HrefOnly \/$
HrefOnly \/index\.html$ \.index\.html$ \_index\.html$ \/*_index\.html$
HrefOnly \/.*index\.html$
HrefOnly \/1999\.html$ \/2000\.html$ \/1998\.html$ \/1997\.html$ \/1996\.html$
HrefOnly \/1995\.html$ \/2001\.html$ \/2002\.html$ \/film\.html$ \/adverts\.html$
HrefOnly \/arts\.listings\/ \/music\.clubs\.html$ \/screens\.film\.html$
HrefOnly \/Film\d\d\.html$ \/screens\.filmtimes\.html$ \/showtimes\.html$
HrefOnly \/clubs\.html \/music\.clubs\/ \/music\.roadshows\.html$


# Get rid of junk
Disallow \.bak$ /nav/ /temp/ /current/ /deep_focus/ /not_current/
Disallow /authors/ /calendar/ /filmvault/ /tmp/
Disallow /weeklywire\.com/ /bin/ /etc/ /lib/ /musicreg/\d
Disallow /cgi-bin/ /cgi/ /images/ /temp/

# Allow some known extensions and directory index
Allow \.html$ \.htm$ \.php$

# Disallow everything else
Disallow .*
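
To reason about how the filters above interact, here is a rough Python sketch of a rule chain. It assumes rules are tried in file order and the first matching pattern decides the outcome, which is how I understand the filter commands to work; I have not confirmed this against the indexer source. The rule list is abbreviated, `classify` is a hypothetical helper, and the `story.html` and `logo.gif` URLs are made up for illustration.

```python
import re

# Rules in the order they appear in indexer.conf (abbreviated).
# Assumption: the first rule whose pattern matches decides the outcome.
RULES = [
    ("HrefOnly", r"\/$"),
    ("HrefOnly", r"\/.*index\.html$"),
    ("Disallow", r"/cgi-bin/"),
    ("Allow", r"\.html$"),
    ("Disallow", r".*"),
]


def classify(url):
    """Return the action of the first rule whose pattern matches url."""
    for action, pattern in RULES:
        if re.search(pattern, url):
            return action
    # Hypothetical fallback; never reached here because '.*' matches all.
    return "Allow"


base = "http://www.austinchronicle.com/issues/"
for url in (
    base + "dispatch/2002-01-04/",            # directory index from the results
    base + "dispatch/2002-01-04/story.html",  # made-up article URL
    base + "logo.gif",                        # made-up non-HTML URL
):
    print(classify(url), url)
```

Under these assumptions the directory URL is classified as HrefOnly before the Allow line is ever reached, so the order of the rules in this file does not explain the index pages being stored.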