According to [EMAIL PROTECTED]:
> I'm having trouble with server_aliases, however. I've tried a number of
> different combinations from searching the mailing list archive and reading
> the online help, but can't seem to get it right.
> 
> Our site has 16 aliases (8 really, but both the www and non-www forms refer
> to the same site), all of which point to the same site.
> 
> The problem is that when I search for, say, the word 'help', it retrieves 5
> or 6 duplicates, the only difference being the URL pointing to the page.
> This leads me to believe a lot of duplication might be going on, and that the
> database is larger than it needs to be (not to mention the duplicate results
> returned to the user).
> 
> Here is what the relevant configuration keywords are set to:
>
> limit_urls_to:  nettrash.com netjunk.com netgarbage.com nettoilet.com
> #limit_urls_to:  internettrash.com    (also tried just internettrash.com with no luck)
>
> limit_normalized: http://internettrash.com
>
> start_url: http://internettrash.com/ http://internettrash.com/userlist.html
>
> allow_virtual_hosts: false
> 
> server_aliases: www.internettrash.com=internettrash.com \
>                 www.internetgarbage.com=internettrash.com \
>                 internetgarbage.com=internettrash.com \
>                 www.netgarbage.com=internettrash.com \
>                 netgarbage.com=internettrash.com \
>                 www.internetjunk.com=internettrash.com \
>                 internetjunk.com=internettrash.com \
>                 www.netjunk.com=internettrash.com \
>                 netjunk.com=internettrash.com \
>                 www.internettoilet.com=internettrash.com \
>                 internettoilet.com=internettrash.com \
>                 www.nettoilet.com=internettrash.com \
>                 nettoilet.com=internettrash.com \
>                 www.nettrash.com=internettrash.com \
>                 nettrash.com=internettrash.com

I believe server_aliases requires both the server name and the port number.
At least, every example of it I've seen includes both.  You'd need to
append ":80" to every ".com" in your list above, on both sides of the
"=" sign.  E.g., here's what I use:

start_url:      http://www.scrc.umanitoba.ca/ \
                http://www.scrc.umanitoba.ca/SCRC/mugshots/preformatted/index.html
limit_urls_to:  scrc.umanitoba.ca/
server_aliases: scrc.umanitoba.ca:80=www.scrc.umanitoba.ca:80 \
                cliff.scrc.umanitoba.ca:80=www.scrc.umanitoba.ca:80
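
Applied to your list, that would look something like the following.  This
is an untested sketch: I'm assuming all of your hosts answer on the default
HTTP port 80, and the comment line is only there to flag that assumption.

# Assumption: every alias is served on port 80.
server_aliases: www.internettrash.com:80=internettrash.com:80 \
                www.internetgarbage.com:80=internettrash.com:80 \
                internetgarbage.com:80=internettrash.com:80 \
                www.netgarbage.com:80=internettrash.com:80 \
                netgarbage.com:80=internettrash.com:80 \
                www.internetjunk.com:80=internettrash.com:80 \
                internetjunk.com:80=internettrash.com:80 \
                www.netjunk.com:80=internettrash.com:80 \
                netjunk.com:80=internettrash.com:80 \
                www.internettoilet.com:80=internettrash.com:80 \
                internettoilet.com:80=internettrash.com:80 \
                www.nettoilet.com:80=internettrash.com:80 \
                nettoilet.com:80=internettrash.com:80 \
                www.nettrash.com:80=internettrash.com:80 \
                nettrash.com:80=internettrash.com:80

If that turns out to be the fix, you would probably want to rebuild the
database from scratch afterwards, so the duplicate entries that were
already indexed under the other host names are dropped.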

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
