Title: Message
While testing the new beta version I've noticed the following, which may or may not be "real" bugs:
 
1] The htdig.conf file seems to be very sensitive to whitespace at the end of lines. In particular, with a multiline attribute such as the one just below, if there is whitespace (tested with [tab]s) after the \ character, htdig _and_ htsearch will fail:
 
server_aliases:                 www.cbfm.rbs.co.uk=www.cbfm.rbsgrp.net \
                                www.cib.rbs.co.uk=www.cib.rbsgrp.net
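
As a stopgap on my side, something like the following Python sketch could flag (or clean up) the offending lines before htdig is run. It is only an illustration and not part of ht://Dig: the function names are made up, and it assumes the only problem is spaces or tabs between the trailing \ and the end of the line.

```python
import re

# A backslash followed by spaces/tabs at the end of a line: it looks like a
# continuation, but the trailing whitespace is what seems to break parsing.
_BAD_CONTINUATION = re.compile(r"\\[ \t]+$")


def find_bad_continuations(text):
    """Return 1-based line numbers where whitespace follows a trailing backslash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _BAD_CONTINUATION.search(line)
    ]


def strip_after_backslash(text):
    """Return text with any whitespace after a trailing backslash removed."""
    return re.sub(r"\\[ \t]+$", r"\\", text, flags=re.MULTILINE)
```

Running find_bad_continuations over the contents of htdig.conf gives the line numbers to fix by hand; strip_after_backslash returns a cleaned copy of the text instead.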
 
2] I can't seem to get any sensible change in the results from htsearch using url_seed_score:
 
url_seed_score:                 cbfm|fmintranet|cib. *500,+1000 \                                                                              
                                manufacturing.|retail|technology.|wealthmanagement.|rbs.|group *.1,
 
Even stupidly high factors (like 100,000) don't seem to have an effect. I tried with and without commas and spaces separating the values.
 
3] If there is _not_ a return after the last line in the config file then htsearch causes a CGI error. Result from the Apache error log:
 
Unknown char in line 224: #[Fri Nov 14 23:51:46 2003] [error] [client 147.114.74.200] malformed header from script. Bad header=syntax error: /var/www/cgi-bin/htsearch32
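
Until this is handled in the parser, a small check along these lines could make sure the config file ends with a newline. This is just a sketch under my own assumptions: the helper names are hypothetical, and the path to pass in is whatever config file your htsearch actually reads.

```python
def ends_with_newline(data):
    """True if the raw bytes of a config file end with a newline.

    An empty file is treated as fine, since there is no last line to terminate.
    """
    return data == b"" or data.endswith(b"\n")


def ensure_trailing_newline(path):
    """Append a newline to the file at path if it is missing.

    Returns True if the file was changed, False if it was already fine.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    if ends_with_newline(data):
        return False
    with open(path, "ab") as handle:
        handle.write(b"\n")
    return True
```

Calling ensure_trailing_newline on the config file after each edit is harmless when the newline is already there.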
 
4] If you search for a phrase and it forms part of a longer string, then the match is not highlighted in the extract displayed. This is most apparent when the second word of the phrase is singular but the result found is plural.
 
Search for "animal feedstuff"
finds "animal feedstuff"s   ---  no highlight
finds "animal feedstuff"    --- highlight as expected
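
To show the difference I mean, here is a rough Python sketch. It is not htsearch's code, and I am only guessing that the highlighter insists on a word boundary after the phrase; the highlight function and its whole_words flag are made up for the illustration.

```python
import re


def highlight(text, phrase, whole_words=True):
    """Wrap occurrences of phrase in <b>...</b>, ignoring case.

    With whole_words=True a word boundary is required after the phrase, so a
    longer word such as "feedstuffs" is skipped. With whole_words=False the
    phrase is also marked where it is the start of a longer word.
    """
    pattern = r"\b" + re.escape(phrase)
    if whole_words:
        pattern += r"\b"
    return re.sub(
        pattern,
        lambda match: "<b>" + match.group(0) + "</b>",
        text,
        flags=re.IGNORECASE,
    )
```

With whole_words=True the text "animal feedstuffs for sale" comes back unchanged, which matches what I see; with whole_words=False the "animal feedstuff" part is marked and the trailing "s" is left outside the tags.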
 
Hope this makes sense!
 
Lastly, are the cookies.txt mechanism and check_unique_md5  actually known to work?
 
Running 3.2.0b5 on:  Linux lon3561xus 2.4.9-31smp #1 SMP Tue Feb 26 06:55:00 EST 2002 i686 unknown
 
It has happily indexed a multi-server intranet of under 50k pages, including parsing PDFs and Word docs, but, as ever, it seems limited by my web server response times and network latency, so this took over 18 hours. I'm really very happy with what I've seen so far, especially the phrase search, which is crucial for me to keep this product in place.
 
Best regards
Nicholas Booth
Royal Bank of Scotland, Corporate Banking
280 Bishopsgate
London
 

