> Are the hits all capitalized, or do some of them have the lowercase �?
> Does this problem happen consistently with certain accented letters, and
> not others?  Do you have certain uppercase letters appearing in db.wordlist?

By "hits" I guess you mean the actual words from the documents. Only those
that are supposed to be capitalized are. For example: a search for "ättestupan"
returns 0 hits, while a search for "Ättestupan" returns 18. The word is always
written as "Ättestupan" in the documents, so this would be natural if the search
were case sensitive.
The odd thing is that "åsa" and "Åsa" give exactly the same hits, even though it
is always written as "Åsa" in the documents. The problem only exists (as far as
I can test) for "Ää".

The db.wordlist only contains lowercase letters.

> > I asked a guy here at the University and he said that there might be
> > complications with "unsigned char" and "char". He gave me the example
> > below. Please answer at a novice level, my C++ and Unix knowledge is very
> > limited.  
> 
> Good hunch, but given that some accented letters work and some give
> problems, I wouldn't expect that it's a problem with sign extension.
> This seems to point to a problem with the ctype tables for your locale,
> but there could be something else that I'm missing here.  Please keep
> us posted.
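
To make the two suspicions above concrete, here is a minimal C++ sketch. It is
not taken from the ht://Dig source; the function names are made up for
illustration, and the byte values assume the documents are in ISO-8859-1
(where "Ä" is 0xC4 and "ä" is 0xE4).

```cpp
#include <cctype>

// Hypothetical helper: lowercase one byte via the C library's ctype tables.
// The cast to unsigned char matters: std::tolower expects a value in the
// range 0..255 (or EOF), not a negative number.
int lower_byte(char c) {
    return std::tolower(static_cast<unsigned char>(c));
}

// Hypothetical helper: the value a byte turns into when it is stored in a
// signed char and then widened to int without a cast ("sign extension").
// For 0xC4 this is negative on ordinary two's complement machines, which is
// the "char" versus "unsigned char" complication.
int sign_extended(unsigned char byte) {
    return static_cast<signed char>(byte);
}
```

In the default "C" locale, lower_byte leaves 0xC4 unchanged, because that
locale only knows how to lowercase A-Z. With a Swedish Latin-1 locale selected
through setlocale (the exact locale name differs between systems) it should
give 0xE4, provided the locale's ctype tables are correct. A sign extension
bug would be expected to hit every letter above 0x7F in the same way, which is
why a problem limited to some letters points more towards the tables.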

I'm also looking for a synonym wordlist in Swedish... If anyone has one, please 
send me a copy.

