Peter Peltonen writes:
> Petri Lankoski wrote:
> 
> > > I have bit problems with htdig and iso characters and I can't find
> > > solution from FAQ to my problem. Htdig DB contains 8bit
> 
> Here's how I got htdig working in Finnish (with ISO characters, that is):
> 
> 1. Configured my htdig.conf:
> 
> locale:               fi_FI.ISO-8859-1
> lang_dir:             /var/lib/htdig/common/finnish
> bad_word_list:        ${lang_dir}/bad_words
> endings_affix_file:   ${lang_dir}/finnish.aff
> endings_dictionary:   ${lang_dir}/finnish.0
> endings_root2word_db: ${lang_dir}/root2word.db
> endings_word2root_db: ${lang_dir}/word2root.db
> 
> 2. Hunted the web and finally found a finnish.dict file. I copied the file
> as finnish.0 to the directory I specified in my htdig.conf (I also created
> that directory :). Copied finnish.aff there too. (If you cannot find these
> files, I can send them to you).
> 
> 3. I made a list of bad words to the file bad_words
> 
> I'm not sure if the machine running htdig has to be configured to be using
> the fi-locale. I don't think so, but I changed that just to be sure.

I tried the instructions above, but htsearch still doesn't find words
with accented characters. As far as I can see, the db does contain the
8-bit characters.

[12:31] xcalibur /var/lib/htdig/db > /www/cgi-bin/htsearch
Enter value for words: mäyrä
Enter value for format: long
Content-type: text/html

<h1>No matches were found for 'mäyrä'</h1>
<p>
Check the spelling of the search word(s) you used.
If the spelling is correct and you only used one word,
try using one or more similar search words with "<b>Any</b>."
</p>

...

[12:32] xcalibur /var/lib/htdig/db > grep mäyrä db.wordlist
mäyrä   i:165   l:561   w:439   a:1
mäyrä   i:170   l:64    w:936
mäyrä   i:259   l:123   w:877
mäyrä   i:260   l:208   w:792
mäyrä   i:269   l:263   w:2146  c:7
mäyrä   i:270   l:237   w:3902  c:13
mäyrä   i:272   l:595   w:405
mäyrä   i:405   l:0     w:250895        c:3
mäyrä   i:406   l:862   w:138
mäyrä   i:418   l:26    w:974
mäyrä   i:84    l:742   w:258
mäyrä   i:85    l:117   w:883
mäyrä   i:90    l:626   w:374
mäyräkoira      i:421   l:697   w:303
mäyrälle        i:405   l:958   w:42
mäyrältä        i:170   l:203   w:797
mäyrän  i:269   l:247   w:1050  c:2
mäyrän  i:270   l:615   w:695   c:3
mäyrän  i:86    l:341   w:1507  c:5
mäyrän  i:89    l:667   w:333
mäyrää  i:405   l:944   w:56
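
One thing the grep output above does not show is which byte encoding the
matching words are stored in. The sketch below is only an illustration of
how that could be checked; it is not part of htdig. It assumes the word is
the first whitespace-separated field of each db.wordlist line (as in the
output above), and the function name find_encodings and the sample data are
made up for the example.

```python
def find_encodings(query, wordlist_bytes, encodings=("iso-8859-1", "utf-8")):
    """Return the encodings under which `query` occurs as a word in the list."""
    words = set()
    for line in wordlist_bytes.splitlines():
        fields = line.split()  # bytes.split() splits on ASCII whitespace only
        if fields:
            words.add(fields[0])  # assumption: first field is the word
    return [enc for enc in encodings if query.encode(enc) in words]


# Hypothetical sample laid out like the grep output, stored as ISO-8859-1.
# "\u00e4" is a-umlaut, so the query below is the search word from the post.
sample = (
    "m\u00e4yr\u00e4\ti:165\tl:561\tw:439\ta:1\n"
    "m\u00e4yr\u00e4n\ti:269\tl:247\tw:1050\tc:2\n"
).encode("iso-8859-1")

print(find_encodings("m\u00e4yr\u00e4", sample))  # -> ['iso-8859-1']
```

Against a real database one would read db.wordlist in binary mode and pass
its contents as wordlist_bytes. If the word only matches under an encoding
other than the one the terminal or browser sends, that would be one possible
explanation for the empty result, though nothing above confirms it.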

/etc/htdig/htdig.conf:
locale: fi_FI
lang_dir:             /var/lib/htdig/common/finnish
bad_word_list:        ${lang_dir}/bad_words
endings_affix_file:   ${lang_dir}/finnish.aff
endings_dictionary:   ${lang_dir}/finnish.0
endings_root2word_db: ${lang_dir}/root2word.db
endings_word2root_db: ${lang_dir}/word2root.db
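
To rule out a simple path problem, the ${lang_dir} references in the
configuration above can be expanded and the resulting files checked. The
following is a rough sketch, not htdig's own parser: it only handles plain
"name: value" lines, "#" comments and ${name} references to earlier
attributes, and the helper names parse_conf and missing_files are invented
for the example. It checks only the three files that the quoted steps
mention copying or creating.

```python
import os
import re


def parse_conf(text):
    """Parse 'name: value' lines and expand ${name} references (simplified)."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # Expand references to attributes defined earlier in the file;
        # unknown references are left untouched.
        value = re.sub(
            r"\$\{(\w+)\}",
            lambda m: attrs.get(m.group(1), m.group(0)),
            value.strip(),
        )
        attrs[name.strip()] = value
    return attrs


def missing_files(attrs, names):
    """Return (name, path) pairs whose path does not exist on disk."""
    return [
        (n, attrs[n]) for n in names
        if n in attrs and not os.path.exists(attrs[n])
    ]


conf = """\
locale: fi_FI
lang_dir:             /var/lib/htdig/common/finnish
bad_word_list:        ${lang_dir}/bad_words
endings_affix_file:   ${lang_dir}/finnish.aff
endings_dictionary:   ${lang_dir}/finnish.0
"""

attrs = parse_conf(conf)
checked = ["bad_word_list", "endings_affix_file", "endings_dictionary"]
for name, path in missing_files(attrs, checked):
    print(f"{name}: {path} not found")
```

On a machine where the files are in place the loop prints nothing.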


[12:38] xcalibur /var/lib/htdig/db > locale
LANG=fi_FI
LC_CTYPE="fi_FI"
LC_NUMERIC="fi_FI"
LC_TIME="fi_FI"
LC_COLLATE="fi_FI"
LC_MONETARY="fi_FI"
LC_MESSAGES="fi_FI"
LC_ALL=fi_FI
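
The locale command only reports what the environment variables are set to,
not whether the C library can actually load a locale with that name. The
quoted configuration uses fi_FI.ISO-8859-1 while the one here uses fi_FI, so
it may be worth checking which of the two names the system accepts. The
sketch below does that from Python; the helper name available_locales is
made up for the example, and whether the locale name has anything to do with
the search problem is an open question.

```python
import locale


def available_locales(candidates):
    """Return the candidate locale names that setlocale() accepts here."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    found = []
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
                found.append(name)
            except locale.Error:
                pass  # this name is not installed or not recognised
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the original
    return found


print(available_locales(["fi_FI", "fi_FI.ISO-8859-1", "C"]))
```

The result depends on which locales are installed, so no expected output is
given. "C" is included only as a name that should always be accepted.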


The system is Red Hat 6.2 and htdig is htdig-3.1.5-0glibc21.


-- 
  Petri Lankoski                Yeah you wanna go out 'cause it's raining 
  [EMAIL PROTECTED]                 and blowing * You can't go out cause your 
  http://www.iki.fi/~kreivi/    roots are showing * dye em black
  PGP: http://www.iki.fi/~kreivi/pgp.txt             type o negative


