Antonello Piemonte wrote:
>
> Hello people,
>
> I am setting up htdig for a project here and I am trying
> to index properly german umlauts ...
>
> what I've done so far: according to http://www.htdig.org/FAQ.html#q4.10
> I made the following entries on htdig.conf
>
> # ---------------- begin germanization -------------------------
>
> locale: de_DE
> lang_dir: /usr/local/htdig/german
> #bad_word_list: ${lang_dir}/bad_words
> endings_affix_file: ${lang_dir}/german.aff
> endings_dictionary: ${lang_dir}/zusammen.txt.sq
> #endings_root2word_db: ${lang_dir}/root2word.db
> #endings_word2root_db: ${lang_dir}/word2root.db
> # ---------------- end germanization ---------------------------
>
> I took the dictionary and affix files from
> http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html
> namely I tried Heinz Knutzen's and also Martin Schulz's (first and
> third in the list there). I just copied the dictionary and .aff files
> into ${lang_dir}, then ran the "rundig" script and tried the search,
> but got no results .....
>
> the point is that I don't know if I should manipulate those dictionary
> files (use the hashed version?) or use them as they are. I even tried
> "locale: de_DE.ISO_8859-1" and "locale: de" in htdig.conf ....
>
> final notes: running on FreeBSD 4.2 with htdig3.2.0b3, maybe I should
> use the htdig stable release ... ?
>
> TIA
> antonello
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with
>a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
There is a revised version of Heinz Knutzen's dictionary at
http://www.suse.de/~bjacke/igerman98/. This is what I'm using. Just
uncompress the package and read the file 'INSTALL'. Edit the makefile as
described and run make. Copy the files 'german.aff' and 'all.words' to
your 'lang_dir'. Set 'lang_dir' in your configuration and add
the following lines:
locale: de
endings_affix_file: ${lang_dir}/german.aff
endings_dictionary: ${lang_dir}/all.words
This works for me, and if I'm looking for 'Köln', I find both: 'Köln'
and 'Koeln'!
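
Since the endings attributes above only work if the files really sit
where ${lang_dir} points, here is a minimal sketch of a sanity check
for that. It is not part of htdig: the helper names (parse_conf,
expand, missing_files) are made up for illustration, and the parser is
an assumption that only handles simple "name: value" lines, '#'
comments and one level of ${name} substitution (no line continuations
or includes).

```python
import os
import re


def parse_conf(text):
    """Parse simple 'name: value' lines, skipping blanks and '#' comments."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            attrs[name.strip()] = value.strip()
    return attrs


def expand(value, attrs):
    """Replace ${name} with another attribute's value; leave unknown names as-is."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: attrs.get(m.group(1), m.group(0)),
        value,
    )


def missing_files(attrs, keys=("endings_affix_file", "endings_dictionary")):
    """Return (attribute, path) pairs whose expanded path is not an existing file."""
    missing = []
    for key in keys:
        if key in attrs:
            path = expand(attrs[key], attrs)
            if not os.path.isfile(path):
                missing.append((key, path))
    return missing


# Sample input mirroring the attributes quoted in this thread.
SAMPLE = """\
locale: de
lang_dir: /usr/local/htdig/german
endings_affix_file: ${lang_dir}/german.aff
endings_dictionary: ${lang_dir}/all.words
"""

if __name__ == "__main__":
    attrs = parse_conf(SAMPLE)
    for key, path in missing_files(attrs):
        print(f"{key}: {path} not found")
```

To check a real setup, read your own htdig.conf into parse_conf()
instead of SAMPLE. An empty result only means the two files exist, not
that the dictionary format is one htdig accepts.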
Berthold Cogel
--
Dr. rer. nat. Berthold Cogel University of Cologne
E-Mail: [EMAIL PROTECTED] ZAIK-US (RRZK)
Tel.: +49(0)221/478-7020 Robert-Koch-Str. 10
FAX: +49(0)221/478-5568 D-50931 Cologne - Germany