Hi
I am using the DocBook XSL stylesheets (version 1.78.1) to produce
Webhelp, and my documents are being translated into French, Japanese,
Korean, and Simplified Chinese.
I have some questions about configuring the Webhelp search that are
not fully clear to me after reading through the Webhelp docs.
(1) The Webhelp XSL templates always output a link to a JavaScript
stemmer library. The file name of the library linked to is determined by
the webhelp.indexer.language parameter. But Webhelp only includes
stemmers for the en, fr, and de languages.
Question 1: Is it OK to use the default "en" JavaScript stemmer with
non-English locales, or is it best to customize the template that
outputs the stemmer link and remove the link for languages that do not
have a stemmer?
(2) The Java indexer command used with the Webhelp build has the
properties webhelp.indexer.language and enable.stemming.
I tried to establish a list of languages that have Java stemmer
support, and this is what I found in the Webhelp docs:
- In the section "Adding support for other (non-CJKV) languages" there
is a list of non-CJKV languages that have stemmer support, but no
language codes.
- The section "Search indexing" says to look in the build.properties
file for the language code, but the build.properties file says to look
in the docs.
- In the section "New Stemmers" (in the developer docs) it seems to
indicate a different list of languages with stemmers, with a list of
language codes (including "cn" for Chinese?).
Question 2: If the enable.stemming property is set to true, is the value
of webhelp.indexer.language used to determine whether a Java stemmer is
used?
Question 3: Is there a definitive list of language codes that the Java
indexer expects/accepts/supports for the language?
Question 4: If a language has no Java stemmer, is it best to set the
enable.stemming property to "false", or does it not really matter?
Thanks