Cramer, David W (David) wrote:
Hi there,
Kasun Gajasinghe has been hard at work this summer on the webhelp GSoC project
and has implemented all the required features: stemming for English and
German, highlighting of search results, tokenization for Asian (Chinese,
Japanese, Korean) languages, output that no longer depends on the frameset,
and automatic TOC syncing.
You can see a demo of the results of his efforts and download it to try things
out on your own content from here:
http://www.thingbag.net/docbook/gsoc2010/doc/content/ch02s01.html
The instructions provide links to versions of the package for 4.x and 5.x
documents.
Feedback is welcome. Please let us know what bugs you find. In particular, we
need to test the CJK search support. If you have some demo content in Chinese,
Japanese, or Korean that you can share with us for testing, we'd appreciate it.
I had planned to use the Chinese version of DocBook: The Definitive Guide, but
have had some trouble setting up my environment so that it will build.
We plan to provide instructions for adding stemming support for other non-CJK
languages. For a number of languages
(http://www.thingbag.net/docbook/gsoc2010/doc/content/ch02s04.html), all that
is required is to port the stemmer from Java to JavaScript so that it can be
used on the client side.
Thanks,
David
Hi David,
First of all, thank you both for your work; it looks very promising!
I have a few questions about how search and stemming work:
- Is it possible to add partial matches to the search results? For example, now
if you search for install, installing, or installed, the same results are
returned (correctly), because these words all come from install. But if you
don't type the entire word (say, only 'inst'), there aren't any results.
- Am I right that the search engine only matches from the start of a word?
(Queries such as nstall or *nstall do not work.)
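
To make the two questions concrete, here is a minimal Python sketch of the
behaviour I am describing. It is only an illustration under my own assumptions,
not how webhelp is implemented: the index layout (a map from stem to pages), the
crude suffix list, and the names toy_stem, build_index, exact_search, and
prefix_search are all hypothetical.

```python
# Toy illustration only: the suffix list and the index layout are assumptions,
# not the actual webhelp search code.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Strip one common English suffix; a crude stand-in for a real stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(pages):
    """Map each stem to the set of page names whose text contains it."""
    index = {}
    for name, text in pages.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(name)
    return index


def exact_search(index, query):
    """Stem the query and look it up as a whole key (no partial matches)."""
    return index.get(toy_stem(query), set())


def prefix_search(index, query):
    """Return pages for every indexed stem that starts with the stemmed query."""
    stem = toy_stem(query)
    hits = set()
    for key, names in index.items():
        if key.startswith(stem):
            hits |= names
    return hits


# Hypothetical sample content.
pages = {
    "ch01": "installing the package",
    "ch02": "installed files",
    "ch03": "search results",
}
index = build_index(pages)
```

With this sketch, exact_search finds both chapters for install, installing, or
installed, but nothing for inst, which matches what I see in the demo.
prefix_search would return both chapters for inst as well, while a query such
as nstall would still return nothing, since it is not the start of any stem.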
Regards,
Robert