On Wed, Mar 9, 2011 at 1:36 PM, Peter Desjardins <peter.desjardins.us@gmail.com> wrote:

> Hi.
>
> I'm producing webhelp output
> (http://www.thingbag.net/docbook/gsoc2010/doc/content/index.html) and
> I noticed that when I search for the term "nucleus," the webhelp
> search function removes the letter s and searches for "nucleu."
> "Nucleus" is a commonly used term in my document. I see the same
> behavior with the search term "zeus" and "tutus" becomes "tutu."
>
> Is this a configurable behavior? Is the search function purposely
> simplifying my terms?
>

Hi Peter,

Searching is performed on the stemmed words of the given query, i.e. the
search purposely reduces each search term to its root form to provide better
search support. Link [1] has a small introduction to what a stemmer does
and the limitations it has. WebHelp uses the Porter stemmer for English [2], and
Snowball stemmers for several other languages [3].
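
As a rough illustration of why 'nucleus' turns into 'nucleu', here is a
minimal Python sketch of just the first plural-stripping step (step 1a) of
the Porter algorithm, plus a toy index. This is not the actual WebHelp code:
the names step_1a, build_index and search, and the sample documents, are
made up for this example, and the real stemmer applies several more steps.

```python
def step_1a(word: str) -> str:
    """Apply only Porter step 1a (plural endings); the full algorithm has more steps."""
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    if word.endswith("ss"):
        return word       # caress stays caress
    if word.endswith("s"):
        return word[:-1]  # cats -> cat, but also nucleus -> nucleu
    return word


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each stemmed token to the ids of the documents that contain it."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index.setdefault(step_1a(token), set()).add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Stem the query the same way the index was stemmed, then look it up."""
    return index.get(step_1a(query.lower()), set())


# Hypothetical sample content, for illustration only.
docs = {
    "ch01": "the nucleus of the cell",
    "ch02": "tutus and ballet shoes",
}
index = build_index(docs)
```

Because the same stemming is applied to both the indexed text and the query,
a search for 'nucleus' in this sketch still finds the page that contains
'nucleus', even though the term is stored and looked up as 'nucleu'.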

Does it return incorrect results for 'nucleu' when you search for 'nucleus'? We
tested the search with stemming, and it worked as expected apart from a few
glitches, which are negligible compared to the power stemming adds!

[1]
http://blog.kasunbg.org/2010/10/javascript-stemmer-for-french-language.html
[2] http://snowball.tartarus.org/algorithms/porter/stemmer.html
[3]
http://docbook.sourceforge.net/release/xsl/current/webhelp/docs/content/ch03s02.html


--Kasun


>
> Thanks.
>
> Peter Desjardins
>

-- 
~~~*******'''''''''''''*******~~~
Kasun Gajasinghe,
University of Moratuwa,
Sri Lanka.
Blog: http://blog.kasunbg.org
Twitter: http://twitter.com/kasunbg