Hi all,

I am adding search language support to Sphinx. I took the stemming algorithms from Snowball, generated Python and JavaScript versions of them, and merged them into the following branch. However, I don't use these languages myself. If you are a native speaker of one of these languages, or know the language very well, please help by trying this feature and adding stop words for the language.
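For anyone who wants to try it, here is a minimal sketch of the relevant `conf.py` setting. It assumes Sphinx is installed from the branch and that the project is written in German; `de` is only an example value taken from the language list in this mail, so substitute the code for your own language.

```python
# conf.py (fragment) -- a sketch, not a complete configuration.
# Assumption: the documentation is written in German, so the search
# index should be built with the German stemmer and stop words.
html_search_language = "de"
```

After rebuilding the HTML output, searching for an inflected form of a word should also find pages that use other forms of the same word, if the stemmer works as intended for your language.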
https://bitbucket.org/shibu/sphinx/branch/add_stemmer

*What I need:*

* Try the search language option. See: http://sphinx-doc.org/config.html?highlight=search#confval-html_search_language
* Add stop words (send a pull request, or send me the word list).

*Added Languages:*

* Danish (da)
* Dutch (nl)
* Finnish (fi)
* French (fr)
* German (de)
* Hungarian (hu)
* Italian (it)
* Norwegian (no)
* Portuguese (pt)
* Romanian (ro)
* Russian (ru)
* Spanish (es)
* Swedish (sv)
* Turkish (tr)

*References:*

Stemming: Stemming reduces each word to its stem, so a search also matches documents that use a different form of the word. This makes it easier for users to find the documents they need.
http://en.wikipedia.org/wiki/Word_stem

Stop words: These words are left out of the index. This reduces index size and noise and improves search speed. Sphinx currently defines the following words as stop words for English:
https://bitbucket.org/birkenfeld/sphinx/src/5bf9b44bcd0903b9db510c234dd24d62792570e3/sphinx/search/en.py?at=default#cl-23

Snowball: A collection of stemming algorithms.
http://snowball.tartarus.org/index.php

Python version: https://pypi.python.org/pypi/snowballstemmer
JS (JSX) version: https://npmjs.org/package/snowball-stemmer.jsx

Thanks,
--
#! /usr/bin/python2
def shibu(shibukawa, yoshiki):
    web = "http://www.shibu.jp"
    mail = "[email protected]"
    twitter = "@shibukawa"
    return "smile!"
