I am interested in using Elasticsearch for our website, suttacentral.net. 
I've tried ES and found it pleasant to use, with obvious power. The only 
challenge is that on suttacentral we host many Buddhist texts in ancient 
languages, particularly Pali, and suffice to say there are no existing 
stemmers for it. Stemming is a vital step for searching because Pali is a 
highly inflected language (like Latin). The actual stemming step is 
straightforward enough: at present we use a custom stemmer I wrote in 
Python. It's dead simple, and I wouldn't have much trouble implementing the 
same code in Java (i.e. as a function which takes an inflected word as a 
string and returns the stem as another string). Where I'm in the dark is 
making ES call that code.
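
To make the shape of the problem concrete, here is a minimal Java sketch of 
the kind of function I mean. The class name, the suffix list and the 
minimum stem length are placeholders for illustration only, not our actual 
Pali rules (which also deal with diacritics).

```java
// Hypothetical sketch: a string-in, string-out suffix-stripping stemmer.
final class PaliStemmerSketch {
    // Illustrative endings only, ordered longest first; NOT a real Pali rule set.
    private static final String[] SUFFIXES = {"assa", "ehi", "esu", "ena", "am", "o", "a"};

    // Assumed lower bound on how many characters must remain after stripping.
    private static final int MIN_STEM_LENGTH = 3;

    /** Takes an inflected word and returns its stem; unmatched words pass through unchanged. */
    static String stem(String word) {
        for (String suffix : SUFFIXES) {
            int stemLength = word.length() - suffix.length();
            if (word.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return word.substring(0, stemLength);
            }
        }
        return word;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"dhammassa", "dhammena", "buddho"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

What I am missing is the glue that would make ES apply something like 
`stem` to every token during analysis.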

All the example stemmer plugins I've found adapt existing stemmers to ES. 
What I really want is just a way to call a function on each token and use 
the return value of that function. It seems to me that this *should* be 
simple enough, but I've not managed to find any simple, minimalistic code 
to use as a template. Although it would be noble, at this point I'm not 
interested in making a proper plugin; I would be happy with the barest 
bodge/hack that would achieve the desired effect!

If anyone could point me in the right direction, either to a minimalistic 
code example or an outline of what it would involve, I would be very 
grateful.

Kind regards,
Nandiya Bhikkhu
