Hi Nandiya,

Have a look at Lucene and its source code for token filters. You'd
implement your custom stemmer as a token filter at the Lucene level, and
then just use that in ES.
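
To make that concrete, here is a minimal sketch of the piece you already
have in Python: a function that takes an inflected word and returns its
stem, plus a helper that applies it to each token. The class name, the
suffix list and the minimum stem length are all placeholders I made up for
illustration; they are not real Pali morphology, so substitute your own
rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a per-token stemming function in plain Java.
final class PaliStemSketch {
    // Placeholder endings, ordered longest first so "assa" is tried before "a".
    // NOT a real Pali suffix table; replace with the rules from your own stemmer.
    private static final String[] SUFFIXES = {"assa", "ena", "esu", "o", "a"};

    // A suffix is stripped only if at least this many characters remain,
    // which keeps short words such as "ca" intact. Assumed value.
    private static final int MIN_STEM_LENGTH = 3;

    private PaliStemSketch() {}

    /** Returns the stem of an inflected word, or the word unchanged if no rule applies. */
    static String stem(String word) {
        for (String suffix : SUFFIXES) {
            int stemLength = word.length() - suffix.length();
            if (word.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return word.substring(0, stemLength);
            }
        }
        return word;
    }

    /** Applies stem() to every token, as a token filter would do one token at a time. */
    static List<String> stemAll(List<String> tokens) {
        List<String> stems = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            stems.add(stem(token));
        }
        return stems;
    }
}
```

On the Lucene side, the hook would be a subclass of TokenFilter whose
incrementToken() advances the wrapped stream, reads the current term from
its CharTermAttribute, and replaces it with the result of stem(). How that
filter gets registered with ES depends on the ES version, so I have left
that part out.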

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



On Monday, July 7, 2014 8:57:09 PM UTC-4, Nandiya Bhikkhu wrote:
>
> I am interested in using Elasticsearch for our website, suttacentral.net.
> I've tried ES and found it pleasant to use, with obvious power. The only
> challenge is that on suttacentral we host many Buddhist texts in ancient
> languages, particularly Pali, and suffice it to say there are no existing
> stemmers for it. Stemming is a vital step for searching because Pali is a
> highly inflected language (like Latin). The actual stemming step is
> straightforward enough: at present we use a custom stemmer I wrote in
> Python. It's dead simple, and I wouldn't have much trouble implementing the
> same code in Java (i.e. as a function which takes an inflected word as a
> string and returns the stem as another string). Where I'm in the dark is
> making ES call that code.
>
> All the example stemmer plugins I've found adapt existing stemmers to ES.
> What I really want is just a way to call a function on each token and use
> the return value of that function. It seems to me that *should* be simple
> enough, but I've not managed to find any simple, minimalistic code to use
> as a template. Although it would be noble, at this point I'm not
> interested in making a proper plugin; I would be happy with the barest
> bodge/hack that would achieve the desired effect!
>
> If anyone could point me in the right direction, either to a minimalistic
> code example or to an outline of what it would involve, I would be very
> grateful.
>
> Kind regards,
> Nandiya Bhikkhu
>
