Dear all,

We are implementing some Thai language capabilities. The pre-trained OpenNLP models available at http://opennlp.sourceforge.net/models/thai/ (tokenization, POS tagging, sentence detection) seem to be quite old: they were built with version 1.4, and we are using 1.6. Does anyone know which Thai corpora were used to train these models? If I can find the corpora, I can retrain the models with the newer version of OpenNLP.
Alternatively, if anyone has Thai models (tokenization, POS tagging, sentence detection) that are compatible with OpenNLP 1.6 and is willing to share them, that would be helpful.

Shane