Hi Soubhik,
Should already be supported.
You have to pass the -encoding utf8 to the command line interface.
James
On 5/30/2012 1:52 PM, Soubhik (সৌভিক) wrote:
> Hi,
>
> I'm trying to use OpenNLP to train a sentence detector for Bengali language
> ("bn"). I would like to add support for Unicode danda character in
> opennlp.tools.sentdetect.lang.Factory
> class. this character is a sentence break in Bengali, Hindi and several
> other Indian languages. the code change should be small (< 10 lines).
>
> Is it correct to think that a change of this size will not require a CLA?
>
> Ref: en.wikipedia.org/wiki/*Danda*
>
> Regards,
> Soubhik.
> --
>