Thank you very much for your answer. But I'm new to this field and I don't know how to create nonbreaking_prefix files. Is there any particular way of doing this? Could you explain a bit more?

On Wed, May 30, 2012 at 6:13 PM, Tom Hoar <[email protected]> wrote:

> Build your own nonbreaking_prefixes file. Name it with the extension you
> want to use and save it in the nonbreaking_prefixes subfolder under the
> moses scripts/tokenizer folder. The existing files are commented with
> instructions to help you.
>
> Tom
>
> On Wed, 30 May 2012 17:37:19 +0530, tharaka weheragoda
> <[email protected]> wrote:
>
> Hi everybody,
>
> When I try to tokenize my Sinhala dataset, it gives me a warning
> message like this:
> "WARNING: No known abbreviations for language 'si', attempting fall-back
> to English version..."
>
> And my letters have changed a bit. Is there any way to tokenize Sinhala
> data with this tokenizer.perl?
>
> I'm looking forward to your help.
>
> Thanks in advance!
> Tharaka

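For anyone following the thread, the steps Tom describes could be sketched roughly as below. This is only a sketch under assumptions, not the official procedure: it assumes the tokenizer looks for a file named nonbreaking_prefix.<lang> (so nonbreaking_prefix.si here), that the file holds one abbreviation per line with '#' starting a comment, and that an entry followed by #NUMERIC_ONLY# is protected only before a number, as the comments in the existing English file suggest. The helper name write_prefix_file and the sample entries are hypothetical placeholders; real Sinhala abbreviations would have to replace them.

```python
from pathlib import Path
import tempfile


def write_prefix_file(folder, lang, prefixes, numeric_only=()):
    """Write nonbreaking_prefix.<lang> into `folder` and return its path.

    Hypothetical helper. Assumed file format: one abbreviation per line,
    '#' starts a comment, and 'X #NUMERIC_ONLY#' marks an abbreviation
    that is only non-breaking when a number follows it.
    """
    lines = [
        "# Abbreviations that should not be split from a following period.",
        "# One entry per line; lines starting with '#' are comments.",
    ]
    lines += list(prefixes)
    lines += [f"{p} #NUMERIC_ONLY#" for p in numeric_only]

    path = Path(folder) / f"nonbreaking_prefix.{lang}"
    # UTF-8 so that non-Latin scripts such as Sinhala are stored intact.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # A temporary folder stands in for the nonbreaking_prefixes subfolder
    # of your Moses checkout; point `folder` at the real one instead.
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder entries only; replace them with real abbreviations.
        out = write_prefix_file(tmp, "si", ["Dr", "Prof"], numeric_only=["No"])
        print(out.name)
        print(out.read_text(encoding="utf-8"))
```

Once such a file sits in the nonbreaking_prefixes folder, running tokenizer.perl with -l si should, under the assumptions above, pick it up instead of falling back to the English list.
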
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
