Hi,

Thanks for the reply. The problem is that the Indian regional language does
not use the Roman script, and even its punctuation marks are different.
So how does Moses align sentences when it does not know the sentence
terminator?
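
To make the terminator issue concrete, here is a minimal Python sketch of
splitting text into one sentence per line. It is only an illustration under
stated assumptions: it assumes Devanagari-style punctuation, where the danda
(U+0964) and double danda (U+0965) end a sentence, and the names TERMINATORS
and split_sentences are hypothetical, not part of Moses. The terminator set
would need adjusting for the actual language.

```python
import re

# Assumed terminator set: Devanagari danda (U+0964) and double danda (U+0965),
# plus the usual ASCII terminators. Adjust this for the actual language.
TERMINATORS = "\u0964\u0965.?!"

# Split on whitespace that directly follows one of the terminators.
_SPLIT_RE = re.compile(r"(?<=[" + re.escape(TERMINATORS) + r"])\s+")


def split_sentences(text):
    """Return the sentences in text, each keeping its terminator."""
    return [part.strip() for part in _SPLIT_RE.split(text) if part.strip()]


if __name__ == "__main__":
    sample = "यह पहला वाक्य है। यह दूसरा वाक्य है।"
    for sentence in split_sentences(sample):
        print(sentence)
```

The same function handles the English side, since the ASCII terminators are
in the same set.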

Also, Moses has a lowercasing step, but there is no concept of letter case in
the Indian regional language. How should I handle that step?
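
As a small illustration of the lowercasing question, the Python sketch below
shows what a generic Unicode-aware lowercasing does to caseless text. It
assumes Hindi in Devanagari as the example language, and Python's str.lower()
stands in for a lowercasing step; it is not the Moses script itself, and
lowercase_line is a hypothetical helper name.

```python
def lowercase_line(line):
    """Lowercase a line; characters that have no case come back unchanged."""
    return line.lower()


if __name__ == "__main__":
    hindi = "यह एक वाक्य है।"
    english = "This Is A Sentence."
    # Devanagari has no upper/lower distinction, so the text is unchanged.
    print(lowercase_line(hindi) == hindi)
    # The English side is lowercased as usual.
    print(lowercase_line(english))
```

Under these assumptions, lowercasing only changes the English side of the
corpus and leaves the caseless side as it is.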
---
Nirav Shah

On Thu, Sep 18, 2008 at 2:34 AM, Francis Tyers <[EMAIL PROTECTED]> wrote:

> On Thu, 18-09-2008 at 02:30 +0800, Nirav wrote:
> > Hi,
> >
> > I would like to know how to align two files, one containing Unicode
> > characters (an Indian regional language) and the other containing ASCII
> > text (English).
> > Also, are any changes needed to train and evaluate the model?
>
> It should Just Work™ -- afaik all the tools work with Unicode text,
> although depending on the regional language in question you might
> benefit from pre-tokenisation.
>
> Fran
>
>
>