Re: [Moses-support] Training with multiple text files

Philipp Koehn Wed, 11 Nov 2015 07:30:03 -0800

Hi,

You have to convert your parallel text files yourself into the format that
Moses expects, i.e., two files, one for English and one for Hindi, where
line x in the English file corresponds to line x in the Hindi file.
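
A minimal sketch of that conversion, assuming the many text files come in
pairs that share a base name and differ only in extension (for example
health01.en / health01.hi) and that each pair is already line-aligned. The
directory layout, the extensions, and the function name
merge_parallel_files are assumptions for illustration, not anything Moses
prescribes.

```python
from pathlib import Path


def read_lines(path):
    """Return the lines of a UTF-8 text file without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def merge_parallel_files(corpus_dir, out_prefix, src_ext="en", tgt_ext="hi"):
    """Concatenate paired files into <out_prefix>.<src_ext> and <out_prefix>.<tgt_ext>.

    Hypothetical helper: assumes every *.<src_ext> file in corpus_dir has a
    partner with the same base name and the same number of lines.
    Returns the number of sentence pairs written.
    """
    corpus_dir = Path(corpus_dir)
    out_src = Path(f"{out_prefix}.{src_ext}")
    out_tgt = Path(f"{out_prefix}.{tgt_ext}")

    # List the inputs before the outputs are created, so that an output file
    # placed in the same directory is never picked up as an input.
    src_paths = sorted(
        p for p in corpus_dir.glob(f"*.{src_ext}") if p.resolve() != out_src.resolve()
    )

    written = 0
    with open(out_src, "w", encoding="utf-8") as fs, \
         open(out_tgt, "w", encoding="utf-8") as ft:
        for src_path in src_paths:
            tgt_path = src_path.with_suffix(f".{tgt_ext}")
            if not tgt_path.exists():
                raise FileNotFoundError(f"no partner file for {src_path}")
            src_lines = read_lines(src_path)
            tgt_lines = read_lines(tgt_path)
            if len(src_lines) != len(tgt_lines):
                # Unequal lengths mean the pair is not line-aligned yet.
                raise ValueError(f"line count mismatch in {src_path.name}")
            for src, tgt in zip(src_lines, tgt_lines):
                src, tgt = src.strip(), tgt.strip()
                if not src or not tgt:
                    continue  # drop pairs where either side is empty
                fs.write(src + "\n")
                ft.write(tgt + "\n")
                written += 1
    return written
```

Calling merge_parallel_files("corpus", "train") would then leave train.en
and train.hi, which can be tokenized and cleaned in the same way as the
corpus in the baseline instructions.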


If this is rather raw data, you may have to run sentence alignment on
your data, using tools such as Hunalign.
http://mokk.bme.hu/en/resources/hunalign/
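
Once a sentence aligner has been run, its output still has to be split
into the two line-aligned files. The sketch below assumes the aligner's
output has been saved as tab-separated lines of the form source segment,
target segment, score. Whether Hunalign emits exactly this layout depends
on how it is invoked, so check its documentation. The function name
split_aligned and the min_score threshold are hypothetical.

```python
def split_aligned(lines, min_score=0.0):
    """Split 'source<TAB>target<TAB>score' lines into two parallel lists.

    Assumed input format (not verified against any specific aligner).
    Lines that are malformed, have an empty side, or score below
    min_score are skipped.
    """
    sources, targets = [], []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # not a source/target/score triple
        src, tgt, score_text = parts
        try:
            score = float(score_text)
        except ValueError:
            continue  # score column is not a number
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt or score < min_score:
            continue
        sources.append(src)
        targets.append(tgt)
    return sources, targets
```

The two returned lists have the same length, so writing them out one
item per line gives the pair of files described above.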

-phi

On Wed, Nov 11, 2015 at 5:18 AM, Sunayana Gawde <[email protected]>
wrote:

> Hello all,
> I have built a baseline machine translation system as described on the Moses
> website. I also built an MT system for an English-Hindi parallel corpus
> available online, but I am getting a very low BLEU score, i.e., 5.31. Now
> I have parallel text in English and Hindi from a health and tourism corpus,
> which contains many text files. How do I train the system with multiple text
> files? I am only familiar with building the baseline system. Is there
> anything else I need specifically for Hindi?
> Please help. Thanks.
>
> --
> *Thanks & Regards*
>
> Ms. Sunayana R. Gawde.
>
> DCST, Goa University.
> *Please don't print this e-mail unless you really need to.*
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
