Hi, the format is really straightforward.
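As a rough illustration, assuming the format documented on the Moses site: one line per sentence pair, holding space-separated `i-j` links, where `i` is the zero-based index of a source word and `j` the zero-based index of a target word. The helper names below (`format_alignment`, `parse_alignment`, `check_alignment`) and the example sentence pair are hypothetical and not part of Moses.

```python
def format_alignment(links):
    """Render (source_index, target_index) pairs as one alignment line."""
    # Sort and de-duplicate so the output is stable; indices are zero-based.
    return " ".join(f"{s}-{t}" for s, t in sorted(set(links)))


def parse_alignment(line):
    """Parse one alignment line back into a list of (source, target) pairs."""
    pairs = []
    for token in line.split():
        s, t = token.split("-")
        pairs.append((int(s), int(t)))
    return pairs


def check_alignment(links, source_tokens, target_tokens):
    """Return True if every link points at a word inside its sentence."""
    return all(
        0 <= s < len(source_tokens) and 0 <= t < len(target_tokens)
        for s, t in links
    )


# Hypothetical Portuguese-English sentence pair with a one-to-one alignment.
source = "a casa é azul".split()
target = "the house is blue".split()
links = [(0, 0), (1, 1), (2, 2), (3, 3)]

line = format_alignment(links)
print(line)  # 0-0 1-1 2-2 3-3
```

A manually aligned corpus would then be written out as one such line per sentence pair, in the same order as the source and target text files; a word with several links simply appears in several pairs.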
For details see here:
http://www.statmt.org/moses/?n=FactoredTraining.AlignWords

-phi

2011/12/15 Sérgio Castro <[email protected]>:
> Hello,
>
> Thank you for your reply.
> I was wondering, are there any more extensive examples of the Moses word
> alignment format?
>
> Thank you,
> Best regards,
> Sérgio Castro
>
>
> On 12/12/2011 09:55 PM, Philipp Koehn wrote:
>>
>> Hi,
>>
>> typically the training process includes an early stage of
>> automatic word alignment. You can simply skip this step
>> (by running train-model.perl with the option "--first-step 4").
>>
>> You will have to reformat the data from the XML format you
>> have into the Moses word alignment format which is
>> described on the Moses web site.
>>
>> Regards,
>> Philipp
>>
>> 2011/12/12 Sérgio Castro <[email protected]>:
>>>
>>> Dear All,
>>>
>>>
>>> We are in the process of manually translating and aligning the WSJ for
>>> Portuguese-English and would like to use the result of this effort to
>>> train a Moses model for automatic translation.
>>> For the manual alignment we are using the Sanchay software by Anil Kumar
>>> Singh which results in its own "XML" format.
>>> We would like to know if there is any way to integrate this aligned
>>> corpus into the Moses training process in order to use this information
>>> to build a translation model. We have converted the format to the format
>>> used in the Giza++ output but still can't seem to use it to feed the
>>> training process.
>>>
>>> Any help will be welcome,
>>> Thank you,
>>>
>>> Sérgio Castro
>>> Department of Informatics
>>> University of Lisbon
>>> Tel: +351 217500738
>>> [email protected]
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
