Hi,

typically the training process includes an early stage of automatic word alignment. You can simply skip this step by running train-model.perl with the option "--first-step 4".
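Skipping the alignment steps means the training script needs a ready-made alignment file in their place. Below is a minimal sketch of writing one, assuming the manual links have already been read out of the XML into (source index, target index) pairs per sentence. It also assumes the usual Moses convention of one line per sentence pair, holding space-separated "i-j" tokens with zero-based word positions; please check this against the format description on the Moses web site. The function names and the input structure are hypothetical, and parsing the Sanchay XML itself is not covered.

```python
def to_moses_alignment(links):
    """Format one sentence pair's links as a single alignment line.

    `links` is an iterable of (source_index, target_index) pairs,
    assumed to be zero-based word positions.
    """
    # Drop duplicate links and sort them for stable, readable output.
    return " ".join(f"{s}-{t}" for s, t in sorted(set(links)))


def write_alignments(corpus_links, path):
    """Write one alignment line per sentence pair to `path`.

    A sentence pair with no links still gets an (empty) line, so the
    file keeps the same number of lines as the parallel corpus.
    """
    with open(path, "w", encoding="utf-8") as out:
        for links in corpus_links:
            out.write(to_moses_alignment(links) + "\n")
```

For example, to_moses_alignment([(1, 2), (0, 0), (2, 1)]) gives "0-0 1-2 2-1". If the manual annotation counts words from 1, subtract 1 from each index before calling it.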
You will have to convert the data from the XML format you have into the Moses word alignment format, which is described on the Moses web site.

Regards,
Philipp

2011/12/12 Sérgio Castro <[email protected]>:
> Dear All,
>
> We are in the process of manually translating and aligning the WSJ for
> Portuguese-English and would like to use the result of this effort to
> train a Moses model for automatic translation.
> For the manual alignment we are using the Sanchay software by Anil Kumar
> Singh which results in its own "XML" format.
> We would like to know if there is any way to integrate this aligned
> corpora into the Moses training process in order to use this information
> to build a translation model. We have converted the format to the format
> used in the Giza++ output but still can't seem to use it to feed the
> training process.
>
> Any help will be welcome,
> Thank you,
>
> Sérgio Castro
> Department of Informatics
> University of Lisbon
> Tel: +351 217500738
> [email protected]
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
