I am working on statistical and hybridized methods for machine translation in Indian languages. An associate of mine recently submitted a paper to CICLing describing an alignment algorithm tailored for Indian languages (the English-Hindi pair, to be exact). The algorithm reports slightly better accuracy than GIZA++ and handles many-to-many alignments particularly well. We would like to use the alignments produced by this algorithm in Moses to extract the phrase tables, and later for hierarchical MT as well.
I am familiar with the architecture of Moses, and know that the 2nd and 3rd steps involve computing alignments in both directions, while the 4th step applies the heuristic (grow, union, ...) to obtain the final alignments. These alignments are then used to extract the phrase pairs.

My first question is: what would be the best way to incorporate our alignments into Moses? One way would be to duplicate the files generated by GIZA++ in steps 2 and 3, and start the training procedure from step 4. However, I was wondering whether there is a simpler method for using customized alignments in Moses.

My second question concerns bilingual dictionaries. If I wanted to use a bilingual dictionary in the MT process, would it be better to pass the dictionary to GIZA++ while computing the alignments, or to augment the corpus with the dictionary entries? Most of the target-side entries in our dictionary are multiword expressions (MWEs), and augmenting the corpus did not bring any improvement in our experiments.

Could you kindly suggest an appropriate method for this context?

Regards,
Prasanth
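
To make the first question concrete, here is a minimal sketch of the conversion step it implies. It assumes, without having confirmed it against the Moses scripts, that the symmetrized alignment file holds one line per sentence pair, written as 0-based `source-target` index pairs such as `0-0 1-2`. The function names are hypothetical, and only the simple intersection and union heuristics are shown, not the grow-based ones.

```python
from typing import Iterable, List, Set, Tuple

# One alignment link: (source_index, target_index), both 0-based.
Link = Tuple[int, int]


def symmetrize(src2tgt: Set[Link], tgt2src: Set[Link],
               heuristic: str = "intersection") -> Set[Link]:
    """Combine two directional alignments of one sentence pair.

    Both inputs are expressed as (source, target) links. Only the two
    simplest heuristics are implemented in this sketch.
    """
    if heuristic == "intersection":
        return src2tgt & tgt2src
    if heuristic == "union":
        return src2tgt | tgt2src
    raise ValueError(f"unsupported heuristic: {heuristic}")


def to_pair_line(links: Iterable[Link]) -> str:
    """Serialize links as space-separated 'i-j' pairs, sorted for stability."""
    return " ".join(f"{s}-{t}" for s, t in sorted(links))


def alignment_lines(corpus: Iterable[Set[Link]]) -> List[str]:
    """One output line per sentence pair; an empty alignment gives an empty line."""
    return [to_pair_line(links) for links in corpus]


if __name__ == "__main__":
    # Toy directional alignments for a single three-word sentence pair.
    forward = {(0, 0), (1, 1), (2, 1)}
    backward = {(0, 0), (1, 1), (1, 2)}
    print(to_pair_line(symmetrize(forward, backward, "intersection")))
    print(to_pair_line(symmetrize(forward, backward, "union")))
```

Since the custom aligner already produces many-to-many links, the symmetrization function may be unnecessary, and only the serialization step would be needed before phrase extraction.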
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
