Dear All,

Thanks to everyone who responded to my questions about Giza++. Here is a summary of the responses:

Question 1:

> I am working on an MT project, which uses Giza++ to build the translation
> model, and I have some questions about it: If I make changes in the
> alignment file (*.A3.*), which is one of Giza++'s output files, would it
> affect the decoding subsequently? If not, is there any way (or any
> output files that I can change) to improve the alignment in the
> translation model?

No. The *.A3.* files contain only a sample of the best (Viterbi) alignment
for each sentence; they have no influence on the decoding process.

Decoding with Model3 uses:
  *.a3.final, *.d3.final, *.n3.final, *.p0_3.final, *.t3.final

Decoding with Model4 uses:
  *.a3.final, *.d4.final, *.n3.final, *.p0_3.final, *.t3.final

Thanks to Jan Curin
---------------------------

Alignment can be improved by using a dictionary file, which biases the
alignment model.

Thanks to Paul Johnston
----------------------------

One could tweak the probability files, which are used by the ISI decoder.
But it is probably better to improve the alignments by adjusting the number
of training iterations used for each of the IBM Models, or by pre- and
post-processing the training data.

Thanks to Chris Callison-Burch
--------------------------------------
--------------------------------------

Question 2:

> Also, when Giza++ talks about source language and target language, does it
> mean translation source and target, or NCM (Noisy Channel
> Model) source and target, which is the other way around. For example, if I
> am translating from English to Chinese, should it be source:English,
> target:Chinese, as in the normal translation terminology; or
> source:Chinese, target:English, as in the NCM terminology?

Giza++ uses the NCM definition of source and target. In the above example,
therefore, Chinese is the source and English is the target.

Thanks to Jan Curin, Paul Johnston and Chris Callison-Burch
-----------------

Thanks also to Colin Brace for his kind reminder :)

*************************
Huiling Jin
CS Dept.
SUNY Stony Brook

--
For MT-List info, see http://www.eamt.org/mt-list.html
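
The source/target convention from Question 2 is easy to get backwards, so here is a minimal sketch of it in Python. It only encodes the answer given in this summary (the language you translate into is the NCM source, the language you translate from is the NCM target). The names GizaRoles and giza_roles are hypothetical helpers made up for illustration; they are not part of Giza++.

```python
from typing import NamedTuple


class GizaRoles(NamedTuple):
    """Hypothetical container for the noisy-channel roles of two languages."""
    source: str
    target: str


def giza_roles(translate_from: str, translate_to: str) -> GizaRoles:
    """Map a translation direction onto NCM source/target roles.

    Assumption (taken from the summary above): under the NCM convention the
    language being translated *into* is the source, and the language being
    translated *from* is the target.
    """
    return GizaRoles(source=translate_to, target=translate_from)


# Translating from English to Chinese, as in the example in Question 2.
roles = giza_roles("English", "Chinese")
print(f"source: {roles.source}, target: {roles.target}")
# prints "source: Chinese, target: English"
```

In other words, the two roles are simply the translation direction reversed.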
