Hi Yaqin,

A source-side word lattice might help in this case. Please refer to the related section in the following paper:
Christopher Dyer, Smaranda Muresan, and Philip Resnik. Generalizing Word Lattice Translation. In Proceedings of ACL-08: HLT (June 2008), pp. 1012-1020.

Best regards,
Jie Jiang
Senior Language Technology Specialist
Capita Translation and Interpreting
Riverside Court, Huddersfield Road, Delph, Oldham, OL3 5FZ
Tel (UK): +44 845 367 7000 | Tel (US): +1 (800) 579-5010
Tel Direct: +44 (0)844 854 8984 | [email protected] | Skype ID: jie.jiang-capita-ti
www.capitatranslationinterpreting.com

2013/1/18 Yaqin <[email protected]>
> Dear all,
>
> I'm using the Moses phrase-based system to translate from Chinese to English.
>
> I found that a lot of the unknown words in the translation results for the
> test data are caused by segmentation differences between the training data
> and the test data on the Chinese side.
>
> For example, "全球化" (globalization) is segmented as one word in the test
> data, while it is segmented into two words, "全球" and "化", in the training
> data. Thus, "全球化" is not recognized and fails to be translated.
>
> Does anyone have any suggestions on this problem?
>
> Thanks,
> Yaqin
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
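
To illustrate the lattice idea for the "全球化" case, here is a minimal sketch that writes one test sentence as a lattice offering both segmentations, so the decoder can follow whichever path the phrase table covers. It assumes the PLF-style notation that Moses uses for lattice input (one tuple per node, each arc given as label, score, and distance to the target node). The function name `to_plf`, the `splits` table, the uniform 0.5 scores, and the extra token "的" are hypothetical choices for this example and are not taken from the paper.

```python
def to_plf(tokens, splits):
    """Encode a segmented sentence as a PLF-style lattice string.

    tokens: the sentence as segmented in the test data.
    splits: hypothetical table mapping a coarse token to the finer pieces
            used in the training data.
    Note: labels containing quotes or backslashes are not escaped here.
    """
    nodes = []  # nodes[i] holds the (label, score, distance) arcs leaving node i
    for tok in tokens:
        pieces = splits.get(tok)
        if not pieces or len(pieces) < 2:
            # No alternative segmentation: a single arc to the next node.
            nodes.append([(tok, 1.0, 1)])
            continue
        # Branch point: the whole token skips over the intermediate nodes,
        # while the first piece starts the finer-grained path.
        # The 0.5 scores are an arbitrary uniform choice (assumption).
        nodes.append([(tok, 0.5, len(pieces)), (pieces[0], 0.5, 1)])
        for piece in pieces[1:]:
            nodes.append([(piece, 1.0, 1)])

    def fmt(arc):
        return "('%s',%s,%d)" % arc

    body = "".join(
        "(" + "".join(fmt(a) + "," for a in arcs) + "),"
        for arcs in nodes
    )
    return "(" + body + ")"


if __name__ == "__main__":
    print(to_plf(["全球化", "的"], {"全球化": ["全球", "化"]}))
    # ((('全球化',0.5,2),('全球',0.5,1),),(('化',1.0,1),),(('的',1.0,1),),)
```

The decoder has to be run in its lattice input mode for such a line to be read as a lattice; the exact setting, and how arc scores should be weighted, are best checked against the Moses documentation.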
