Hi,

Thank you for your suggestions.

I have run some tests. They showed that both English-to-Chinese and Chinese-to-English training would fail if I took no extra measures. Suzy and Tom gave me the useful advice to segment the text into words. My follow-up question is: how should I do the segmentation? Could anybody with experience training on either English-to-Chinese or Chinese-to-English corpora give me some ideas?

Thank you very much.

Regards,
James

2011/8/17 Tom Hoar <[email protected]>
> I agree with Suzy. Also, if your translation requests are not
> segmented, it's possible that the training corpus was also not
> segmented. Verify that your training corpus, development and test sets
> were all segmented when you trained/tuned your translation model. If
> not, you'll need to start from the beginning.
>
> Tom
>
>
> On Wed, 17 Aug 2011 19:28:17 +1000, Suzy Howlett <[email protected]>
> wrote:
> > Hi James,
> >
> > It looks like the text has not been segmented into words, so it
> > thinks every sentence is a single word. Unless the sentences you are
> > trying to translate are identical to some sentences in the training
> > corpus, it will think every test sentence is an unknown word it's
> > never seen before. You'll need to use some kind of word segmentation.
> > Unfortunately I don't know anything about that area, so I have no
> > useful suggestions.
> >
> > Best,
> > Suzy
> >
> > On 17/08/11 7:13 PM, 蒋乾 wrote:
> >> Hi all,
> >>
> >> When I used MT to translate from Chinese to English, I ran into an
> >> unexpected problem. Could you please tell me the reason if you have
> >> any idea about it?
> >>
> >> I trained on a large parallel corpus of about 2,600,000 lines on a
> >> computer with 5 GB RAM.
> >> After that, I tried translating a small Chinese file of about 80
> >> lines into English. Unexpectedly, it didn't work.
> >> It did not do any translation at all. The target file I got was the
> >> same as the source file.
> >>
> >> One sample of the information shown on the screen during MT's
> >> translation is as follows:
> >>
> >> "
> >> Translating: 使用文本索引查询视图
> >> Collecting options took 0.000 seconds
> >> Search took 0.000 seconds
> >> BEST TRANSLATION: 使用文本索引查询视图|UNK|UNK|UNK [1]
> >> [total=-99.978] <<0.000, -1.000, -100.000, 0.000, 0.000, 0.000,
> >> 0.000, 0.000, 0.000, -7.346, 0.000, 0.000, 0.000, 0.000, 0.000>>
> >> Translation took 0.000 seconds
> >> Finished translating
> >> Translating: 使用文本索引查询视图关于
> >> Collecting options took 0.000 seconds
> >> Search took 0.000 seconds
> >> BEST TRANSLATION: 使用文本索引查询视图关于|UNK|UNK|UNK [1]
> >> [total=-99.978] <<0.000, -1.000, -100.000, 0.000, 0.000, 0.000,
> >> 0.000, 0.000, 0.000, -7.346, 0.000, 0.000, 0.000, 0.000, 0.000>>
> >> Translation took 0.000 seconds
> >> Finished translating
> >> "
> >>
> >> I would very much appreciate it if you could tell me why this
> >> happens and how to solve it.
> >>
> >> Thank you very much.
> >>
> >> Regards,
> >> James
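
On the "how to do segment" question at the top of this thread, here is a minimal sketch of one simple approach, forward maximum matching against a word list. It is only an illustration of the idea, not the method anyone in this thread used: the function name `segment`, the `max_len` parameter and the tiny `LEXICON` are all assumptions made up for this example, and a real setup would need a proper Chinese lexicon or a trained segmenter. Whatever segmenter is chosen, the same one would have to be applied to the training corpus, the tuning and test sets, and every sentence sent for translation.

```python
def segment(sentence, lexicon, max_len=4):
    """Split an unsegmented sentence into words by forward maximum matching.

    At each position the longest substring (up to max_len characters) found
    in the lexicon is taken as the next word. A character that starts no
    known word is emitted on its own.
    """
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            # size == 1 is the fallback: always accept a single character.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon covering only the sample sentences in this thread.
LEXICON = {"使用", "文本", "索引", "查询", "视图", "关于"}

if __name__ == "__main__":
    for line in ["使用文本索引查询视图", "使用文本索引查询视图关于"]:
        # Words are joined with spaces so that whitespace tokenization
        # sees several tokens instead of one long unknown word.
        print(" ".join(segment(line, LEXICON)))
```

With the toy lexicon, the first sample sentence comes out as five space-separated tokens rather than a single unknown token. Greedy matching picks the wrong split on ambiguous strings, so this is a starting point for experiments rather than a production segmenter.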
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
