Assuming your input, "由于时间因素至关重要", contains more than one word, it looks like you have not word-segmented/tokenized your Chinese text. If you trained on your corpus without segmentation, Moses/GIZA++ will treat each sentence in your Chinese source language as though it were a single word. This would cause the results you're getting. Try using the Stanford Segmenter, http://nlp.stanford.edu/projects/chinese-nlp.shtml [1], to segment/tokenize the Chinese half of your corpus, and then re-train.
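To illustrate why segmentation matters, here is a minimal Python sketch. It assumes that tokens in the training and input text are separated by whitespace; the helper names `count_tokens` and `looks_unsegmented`, the 0.9 threshold, and the example segmentation are hypothetical and are not part of Moses or the Stanford Segmenter.

```python
def count_tokens(sentence: str) -> int:
    """Count whitespace-separated tokens, the way a whitespace-based toolkit sees a line."""
    return len(sentence.split())


def looks_unsegmented(lines, threshold=0.9):
    """Heuristic (assumption): True if at least `threshold` of non-empty lines are a single token."""
    non_empty = [ln for ln in lines if ln.strip()]
    if not non_empty:
        return False
    single = sum(1 for ln in non_empty if count_tokens(ln) == 1)
    return single / len(non_empty) >= threshold


raw = "由于时间因素至关重要"
# Hypothetical segmentation for illustration only; a real segmenter may split differently.
segmented = "由于 时间 因素 至关 重要"

print(count_tokens(raw))        # 1: the whole sentence is one unknown "word"
print(count_tokens(segmented))  # 5: five tokens that can be looked up separately
```

Running `looks_unsegmented` over the lines of the Chinese side of the corpus gives a quick, rough indication of whether segmentation was skipped. Whatever segmentation is applied to the training data would also need to be applied to the test input.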
If you have segmented your Chinese half, then is it possible that you have accidentally trained your model with English as the source language and Chinese as the target?

On Wed, 16 May 2012 18:49:25 +0800, 马洪宾 wrote:

Is it because my training corpus is too small? For performance reasons I only use 90000 sentences. I picked some entries from the phrase table to check, like "我" and "我国政府". When I try "我", it is translated to "I", but it still can't translate "我国政府" (even though it's in the phrase table). Is that normal? Thanks!

---------- Forwarded message ----------
From: 马洪宾
Date: Wed, May 16, 2012 at 6:22 PM
Subject: Re: RE: RE: [Moses-support] UPDATED: moses training error
To: [email protected] [3]

Hey, guys,

I believe my previous problem was caused by some noise in my corpus. I've tackled it now. I've now got through the training process (no tuning yet), and I have a moses.ini in my train/model/ directory anyway. I use this moses.ini to run a test (according to the official tutorial, this should make sense):

hongbin@ubuntu:~/working1/train/model$ echo "由于时间因素至关重要"|~/mosesdecoder/dist/bin/moses -f moses.ini
Defined parameters (per moses.ini or switch):
config: moses.ini
distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 /home/hongbin/working1/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
distortion-limit: 6
input-factors: 0
lmodel-file: 8 0 3 /home/hongbin/lm/corpus.blm.en
mapping: 0 T 0
ttable-file: 0 0 0 5 /home/hongbin/working1/train/model/phrase-table.gz
ttable-limit: 20
weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
weight-l: 0.5000
weight-t: 0.20 0.20 0.20 0.20 0.20
weight-w: -1
Loading lexical distortion models...have 1 models
Creating lexical reordering...
weights: 0.300 0.300 0.300 0.300 0.300 0.300
Loading table into memory...done.
Start loading LanguageModel /home/hongbin/lm/corpus.blm.en : [72.000] seconds
Finished loading LanguageModels : [73.000] seconds
Start loading PhraseTable /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
filePath: /home/hongbin/working1/train/model/phrase-table.gz
Finished loading phrase tables : [73.000] seconds
Start loading phrase table from /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
Reading /home/hongbin/working1/train/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [108.000] seconds
IO from STDOUT/STDIN
Created input-output object : [108.000] seconds
Translating line 0 in thread id 140004348253952
Translating: 由于时间因素至关重要
Collecting options took 0.000 seconds
Search took 0.000 seconds
由于时间因素至关重要
BEST TRANSLATION: 由于时间因素至关重要|UNK|UNK|UNK [1] [total=-104.508] 0-0
Translation took 0.000 seconds
Finished translating

It seems that it has not even tried to translate from Chinese to English! What's wrong with this? I checked the phrase table and the language model file, and they seem to be normal. Could you please help me with this?

Hongbin

On Wed, May 16, 2012 at 1:13 PM, lixianhua wrote:

There's a clean-corpus-n.perl in Moses; find it and clean your corpus like:

./clean-corpus-n.perl corpus l1 l2 clean-corpus 1 100

From: 马洪宾 [mailto:[email protected] [5]]
Sent: May 16, 2012 13:09
To: lixianhua
Subject: Re: RE: [Moses-support] UPDATED: moses training error

I think you're right. Do you have any batch script to run the cleaning?
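As a rough illustration of the length filtering that the `clean-corpus-n.perl` call above asks for, here is a minimal Python sketch. It assumes the trailing `1 100` arguments are the minimum and maximum sentence length in whitespace-separated tokens; the function name `clean_pairs` and the sample sentences are hypothetical, and the real script applies further checks that this sketch leaves out.

```python
def clean_pairs(src_lines, tgt_lines, min_len=1, max_len=100):
    """Keep only sentence pairs where both sides have min_len..max_len tokens."""
    kept = []
    for src, tgt in zip(src_lines, tgt_lines):
        n_src = len(src.split())
        n_tgt = len(tgt.split())
        # Drop the pair if either side is empty, too short, or too long.
        if min_len <= n_src <= max_len and min_len <= n_tgt <= max_len:
            kept.append((src.strip(), tgt.strip()))
    return kept


# Hypothetical parallel data: the second pair has an empty source side.
src = ["我 爱 你", "", "你 好"]
tgt = ["I love you", "empty source side", "hello"]

print(clean_pairs(src, tgt))
```

Filtering both sides together keeps the two files line-aligned, which the later training steps rely on.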
On Wed, May 16, 2012 at 12:10 PM, lixianhua wrote:

There must be something wrong with your extract process. I suggest cleaning your corpus, as well as deleting the | [ ] characters in your corpus. Then run the train script.

From: [email protected] [7] [mailto:[email protected] [8]] On Behalf Of 马洪宾
Sent: May 16, 2012 11:28
To: [email protected] [9]

-- Hongbin MA(马洪宾) Department of Computer Science and Engineering, Shanghai Jiao Tong University. Mobile: (86)188-1755-4825

Hi,

I'm trying out a Chinese-English baseline system using the latest Moses. I'm running it on a 64-bit Ubuntu server. Although I strictly followed the tutorial http://www.statmt.org/moses/?n=Moses.Baseline [10], when I get to the step "training the translation system", I get the message "ERROR: train/model/extract.o.sorted.gz does not exist in ~/working/train/model" and the program exits with exit code 2. However, I do find a file named extract.sorted.gz in ~/working/train/model (slightly different: not o.sorted.gz, but sorted.gz).

$ls -l :
-rw-rw-r-- 1 hongbin hongbin 30674272 May 15 16:08 aligned.grow-diag-final-and
-rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.inv.sorted.gz
-rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.sorted.gz (but the size seems to be too small)
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.e2f
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.f2e
-rw-rw-r-- 1 hongbin hongbin 2 May 15 16:10 phrase-table.gz

Could you please give me any clue to fix this?

PS, I'm running this step by:

nohup nice ~/mosesdecoder/dist/training/train-model.perl -root-dir train -corpus ~/corpus/corpus-clean -f ch -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:$HOME/lm/corpus.blm.en:8 >& training.out &

(Any problem with this command?)

Thanks!
Hongbin

-- Hongbin MA(马洪宾) Department of Computer Science and Engineering, Shanghai Jiao Tong University.
Mobile: (86)188-1755-4825

_______________________________________________
Moses-support mailing list
[email protected] [11]
http://mailman.mit.edu/mailman/listinfo/moses-support [12]

Subject: [Moses-support] UPDATED: moses training error

Links:
------
[1] http://nlp.stanford.edu/projects/chinese-nlp.shtml
[2] mailto:[email protected]
[3] mailto:[email protected]
[4] mailto:[email protected]
[5] mailto:[email protected]
[6] mailto:[email protected]
[7] mailto:[email protected]
[8] mailto:[email protected]
[9] mailto:[email protected]
[10] http://www.statmt.org/moses/?n=Moses.Baseline
[11] mailto:[email protected]
[12] http://mailman.mit.edu/mailman/listinfo/moses-support
