Is it because my training corpus is too small? For performance reasons I only use 90,000 sentences.
I picked some entries from the phrase table to check, such as "我" and "我国政府". When I try "我", it is translated to "I", but "我国政府" still cannot be translated, even though it is in the phrase table. Is this normal?

Thanks!

---------- Forwarded message ----------
From: 马洪宾 <[email protected]>
Date: Wed, May 16, 2012 at 6:22 PM
Subject: Re: Re: Re: [Moses-support] UPDATED: moses training error
To: [email protected]

Hey, guys,

I believe my previous problem was caused by some noise in my corpus, and I have fixed it now.
Training now completes (no tuning yet), and I have a moses.ini in my train/model/ directory.
I used this moses.ini to run a test (according to the official tutorial, this should work):

hongbin@ubuntu:~/working1/train/model$ echo "由于时间因素至关重要"|~/mosesdecoder/dist/bin/moses -f moses.ini
Defined parameters (per moses.ini or switch):
    config: moses.ini
    distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 /home/hongbin/working1/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
    distortion-limit: 6
    input-factors: 0
    lmodel-file: 8 0 3 /home/hongbin/lm/corpus.blm.en
    mapping: 0 T 0
    ttable-file: 0 0 0 5 /home/hongbin/working1/train/model/phrase-table.gz
    ttable-limit: 20
    weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
    weight-l: 0.5000
    weight-t: 0.20 0.20 0.20 0.20 0.20
    weight-w: -1
Loading lexical distortion models...have 1 models
Creating lexical reordering...
weights: 0.300 0.300 0.300 0.300 0.300 0.300
Loading table into memory...done.
Start loading LanguageModel /home/hongbin/lm/corpus.blm.en : [72.000] seconds
Finished loading LanguageModels : [73.000] seconds
Start loading PhraseTable /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
filePath: /home/hongbin/working1/train/model/phrase-table.gz
Finished loading phrase tables : [73.000] seconds
Start loading phrase table from /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
Reading /home/hongbin/working1/train/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [108.000] seconds
IO from STDOUT/STDIN
Created input-output object : [108.000] seconds
Translating line 0 in thread id 140004348253952
Translating: 由于时间因素至关重要

Collecting options took 0.000 seconds
Search took 0.000 seconds
由于时间因素至关重要
BEST TRANSLATION: 由于时间因素至关重要|UNK|UNK|UNK [1] [total=-104.508] <<0.000, -1.000, -100.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -11.015, 0.000, 0.000, 0.000, 0.000, 0.000>> 0-0
Translation took 0.000 seconds
Finished translating

It seems that it has not even tried to translate from Chinese to English! What is wrong here? I checked the phrase table and the language model file, and they seem normal.
Could you please help me with this?
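In the log above, the whole input sentence comes back as a single UNK token. The decoder splits its input on whitespace, so one thing worth checking is whether each whitespace-separated token of the input appears on the source side of the phrase table. The following is only a rough sketch of such a check, assuming the usual `source ||| target ||| scores` line layout. The function names and the sample entries are made up for illustration and are not taken from the actual phrase-table.gz.

```python
import gzip
from typing import Iterable, List, Set


def source_phrases(lines: Iterable[str]) -> Set[str]:
    """Collect the source side of each 'source ||| target ||| scores' line."""
    phrases = set()
    for line in lines:
        fields = line.split("|||")
        if len(fields) >= 2:
            # Normalise internal whitespace so lookups are consistent.
            phrases.add(" ".join(fields[0].split()))
    return phrases


def unknown_tokens(sentence: str, phrases: Set[str]) -> List[str]:
    """Return whitespace-separated tokens that have no single-token source entry."""
    return [tok for tok in sentence.split() if tok not in phrases]


def load_phrase_table(path: str) -> Set[str]:
    """Read a gzipped phrase table; this keeps every source phrase in memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return source_phrases(handle)


# Hypothetical sample entries; the scores are placeholders, not real model output.
sample = [
    "我 ||| I ||| 0.5 0.5 0.5 0.5 2.718",
    "我国 ||| our country ||| 0.5 0.5 0.5 0.5 2.718",
    "政府 ||| government ||| 0.5 0.5 0.5 0.5 2.718",
    "我国 政府 ||| our government ||| 0.5 0.5 0.5 0.5 2.718",
]
table = source_phrases(sample)

print(unknown_tokens("我国政府", table))   # unsegmented input is one token
print(unknown_tokens("我国 政府", table))  # segmented input is two tokens
```

With these made-up entries, the unsegmented string is reported as unknown while the segmented one is not. Whether this matches the real situation depends on how the training corpus was segmented, which the messages here do not say.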
Hongbin

On Wed, May 16, 2012 at 1:13 PM, lixianhua <[email protected]> wrote:

> There's a clean-corpus-n.perl in Moses; find it and clean your corpus like this:
>
> ./clean-corpus-n.perl corpus l1 l2 clean-corpus 1 100
>
> From: 马洪宾 [mailto:[email protected]]
> Sent: May 16, 2012 13:09
> To: lixianhua
> Subject: Re: Re: [Moses-support] UPDATED: moses training error
>
> I think you're right. Do you have a script to run the cleaning?
>
> On Wed, May 16, 2012 at 12:10 PM, lixianhua <[email protected]> wrote:
>
> There must be something wrong with your extract process.
> I suggest cleaning your corpus, as well as deleting the | [ ] characters in your corpus.
> Then run the train script.
>
> From: [email protected] [mailto:[email protected]] On behalf of 马洪宾
> Sent: May 16, 2012 11:28
> To: [email protected]
> Subject: [Moses-support] UPDATED: moses training error
>
> Hi,
>
> I'm trying out a Chinese-English baseline system using the latest Moses.
> I'm running it on a 64-bit Ubuntu server.
> Although I strictly followed the tutorial http://www.statmt.org/moses/?n=Moses.Baseline, when I reach the step "training the translation system", I get the message
> "ERROR: train/model/extract.o.sorted.gz does not exist in ~/working/train/model" and the program exits with exit code 2.
>
> However, I do find a file named extract.sorted.gz in ~/working/train/model (slightly different: sorted.gz, not o.sorted.gz).
>
> $ ls -l
> -rw-rw-r-- 1 hongbin hongbin 30674272 May 15 16:08 aligned.grow-diag-final-and
> -rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.inv.sorted.gz
> -rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.sorted.gz (but the size seems to be too small)
> -rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.e2f
> -rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.f2e
> -rw-rw-r-- 1 hongbin hongbin 2 May 15 16:10 phrase-table.gz
>
> Could you please give me any clue to fix this?
>
> PS,
> I'm running this step with:
>
> nohup nice ~/mosesdecoder/dist/training/train-model.perl -root-dir train -corpus ~/corpus/corpus-clean -f ch -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:$HOME/lm/corpus.blm.en:8 >& training.out &
>
> (Any problem with this command?)
>
> Thanks!
> Hongbin
>
> --
> Hongbin MA(马洪宾)
> Department of Computer Science and Engineering,
> Shanghai Jiao Tong University.
> Mobile: (86)188-1755-4825
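The cleaning advice quoted above (delete the | [ ] characters and drop sentence pairs whose length falls outside 1 to 100 tokens) can be illustrated roughly as follows. This is only a sketch of the idea and not a reimplementation of clean-corpus-n.perl, which does more than this. The function name `clean_pairs` and the sample sentences are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Characters the reply suggests deleting from the corpus.
RESERVED = str.maketrans("", "", "|[]")


def clean_pairs(
    src: Iterable[str],
    tgt: Iterable[str],
    min_len: int = 1,
    max_len: int = 100,
) -> Iterator[Tuple[str, str]]:
    """Yield sentence pairs with | [ ] removed and both sides within the length limits."""
    for s, t in zip(src, tgt):
        s_tok = s.translate(RESERVED).split()
        t_tok = t.translate(RESERVED).split()
        # Keep the pair only if both sides have an acceptable token count.
        if min_len <= len(s_tok) <= max_len and min_len <= len(t_tok) <= max_len:
            yield " ".join(s_tok), " ".join(t_tok)


# Hypothetical sample data, already word-segmented on the Chinese side.
src_lines = ["我国 政府 [ 注 ]", "", "你 好 | 吗"]
tgt_lines = ["our government [ note ]", "empty source side", "how are | you"]

for pair in clean_pairs(src_lines, tgt_lines):
    print(pair)
```

The second pair is dropped because its source side is empty, which is the kind of noise that can leave the extract files nearly empty.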
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
