Assuming your input, "由于时间因素至关重要", contains more than one word, it
looks like you have not word-segmented/tokenized your Chinese phrases.
If you trained on your corpus without segmentation, Moses/GIZA++ will treat
your Chinese source language as though each sentence were a single word. This
would cause the results you're getting. Try using the Stanford Segmenter,
http://nlp.stanford.edu/projects/chinese-nlp.shtml [1], to
segment/tokenize the Chinese half of your corpus, and then re-train.
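To see why an unsegmented corpus behaves this way, here is a minimal Python sketch. It assumes tokens are simply whitespace-separated, as in Moses corpus files; the segmented string is hand-split purely for illustration and is not the output of any particular segmenter.

```python
# Whitespace tokenization, the convention assumed here for corpus files.
def count_tokens(sentence: str) -> int:
    return len(sentence.split())

unsegmented = "由于时间因素至关重要"
# Hand-segmented for illustration only (hypothetical segmentation).
segmented = "由于 时间 因素 至关 重要"

print(count_tokens(unsegmented))  # 1
print(count_tokens(segmented))    # 5
```

With no spaces in the line, the whole sentence is one "word", so it can only match a phrase-table entry that is the entire sentence.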

If you have segmented the Chinese half, is it possible that you
accidentally trained your model with English as the source language and
Chinese as the target?
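One quick way to check the direction is to look at which side of the phrase table holds the Chinese text. The sketch below assumes the usual "source ||| target ||| scores" layout of a Moses phrase table; the sample lines and scores are made up for illustration, and the helper names are hypothetical.

```python
def has_cjk(text: str) -> bool:
    # True if any character falls in the CJK Unified Ideographs block.
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)

def source_is_chinese(phrase_table_line: str) -> bool:
    # Fields are separated by "|||"; the first field is the source phrase.
    fields = [f.strip() for f in phrase_table_line.split("|||")]
    return has_cjk(fields[0]) and not has_cjk(fields[1])

# Made-up sample entries (scores are placeholders, not real model output).
zh_to_en = "我 ||| I ||| 0.5 0.5 0.5 0.5 2.718"
en_to_zh = "I ||| 我 ||| 0.5 0.5 0.5 0.5 2.718"

print(source_is_chinese(zh_to_en))  # True
print(source_is_chinese(en_to_zh))  # False
```

On a real model you could feed it the first few lines read with gzip.open(path, "rt", encoding="utf-8") from phrase-table.gz.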

On Wed, 16 May 2012 18:49:25 +0800, 马洪宾 wrote:

Is it because my training corpus is too small? For performance reasons I
only use 90000 sentences.

I picked entries from the phrase table to test with, such as "我" and
"我国政府". When I try "我", it is translated to "I", but "我国政府" still
cannot be translated (even though it is in the phrase table).

Is this normal?

Thanks!

---------- Forwarded message ----------
From: 马洪宾
Date: Wed, May 16, 2012 at 6:22 PM
Subject: Re: RE: RE: [Moses-support] UPDATED: moses training error
To: [email protected] [3]

Hey, guys,

I believe my previous problem was caused by some noise in my corpus.
I've fixed that now.

I've now got through the training process (no tuning yet), and I have a
moses.ini in my train/model/ directory anyway.

I used this moses.ini to run a test (according to the official tutorial,
this should work):


hongbin@ubuntu:~/working1/train/model$ echo "由于时间因素至关重要" | ~/mosesdecoder/dist/bin/moses -f moses.ini
Defined parameters (per moses.ini or switch):
 config: moses.ini
 distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 /home/hongbin/working1/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
 distortion-limit: 6
 input-factors: 0
 lmodel-file: 8 0 3 /home/hongbin/lm/corpus.blm.en
 mapping: 0 T 0
 ttable-file: 0 0 0 5 /home/hongbin/working1/train/model/phrase-table.gz
 ttable-limit: 20
 weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
 weight-l: 0.5000
 weight-t: 0.20 0.20 0.20 0.20 0.20
 weight-w: -1
Loading lexical distortion models...have 1 models
Creating lexical reordering...
weights: 0.300 0.300 0.300 0.300 0.300 0.300
Loading table into memory...done.
Start loading LanguageModel /home/hongbin/lm/corpus.blm.en : [72.000] seconds
Finished loading LanguageModels : [73.000] seconds
Start loading PhraseTable /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
filePath: /home/hongbin/working1/train/model/phrase-table.gz
Finished loading phrase tables : [73.000] seconds
Start loading phrase table from /home/hongbin/working1/train/model/phrase-table.gz : [73.000] seconds
Reading /home/hongbin/working1/train/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Finished loading phrase tables : [108.000] seconds
IO from STDOUT/STDIN
Created input-output object : [108.000] seconds
Translating line 0 in thread id 140004348253952
Translating: 由于时间因素至关重要

Collecting options took 0.000 seconds
Search took 0.000 seconds
由于时间因素至关重要
BEST TRANSLATION: 由于时间因素至关重要|UNK|UNK|UNK [1] [total=-104.508]  0-0
Translation took 0.000 seconds
Finished translating

It seems that it has not even tried to translate from Chinese to English!
What's wrong here? I checked the phrase table and the language model
file, and they seem to be normal.

Could you please help me with this?

Hongbin 

On Wed, May 16, 2012 at 1:13 PM, lixianhua wrote:

There's a clean-corpus-n.perl script in Moses. Find it and clean your
corpus like this:

./clean-corpus-n.perl corpus l1 l2 clean-corpus 1 100
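As a rough picture of what this kind of cleaning does, the sketch below keeps only sentence pairs whose token counts fall between a minimum and a maximum on both sides (1 and 100 in the command above). It is a simplified Python illustration, not a reimplementation of clean-corpus-n.perl, which may apply further checks; the function name and sample pairs are hypothetical.

```python
def keep_pair(src: str, tgt: str, min_len: int = 1, max_len: int = 100) -> bool:
    # Count whitespace-separated tokens on each side.
    n_src, n_tgt = len(src.split()), len(tgt.split())
    return min_len <= n_src <= max_len and min_len <= n_tgt <= max_len

# Made-up sentence pairs for illustration.
pairs = [
    ("我国 政府", "our government"),   # kept
    ("", "an empty source line"),     # dropped: source has 0 tokens
    ("我", "word " * 101),             # dropped: target has 101 tokens
]
cleaned = [(s, t) for s, t in pairs if keep_pair(s, t)]
print(len(cleaned))  # 1
```

Empty lines matter in particular because a blank sentence on either side leaves the aligner with nothing to align for that pair.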


From: 马洪宾 [mailto:[email protected] [5]]
Sent: May 16, 2012 13:09
To: lixianhua
Subject: Re: RE: [Moses-support] UPDATED: moses training error

I think you're right. Do you have a batch script to run the cleaning?

On Wed, May 16, 2012 at 12:10 PM, lixianhua wrote:

There must be something wrong with your extract process.

I suggest cleaning your corpus, as well as deleting the | [ ] characters
from it.

Then run the train script.
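A minimal Python sketch of that character cleanup, assuming the goal is simply to drop "|", "[" and "]" from every line before training (Moses uses "|" as its factor separator, so a literal one in the text can confuse it). The function name is hypothetical, and whether to delete or replace these characters is a choice for your own data.

```python
# Characters the suggestion above asks to remove from the corpus.
_STRIP_TABLE = str.maketrans("", "", "|[]")

def strip_special(line: str) -> str:
    # Delete the characters, then collapse any doubled spaces left behind.
    return " ".join(line.translate(_STRIP_TABLE).split())

print(strip_special("see [ 1 ] for a | b"))  # see 1 for a b
```

Apply it to both language files line by line so the two sides stay parallel.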

From: [email protected] [7]
[mailto:[email protected] [8]] On Behalf Of 马洪宾
Sent: May 16, 2012 11:28
To: [email protected] [9]

-- 
Hongbin MA(马洪宾)

Department of Computer Science and Engineering,
Shanghai Jiao Tong University.
Mobile: (86)188-1755-4825

Hi,

I'm trying out a Chinese-English baseline system using the latest Moses.

I'm running it on a 64-bit Ubuntu server.

Although I followed the tutorial
http://www.statmt.org/moses/?n=Moses.Baseline [10] strictly, when I get to
the step "training the translation system", I get the message

"ERROR: train/model/extract.o.sorted.gz does not exist in
~/working/train/model" and the program exits with exit code 2.

However, I do find a file named extract.sorted.gz in
~/working/train/model (slightly different: sorted.gz rather than
o.sorted.gz).

$ ls -l

-rw-rw-r-- 1 hongbin hongbin 30674272 May 15 16:08 aligned.grow-diag-final-and
-rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.inv.sorted.gz
-rw-rw-r-- 1 hongbin hongbin 20 May 15 16:10 extract.sorted.gz (but the size seems to be too small)
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.e2f
-rw-rw-r-- 1 hongbin hongbin 61246318 May 15 16:10 lex.f2e
-rw-rw-r-- 1 hongbin hongbin 2 May 15 16:10 phrase-table.gz

Could you please give me any clue on how to fix this?
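The 20-byte size is itself a hint: that is the size of a gzip stream holding no data at all, so extract.sorted.gz is most likely an empty archive rather than a truncated one. A small Python check, using only the standard library (the helper name is hypothetical):

```python
import gzip

def gzip_payload_size(data: bytes) -> int:
    # Number of bytes obtained after decompressing a gzip stream.
    return len(gzip.decompress(data))

# A gzip stream built from empty input: header and trailer, no payload.
empty_archive = gzip.compress(b"")
print(len(empty_archive))                # 20
print(gzip_payload_size(empty_archive))  # 0
```

On the real file, gzip.open(path, "rb").read() returning b"" would confirm that the phrase extraction step wrote nothing.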

PS,

I'm running this step with:

nohup nice ~/mosesdecoder/dist/training/train-model.perl \
  -root-dir train -corpus ~/corpus/corpus-clean -f ch -e en \
  -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
  -lm 0:3:$HOME/lm/corpus.blm.en:8 >& training.out &

(Is there any problem with this command?)

Thanks!  

Hongbin  

-- 
Hongbin MA(马洪宾)

Department of Computer Science and Engineering,
Shanghai Jiao Tong University.
Mobile: (86)188-1755-4825


_______________________________________________
Moses-support mailing list
[email protected] [11]
http://mailman.mit.edu/mailman/listinfo/moses-support [12]


Subject: [Moses-support] UPDATED: moses training error


                   

Links:
------
[1] http://nlp.stanford.edu/projects/chinese-nlp.shtml
[2] mailto:[email protected]
[3] mailto:[email protected]
[4] mailto:[email protected]
[5] mailto:[email protected]
[6] mailto:[email protected]
[7] mailto:[email protected]
[8] mailto:[email protected]
[9] mailto:[email protected]
[10] http://www.statmt.org/moses/?n=Moses.Baseline
[11] mailto:[email protected]
[12] http://mailman.mit.edu/mailman/listinfo/moses-support