Thanks, Germán.

I know this is just a warning, but the problem is that the conversion 
process stops right after this warning.

Also, the size of the phrase table is 350MB, whereas the binary phrase table is only 170MB. Is that expected?
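
In case it helps, this is roughly how the two sizes can be compared. It is only a sketch: the function name compare_sizes is made up for illustration, and it assumes that every binary file written by processPhraseTable starts with the prefix given to -out and that nothing else in that directory shares the prefix.

```shell
# Hypothetical helper, not part of Moses.
# usage: compare_sizes TEXT_TABLE_GZ BINARY_PREFIX
compare_sizes() {
    # Uncompressed size of the text phrase table, in bytes.
    text_bytes=$(gzip -cd "$1" | wc -c | tr -d ' ')
    # Combined size of all files that start with the binary prefix.
    bin_bytes=$(cat "$2"* | wc -c | tr -d ' ')
    echo "text: $text_bytes bytes, binary: $bin_bytes bytes"
}
```

With the paths from my run it would be called as: compare_sizes work/20100914/model/phrase-table.gz work/20100914/binary/model/phrase-table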



 
> Date: Wed, 15 Sep 2010 08:44:40 +0200
> From: [email protected]
> To: [email protected]
> CC: [email protected]
> Subject: Re: [Moses-support] phrasetable-binary
> 
> Dear Musa,
> 
> As the message itself says, that is not an error, but a warning, and in 
> fact it is quite common to get that warning, as far as I know.
> 
> What that warning is actually saying is that there are source words in 
> your test set which do not appear to have a possible translation within 
> your phrase table, i.e. those source words in the test set will be 
> considered as unknown by the translation system. This problem is quite 
> common, especially if you are trying to translate a test set whose origin 
> is very different from the training data.
> 
> I hope that explains the message.
> 
> Best,
> 
> Germán Sanchis-Trilles
> 
> 
> 
> 
> On Wed, 15 Sep 2010, musa ghurab wrote:
> 
> > Hi,
> > I have a problem when converting phrase-table.gz to a hard disk binary image. I 
> > got the following error:
> >  
> > [m...@ibb]# gzip -cd work/20100914/model/phrase-table.gz | LC_ALL=C sort | 
> > nlp/moses/misc/processPhraseTable -ttable 0 0 - -nscores 5 -out
> > work/20100914/binary/model/phrase-table
> > processing ptree for stdin
> > ..................................................[phrase:500000]
> > ..........................distinct source phrases: 762319 distinct first 
> > words of source phrases: 11727 number of phrase pairs (line count): 3639432
> > WARNING: there are src voc entries with no phrase translation: count 1156
> > There exists phrase translations for 10571 entries
> >  
> >  
> > I checked the lines with the following command, and they seem to be OK.
> > 
> >  
> > [m...@ibb]# gzip -cd work/20100914/model/phrase-table.gz | sed -n 
> > '1150,1160p'
> >  
> >  
> > Then I removed 10 lines from 1150-1160, and the problem still exists:
> >  
> >  
> > [m...@ibb]# LC_ALL=C sort | nlp/moses/misc/processPhraseTable -ttable 0 0 - 
> > -nscores 5 -out work/20100914/binary/model/phrase-table <
> > work/20100914/model/phrase-table.cleaned
> > processing ptree for stdin
> > ..................................................[phrase:500000]
> > ..........................distinct source phrases: 762319 distinct first 
> > words of source phrases: 11727 number of phrase pairs (line count): 3639423
> > WARNING: there are src voc entries with no phrase translation: count 1156
> > There exists phrase translations for 10571 entries
> > 
> >
                                          
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
