You've specified the target side in the phrase-table to have 2 factors:
[ttable-file]
0 0,1 ....
Your phrase-table probably only has 1.
As Nicola says, by default only the 0th factor is output. If you want to
output both factors to compare against the reference, put this in your ini
file too:
[output-factors]
0
1
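As a quick sanity check (a sketch of my own, not a Moses tool), you can count the `|`-separated factors on each token of your tuning files and compare them with what the ini declares:

```python
# Count the factors on each token of a Moses-formatted line,
# where the factors of a word are joined by '|'.
def factor_counts(line):
    """Return the set of distinct factor counts seen across the tokens."""
    return {token.count("|") + 1 for token in line.split()}

# A 2-factor reference line vs. a plain 1-factor source line:
ref = "we|PRP are|VBP responsible|JJ .|_PERIOD_"
src = "nosotros , los representantes ."
print(factor_counts(ref))  # {2}
print(factor_counts(src))  # {1}
```

If either file reports more than one count, or a count that disagrees with the ini, that is the mismatch to fix first.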
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Philipp Koehn
Sent: 06 June 2008 12:35
To: Amit
Cc: moses-support
Subject: Re: [Moses-support] Mert for factored models
Hi,
does anybody have an idea what is going on here?
-phi
On Thu, Jun 5, 2008 at 9:24 PM, Amit <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am sorry for bugging you again. I looked at the config file, made
> some changes, and nothing helped. To me the config file looks OK. It's
> almost identical to the one in the factored model tutorial on the SMT
> website. If it is possible and doesn't take much of your time, could
> you please have a look at the config file and suggest which part could
> be wrong? The system is dying before MERT.
>
> I am running the standard tuning script.
>
> perl /home/srini/MT/Translation/Moses/bin/moses-scripts/scripts-20071216-1351/training/mert-moses.pl \
>   $TUNING/input \
>   $TUNING/reference \
>   /home/srini/MT/Translation/Moses/bin/moses \
>   $MYDIR/model/moses.ini \
>   --working-dir $TUNING \
>   --rootdir /home/srini/MT/Translation/Moses/bin/moses-scripts/scripts-20071216-1351
>
> Error:
>
> Start loading LanguageModel
> /uusoc/scratch/rome/res/nlp/factored-model/factored-corpus/europarl.lm : [0.000] seconds
> Start loading LanguageModel
> /uusoc/scratch/rome/res/nlp/factored-model/factored-corpus/supertag.lm : [14.000] seconds
> Finished loading LanguageModels : [16.000] seconds
> Start loading PhraseTable
> /uusoc/scratch/rome/res/nlp/factored-model/tuning/filtered/phrase-table.0-0,1.1 : [16.000] seconds
> [ERROR] Malformed input at
> Expected input to have words composed of 2 factor(s) (form
> FAC1|FAC2|...) but instead received input with 1 factor(s).
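For context, the check behind this message is, in my reading (a reconstruction, not the actual Moses source), roughly: every token must carry exactly as many `|`-separated factors as the model declares. A minimal sketch:

```python
# Rough sketch of the factor check that produces the error above
# (my reconstruction, not the real Moses implementation).
def validate(line, expected_factors):
    for token in line.split():
        got = token.count("|") + 1
        if got != expected_factors:
            raise ValueError(
                "Expected input to have words composed of %d factor(s) "
                "but instead received input with %d factor(s)."
                % (expected_factors, got))

validate("we|PRP are|VBP", 2)      # passes: every token has 2 factors
try:
    validate("nosotros , los", 2)  # plain text carries only 1 factor
except ValueError as err:
    print(err)
```

So the error fires as soon as a 1-factor line meets a component that was told to expect 2 factors.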
>
> head -2 input
> sin embargo , señor presidente , también es realmente necesario que en
> biarritz se vaya un poco más lejos .
> nosotros , los representantes , tenemos al mismo tiempo el deber de
> estimular el progreso , a pesar de la adversidad , y de transmitir los
> mensajes que recibimos de la opinión pública en cada uno de nuestros
> países .
>
> head -2 reference
> what|WP i|NN would|MD also|RB call|VB for|IN ,|_COMMA_ however|RB ,|_COMMA_ is|VBZ to|TO look|VB beyond|IN immediate|JJ concerns|NNS in|IN biarritz|NN .|_PERIOD_
> we|PRP ,|_COMMA_ as|IN elected|VBN representatives|NNS ,|_COMMA_ are|VBP at|IN least|JJS as|IN responsible|JJ for|IN encouraging|VBG it|PRP to|TO make|VB progress|NN in|IN the|DT face|NN of|IN adversity|NN as|IN we|PRP are|VBP for|IN relaying|VBG the|DT messages|NNS that|IN we|PRP receive|VBP from|IN public|JJ opinion|NN in|IN each|DT of|IN our|PRP$ countries|NNS .|_PERIOD_
>
> Thanks,
> Amit
>
> Philipp Koehn wrote:
>>
>> Hi,
>>
>> there may be a few bugs in the current training script, so you should
>> check manually that the definition of factors in the configuration
>> file matches the data that you use as development set. In your case,
>> you say that the input uses only one factor. Check, if the
>> configuration file also just specifies one factor.
>>
>> -phi
>>
>> On Tue, Jun 3, 2008 at 3:18 PM, Amit <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> Hi Philipp,
>>>
>>> Sorry for bothering you. I am currently working with Srini at AT&T
>>> on SMT. I am trying to run MERT for factored models. I am doing it
>>> the same way as I did for unfactored models, but now I am getting
>>> an error saying the input is expected to have words composed of 2
>>> factors. However, in my model the source side has one factor and
>>> the target side has two factors, so the error doesn't make sense
>>> to me. Do I need to run MERT differently, or am I doing something
>>> stupid?
>>>
>>> Thanks,
>>> Amit
>>>
>>>
>>>
>
>
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
>
> # translation tables: source-factors, target-factors, number of scores, file
> [ttable-file]
> 0 0,1 5 /home/amitg/supertag-model/pos-model/model/phrase-table.0-0,1.gz
>
> # no generation models, no generation-file section
>
> # language models: type(srilm/irstlm), factors, order, file
> [lmodel-file]
> 0 0 5 /home/amitg/supertag-model/lm/english/europarl.lm
> 0 1 5 /home/amitg/supertag-model/pos-model/data/pos.lm
>
>
> # limit on how many phrase translations e for each phrase f are loaded
> # 0 = all elements loaded
> [ttable-limit]
> 20 0
>
> # distortion (reordering) weight
> [weight-d]
> 0.6
>
> # language model weights
> [weight-l]
> 0.2500
> 0.2500
>
>
> # translation model weights
> [weight-t]
> 0.2
> 0.2
> 0.2
> 0.2
> 0.2
>
> # no generation models, no weight-generation section
>
> # word penalty
> [weight-w]
> -1
>
> [distortion-limit]
> 6
>
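One way to cross-check a config like the one above against the data (a hypothetical helper of my own, not a Moses script) is to pull the factor declarations out of moses.ini and compare them with the factor counts in the tuning files:

```python
# Hypothetical helper: extract the value lines of one [section] of a moses.ini.
def read_section(ini_text, name):
    """Return the non-empty, non-comment lines under [name]."""
    lines, inside = [], False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            inside = (line == "[" + name + "]")
        elif inside and line and not line.startswith("#"):
            lines.append(line)
    return lines

# Illustrative fragment; the path is a placeholder, not the real one.
ini = """
[input-factors]
0

[ttable-file]
0 0,1 5 /path/to/phrase-table.0-0,1.gz
"""
print(read_section(ini, "input-factors"))              # ['0']
# Second field of each ttable-file entry lists the target factors:
print(read_section(ini, "ttable-file")[0].split()[1])  # '0,1'
```

With the declarations in hand, you can check that the input file carries exactly the [input-factors] count and the reference carries the full target factor set (here 0,1), which is precisely the mismatch this thread is about.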
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.