Hmm, I'm pretty sure the
[output-factors]
0
1
works. Put it in the ini file, rather than the command line, to avoid
confusion.
I tried it on a test example and got the following as output:
i|PRO am|VB buying|VB you|PRO a|ART aardvark|NN
And the following in the n-best file:
0 ||| i|PRO am|VB buying|VB you|PRO a|ART aardvark|NN ...
0 ||| i|PRO am|VB buying|VB you|PRO aardvark|NN a|ART ...
I'm not sure what the report-all-factors arg does; it's a bit confusing to
have that in the function.
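
For post-processing, the factored output and n-best lines shown above can be split back into (surface, factor) pairs. The following is only a rough sketch, assuming "|" separates factors, whitespace separates tokens, and " ||| " separates n-best fields; the function names are made up for illustration and are not part of Moses.

```python
def parse_factored_sentence(line, factor_sep="|"):
    """Split a factored sentence such as 'a|ART aardvark|NN' into
    one tuple of factors per token."""
    return [tuple(token.split(factor_sep)) for token in line.split()]


def parse_nbest_line(line, field_sep=" ||| "):
    """Return (sentence_id, tokens) from an n-best line of the assumed
    form 'id ||| hypothesis ||| remaining fields'.
    Fields after the hypothesis are ignored here."""
    fields = line.split(field_sep)
    sentence_id = int(fields[0].strip())
    tokens = parse_factored_sentence(fields[1])
    return sentence_id, tokens


def surface_only(tokens):
    """Keep only factor 0 (the surface form) of each token."""
    return " ".join(factors[0] for factors in tokens)
```

For example, parse_factored_sentence("you|PRO a|ART aardvark|NN") yields one (word, tag) tuple per token, and surface_only() on that result recovers the plain surface string.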
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Sara Stymne
Sent: 05 March 2008 12:25
To: [email protected]
Subject: Re: [Moses-support] Factored decoding output
Hi!
I also tried to do this, and it seems that the output-factors flag does not
work.
However, I found another option, the flag -report-all-factors. If it is
used, all possible output factors are given.
It does not work for n-best lists, however. I looked around in the code, and
in the function IOStream::OutputNBestList, the following call is made for
each hypothesis:
OutputSurface(*m_nBestStream, edge.GetCurrTargetPhrase(),
m_outputFactorOrder, false); // false for not reporting all factors
As the source code comment says, the decision has already been made not to
report all factors in n-best lists. What is the reason for this? As I see it,
it could be quite useful, for instance for post-processing.
It is quite straightforward to change this behavior by passing on the
boolean reportAllFactors instead of always passing false. That would give
users the choice of having factors in the n-best list as well.
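
To make the proposal concrete, here is a small toy model in Python. It is not the Moses code: output_surface and output_nbest are hypothetical stand-ins, and the exact way the flag interacts with the output factor order is my assumption. It only illustrates the difference between hard-coding false and forwarding the caller's flag.

```python
def output_surface(words, output_factor_order, report_all_factors):
    """Toy stand-in for the decoder's output routine (NOT Moses code).
    `words` is a list of factor tuples, e.g. [("a", "ART"), ...].
    Assumed behavior: with report_all_factors, every factor is printed;
    otherwise only the factors listed in output_factor_order."""
    rendered = []
    for factors in words:
        if report_all_factors:
            chosen = list(factors)
        else:
            chosen = [factors[i] for i in output_factor_order]
        rendered.append("|".join(chosen))
    return " ".join(rendered)


def output_nbest(hypotheses, output_factor_order, report_all_factors):
    """Proposed variant: forward the flag instead of always passing False."""
    return [
        output_surface(hyp, output_factor_order, report_all_factors)
        for hyp in hypotheses
    ]
```

In this model, an output factor order of [0] with the flag off gives surface forms only, while either listing both factors or turning the flag on gives the joint tokens.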
/Sara
Hieu Hoang skrev:
> hi vivek
>
> by default, the output factor is only 0. u can change that by putting
> this in your ini file
> [output-factors]
> 0
> 1
> I don't think anyone's ever tested it so there may be some bugs in there.
> Please tell me if there is & I'll fix it. If u manage to fix it
> yourself, please send me the patch.
> ________________________________
>
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]
> On Behalf Of Vivek Rangarajan
> Sent: 28 February 2008 03:12
> To: [email protected]
> Subject: [Moses-support] Factored decoding output
>
>
> Hello all,
> I am trying to perform factored decoding and would like
> my final hypotheses to be a joint token (surface form+factor). The
> following is my moses.ini file:
>
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
>
> # translation tables: source-factors, target-factors, number of scores, file
> [ttable-file]
> 0 0,1 5 /home/ENRICHED_TRANSLATION/Moses/words+accent/model/phrase-table.0-0,1.gz
>
> # no generation models, no generation-file section
>
> # language models: type(srilm/irstlm), factors, order, file
> [lmodel-file]
> 0 0 3 /home/ENRICHED_TRANSLATION/Moses/words+accent/lm/train.en.lm
> 0 1 6 /home/ENRICHED_TRANSLATION/Moses/words+accent/lm/train.accent.lm
>
> AND SO ON...
>
> The phrase translation table does contain 0-0,1 pairs. However, when I
> decode using
>   ~/moses/bin/moses -config ./model/moses.ini -input-file test.file > XX
> the XX output contains only the surface form; it doesn't contain the
> surface form+factor. The BEST TRANSLATION that the decoder produces,
> however, consists of the surface form+factor. It is pretty
> straightforward to grep for the output from the BEST TRANSLATION.
> However, I would like to understand why the output file dumped does
> not contain the joint token. Any pointers?
>
> Best
> Vivek
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support