On Wednesday 06 July 2011 03:33, Hwidong Na wrote:
> Hi Barry,
>
> Thank you for pointing out the location of the logic. I understand that
> we can safely discard the worse hypothesis in the 1-best case.
>
> However, if we seek n-best translations, it would be problematic to
> recombine different translations and make the score of the worse
> hypothesis the same as that of the better one. For example, as in the
> following slide (p.24):

But if we're seeking nbest translations, we keep the worse hypothesis in the 
ArcList of the better hypothesis. We don't change its score.

>
> http://www.inf.ed.ac.uk/teaching/courses/mt/lectures/phrase-decoding.pdf
>
> "did not give" from "Joe" is the worse hypothesis than "did not give"
> from "Mary". After hypothesis recombination, the path "Joe did not give"
> can survive even though we have a better hypothesis that covers 3
> foreign words. What if it is worse than "Mary did not give" and pruned
> out for this reason?

In the slide there are three hypotheses with the following probabilities:

[John] [did not give] (0.001564)
[Mary] [did not give] (0.049128)
[Mary] [did not] [give] (0.008056992)

All these hypotheses will be in the same stack, and assuming that the LM order 
is <= 4, they can be recombined, keeping the second one. If we're doing 
nbest, then the second hypothesis will retain pointers to the other ones in 
its ArcList, and they will retain their original scores. The nbest logic is 
in Manager and TrellisPath.
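
To make the recombination step concrete, here is a minimal Python sketch. It
is not the Moses code (that is C++). The names Hypothesis, recombine,
keep_arcs and make_hyps, and the coverage indices, are hypothetical and chosen
only for illustration. The three scores and the idea of an arc list are the
only parts taken from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    words: tuple          # target words produced so far
    coverage: frozenset   # indices of covered source words (made up here)
    score: float          # probability of this partial translation
    arc_list: list = field(default_factory=list)  # recombined, worse hypotheses

    def state(self, lm_order):
        # Two hypotheses behave identically in future expansions if they cover
        # the same source words and end in the same (lm_order - 1) target words.
        context = self.words[-(lm_order - 1):] if lm_order > 1 else ()
        return (self.coverage, context)


def recombine(hypotheses, lm_order, keep_arcs):
    """Keep the best hypothesis per state; optionally remember the losers."""
    best = {}
    for hyp in hypotheses:
        key = hyp.state(lm_order)
        winner = best.get(key)
        if winner is None:
            best[key] = hyp
            continue
        if hyp.score > winner.score:
            winner, hyp = hyp, winner
            best[key] = winner
        if keep_arcs:
            # The loser keeps its own score; it is only attached to the winner.
            winner.arc_list.append(hyp)
            winner.arc_list.extend(hyp.arc_list)
            hyp.arc_list = []
    return list(best.values())


def make_hyps():
    covered = frozenset({0, 1, 2})  # assumption: all three cover the same words
    return [
        Hypothesis(("John", "did", "not", "give"), covered, 0.001564),
        Hypothesis(("Mary", "did", "not", "give"), covered, 0.049128),
        Hypothesis(("Mary", "did", "not", "give"), covered, 0.008056992),
    ]


survivors = recombine(make_hyps(), lm_order=4, keep_arcs=True)
for hyp in survivors:
    print(hyp.words, hyp.score, [arc.score for arc in hyp.arc_list])
```

With lm_order=4 the three hypotheses share the context "did not give", so one
survives and the other two sit in its arc list with their scores unchanged. In
this sketch, lm_order=5 would keep the John and Mary hypotheses apart, because
their four-word contexts differ.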

I should emphasise that there is a difference between recombination 
(discarding hypotheses when we know it is safe to do so) and pruning 
(discarding hypotheses when the stacks get too big). The former does not 
create search errors, but the latter does. (slide 21)
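
A toy Python sketch of that difference follows. Everything in it is
hypothetical: the states "A" and "B", the scores, and the future table, which
stands in for the best score obtainable when completing a hypothesis from a
given state. It is an illustration of the principle, not of how Moses stores
its stacks.

```python
def recombine_states(stack):
    """Recombination: keep only the best-scoring hypothesis for each state."""
    best = {}
    for state, score in stack:
        if state not in best or score > best[state]:
            best[state] = score
    return list(best.items())


def prune(stack, max_size):
    """Pruning: keep the max_size best-scoring hypotheses, whatever their state."""
    return sorted(stack, key=lambda h: h[1], reverse=True)[:max_size]


def best_completion(stack, future):
    """Best final score reachable from any hypothesis left in the stack."""
    return max(score * future[state] for state, score in stack)


# (state, score) pairs; the completion score depends only on the state.
stack = [("A", 0.5), ("A", 0.3), ("B", 0.4)]
future = {"A": 0.1, "B": 0.9}

full = best_completion(stack, future)
after_recombination = best_completion(recombine_states(stack), future)
after_pruning = best_completion(prune(recombine_states(stack), max_size=1), future)

print(full, after_recombination, after_pruning)
```

Recombination drops ("A", 0.3), which can never beat ("A", 0.5) because both
continue identically, so the best reachable score is unchanged. Pruning to a
single entry keeps "A" and loses "B", whose completion would have been better:
that is a search error.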

>
> I'm not sure it would be problematic to recombine identical translations
> if we seek n-best translations. Does it lead to too many identical
> translations in the n-best list?
>

Well you do get identical translations in the nbest list (unless you ask for 
distinct ones) but I don't know what proportion of duplicates is normal. I 
guess it depends very much on your phrase table.
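
If you want to measure the proportion for your own setup, a small Python
sketch like the following would do. It assumes the nbest file has lines of the
form "id ||| translation ||| ...", and the function name duplicate_ratio and
the sample lines are made up for illustration.

```python
from collections import defaultdict


def duplicate_ratio(lines):
    """Fraction of nbest entries that repeat a translation of the same sentence."""
    seen = defaultdict(set)   # sentence id -> distinct translations
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split("|||")]
        sent_id, translation = fields[0], fields[1]
        seen[sent_id].add(translation)
        total += 1
    distinct = sum(len(translations) for translations in seen.values())
    return (total - distinct) / total if total else 0.0


# Made-up sample in the assumed layout; the scores are placeholders.
sample = [
    "0 ||| Mary did not give ||| -4.2",
    "0 ||| Mary did not give ||| -5.1",
    "0 ||| John did not give ||| -6.0",
    "1 ||| hello ||| -1.0",
]
ratio = duplicate_ratio(sample)
print(ratio)
```

To run it on a real file, pass the open file object instead of the sample
list, for example duplicate_ratio(open(path, encoding="utf-8")) with your own
path.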

Hope that helps - best regards - Barry

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
