Remember that features also have weights (in addition to their values). For
your constant feature of 1, the associated multiplier (set by MERT) should
ensure that Moses makes the correct decision w.r.t. unknown words, etc.
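
To make the arithmetic concrete, here is a minimal sketch of a weighted
feature sum applied to the second lattice from the quoted message. It is a toy
model, not Moses code: the feature names "lattice" and "unknown_word", the
helper functions, and the weight values are all assumptions chosen for
illustration, and LM costs are ignored.

```python
# Toy log-linear scorer; names and weights are illustrative, not Moses internals.
FLOOR = -100.0  # the log-score quoted in the thread for p=0.0


def score(features, weights):
    """Weighted sum of feature values (higher is better)."""
    return sum(weights[name] * value for name, value in features.items())


# The two competing edges of ((('A',0.0,1),('UW',1.0,1),),), LM costs ignored.
EDGES = {
    "A": {"lattice": FLOOR, "unknown_word": 0.0},   # known word, p=0.0
    "UW": {"lattice": 0.0, "unknown_word": FLOOR},  # unknown word, p=1.0
}


def best(weights):
    """Return the label of the highest-scoring edge under the given weights."""
    return max(EDGES, key=lambda label: score(EDGES[label], weights))


# Unknown-word weight fixed at 1 and weight-i below 1: the known word wins.
print(best({"lattice": 0.5, "unknown_word": 1.0}))
# If that multiplier were tunable (the proposed weight-u), tuning could flip it.
print(best({"lattice": 0.5, "unknown_word": 0.2}))
```

With the unknown-word multiplier fixed at 1, the unknown edge can only win
when weight-i exceeds 1, which matches the behaviour described below.
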
Miles
2008/8/12 Jean-Baptiste Fouet <[EMAIL PROTECTED]>
> Hi all, I am trying to use lattice decoding (with input-type=2, i.e. a
> real word lattice, not a confusion network) and I think I have identified
> some issues with the way unknown words are dealt with:
>
> If I understand correctly, the weight applied to the penalty
> on a given edge of the lattice is weight-i, which is
> implemented as an additional translation table weight. But unknown
> words have ALL translation weights set to 0, even that one, so the edge
> cost is discarded for an unknown word (i.e. a translation generated by
> ProcessOneUnknownWord).
> This means that if presented with 2 edges carrying 2 different unknown
> words, the decoder will pick the first edge, not the one with the smallest
> cost. Is that correct?
> I tested that with the resources in
> mosesdecoder/regression-testing/tests/lattice-distortion and the lattice:
>
> ((('UW1',0.0,1),('UW2',1.0,1),),)
> The result is UW1, which is not correct (p=0.0 means a cost of -100, p=1.0
> means a cost of 0, so the edge with the smallest cost should be UW2).
>
> A second problem is that the feature weight associated with the unknown
> word penalty can't be modified (it is always 1), so a not-found word always
> has a score penalty of -100 in addition to the LM cost.
> This means that an edge with probability 0 (i.e. cost -100*weight-i + LM
> costs)
> labeled with a known word will always be preferred to an edge with
> probability 1 (i.e. edge cost 0) labeled with a not-found word
> (cost -100*1 = -100 + LM costs), unless weight-i is bigger than 1.
>
> Is that correct?
> Tested with:
> ((('A',0.0,1),('UW',1.0,1),),)
> The result is 1 (the translation of A),
> which is not what I want.
>
>
> If I am right, I plan to correct it by:
>
> 1) applying the edge cost even for a not-found word (nfw)
> 2) introducing a weight-u option to tune the not-found-word feature weight
>
> Does that seem OK? Am I overlooking something?
>
> Thanks
> JB Fouet
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.