What would happen if I just multiplied the Direct Phrase Translation
probability φ(e|f) by the Direct Lexical weight Lex(e|f)? That seems
like it would work? Sorry if I'm asking dumb questions. I come from the
computational side of computational linguistics. I'm learning as fast as
I can.
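[Editor's note: a minimal sketch of the ranking being asked about, with entirely made-up phrase pairs and feature values. It simply multiplies the direct phrase translation probability phi(e|f) by the direct lexical weight lex(e|f) and sorts by that product; whether the product has a principled interpretation is exactly what the thread below discusses.]

```python
# Hypothetical phrase-table entries (source, target, phi(e|f), lex(e|f)).
# All numbers here are invented for illustration.
phrase_pairs = [
    ("maison", "house",    0.80, 0.70),
    ("maison", "home",     0.15, 0.40),
    ("maison", "building", 0.05, 0.10),
]

# Rank by the product of the two direct features, highest first.
ranked = sorted(phrase_pairs, key=lambda p: p[2] * p[3], reverse=True)
for src, tgt, phi, lex in ranked:
    print(f"{src} -> {tgt}: {phi * lex:.3f}")
```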
--
Taylor Rose
Machine Translation Intern
Language Intelligence
IRC: Handle: trose
Server: freenode
On Tue, 2011-09-20 at 12:11 -0400, Burger, John D. wrote:
> Taylor Rose wrote:
>
> > So what exactly can I infer from the metrics in the phrase table? I want
> > to be able to compare phrases to each other. From my experience,
> > multiplying them and sorting by that number has given me more accurate
> > phrases... Obviously calling that metric "probability" is wrong. My
> > question is: What is that metric best indicative of?
>
> That product has no principled interpretation that I can think of. Phrase
> pairs with high values on all four features will obviously have high value
> products, but that's only interesting because all the features happen to be
> roughly monotonic in phrase quality. If you wanted a more principled way to
> rank the phrases, I'd just use the MERT weights for those features, and
> combine them with a dot product.
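[Editor's note: a sketch of the weighted combination suggested above, with invented MERT weights and feature values. The standard phrase table carries four translation features — phi(f|e), lex(f|e), phi(e|f), lex(e|f) — and a log-linear score is the dot product of their logs with the tuned weights, rather than an unweighted product.]

```python
import math

# Hypothetical MERT weights for the four phrase-table translation
# features, in the order phi(f|e), lex(f|e), phi(e|f), lex(e|f).
mert_weights = [0.20, 0.05, 0.25, 0.10]

def score(features, weights):
    """Log-linear score: sum_i w_i * log(h_i)."""
    return sum(w * math.log(h) for w, h in zip(weights, features))

# Invented feature values for two candidate translations of one source phrase.
pair_features = {
    ("maison", "house"): [0.70, 0.60, 0.80, 0.70],
    ("maison", "home"):  [0.20, 0.30, 0.15, 0.40],
}

# Rank candidates by weighted score, highest first.
ranked = sorted(pair_features,
                key=lambda p: score(pair_features[p], mert_weights),
                reverse=True)
```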
>
> Pre-filtering the phrase table is something lots of people have looked at,
> and there are many approaches to this. I like this paper:
>
> Improving Translation Quality by Discarding Most of the Phrasetable
> Johnson, John Howard; Martin, Joel; Foster, George; Kuhn, Roland
>
> http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=shwart&index=an&req=5763542
>
> - JB
>
> > On Tue, 2011-09-20 at 16:14 +0100, Miles Osborne wrote:
> >> exactly, the only correct way to get real probabilities out would be
> >> to exponentiate the dot products and renormalise by the normalising
> >> constant, computed over all candidate targets for each source phrase.
> >>
> >> remember that this is best thought of as a set of scores, weighted
> >> such that the relative proportions of each model are balanced
> >>
> >> Miles
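[Editor's note: a sketch of the renormalisation described above, with invented scores. Given log-linear dot-product scores for all candidate targets of one source phrase, exponentiating and dividing by the normalising constant Z yields a proper distribution (a softmax over the candidates).]

```python
import math

# Hypothetical dot-product scores for the candidate translations
# of a single source phrase.
scores = {"house": 1.2, "home": 0.3, "building": -0.8}

# Normalising constant Z, then renormalise into probabilities.
z = sum(math.exp(s) for s in scores.values())
probs = {e: math.exp(s) / z for e, s in scores.items()}

# The result is a real distribution: it sums to one.
assert abs(sum(probs.values()) - 1.0) < 1e-9
```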
> >>
> >> On 20 September 2011 16:07, Burger, John D. <[email protected]> wrote:
> >>> Taylor Rose wrote:
> >>>
> >>>> I am looking at pruning phrase tables for the experiment I'm working on.
> >>>> I'm not sure if it would be a good idea to include the 'penalty' metric
> >>>> when calculating probability. It is my understanding that multiplying 4
> >>>> or 5 of the metrics from the phrase table would result in a probability
> >>>> of the phrase being correct. Is this a good understanding or am I
> >>>> missing something?
> >>>
> >>> I don't think this is correct. At runtime all the features from the
> >>> phrase table and a number of other features, some only available during
> >>> decoding, are combined in an inner product with a weight vector to score
> >>> partial translations. I believe it's fair to say that at no point is
> >>> there an explicit modeling of "a probability of the phrase being
> >>> correct", at least not in isolation from the partially translated
> >>> sentence. This is not to say you couldn't model this yourself, of course.
> >>>
> >>> - John Burger
> >>> MITRE
> >>> _______________________________________________
> >>> Moses-support mailing list
> >>> [email protected]
> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>>
> >>>
> >>
> >>
> >>
> >
>