What would happen if I just multiplied the direct phrase translation
probability φ(e|f) by the direct lexical weight Lex(e|f)? It seems like
that would work. Sorry if I'm asking dumb questions. I come from the
computational side of computational linguistics, and I'm learning as
fast as I can.
-- 
Taylor Rose
Machine Translation Intern
Language Intelligence
IRC: Handle: trose
     Server: freenode


On Tue, 2011-09-20 at 12:11 -0400, Burger, John D. wrote:
> Taylor Rose wrote:
> 
> > So what exactly can I infer from the metrics in the phrase table? I want
> > to be able to compare phrases to each other. From my experience,
> > multiplying them and sorting by that number has given me more accurate
> > phrases... Obviously calling that metric "probability" is wrong. My
> > question is: What is that metric best indicative of?
> 
> That product has no principled interpretation that I can think of.  Phrase 
> pairs with high values on all four features will obviously have high value 
> products, but that's only interesting because all the features happen to be 
> roughly monotonic in phrase quality.  If you wanted a more principled way to 
> rank the phrases, I'd just use the MERT weights for those features, and 
> combine them with a dot product.
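
A minimal sketch of the dot-product ranking suggested above, assuming the
features are combined in log space as in a log-linear model. The weights,
the phrase pairs, and the feature order are all made up for illustration;
real weights would come from your own tuning run, and the function name
log_linear_score is hypothetical.

```python
import math

# Hypothetical weights for the four translation-model features.
# These are NOT real MERT output; substitute the values from your tuned config.
WEIGHTS = [0.05, 0.03, 0.10, 0.04]


def log_linear_score(features, weights=WEIGHTS):
    """Dot product of the weights with the log feature values.

    Assumes every feature value is strictly positive.
    """
    return sum(w * math.log(f) for w, f in zip(weights, features))


# Invented phrase pairs, four feature values each. The order is assumed to be
# phi(f|e), lex(f|e), phi(e|f), lex(e|f); check it against your own table.
phrase_pairs = {
    ("maison", "house"): [0.60, 0.50, 0.70, 0.55],
    ("maison", "home"): [0.20, 0.15, 0.25, 0.20],
    ("maison", "the house"): [0.05, 0.02, 0.03, 0.01],
}

# Rank phrase pairs from best to worst by their weighted score.
ranked = sorted(
    phrase_pairs,
    key=lambda pair: log_linear_score(phrase_pairs[pair]),
    reverse=True,
)

for pair in ranked:
    print(pair, round(log_linear_score(phrase_pairs[pair]), 4))
```

Note that sorting by a plain product of the features gives the same order
as this score with every weight set to 1, so the product is just the
special case that ignores how the features were balanced in tuning.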
> 
> Pre-filtering the phrase table is something lots of people have looked at, 
> and there are many approaches to this.  I like this paper:
> 
>   Improving Translation Quality by Discarding Most of the Phrasetable
>   Johnson, John Howard; Martin, Joel; Foster, George; Kuhn, Roland
>   
> http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=shwart&index=an&req=5763542
> 
> - JB
> 
> > On Tue, 2011-09-20 at 16:14 +0100, Miles Osborne wrote:
> >> Exactly. The only correct way to get real probabilities out would be
> >> to compute the normalising constant and renormalise the dot products
> >> for each phrase pair.
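
One way to read the renormalisation described above, as a rough sketch:
exponentiate each phrase pair's dot-product score and divide by a
normalising constant. Normalising over all target candidates that share
the same source phrase is my assumption; the message does not say which
set to sum over. The scores and the function name renormalise are
invented for illustration.

```python
import math
from collections import defaultdict


def renormalise(scores):
    """Map (source, target) -> score to (source, target) -> probability.

    Assumption: the normalising constant is computed per source phrase,
    over all of its target candidates.
    """
    by_source = defaultdict(dict)
    for (src, tgt), score in scores.items():
        by_source[src][tgt] = score

    probs = {}
    for src, candidates in by_source.items():
        # Subtract the largest score before exponentiating; this leaves the
        # resulting probabilities unchanged but avoids overflow.
        top = max(candidates.values())
        z = sum(math.exp(s - top) for s in candidates.values())
        for tgt, s in candidates.items():
            probs[(src, tgt)] = math.exp(s - top) / z
    return probs


# Invented dot-product scores, for illustration only.
example_scores = {
    ("maison", "house"): -0.12,
    ("maison", "home"): -0.45,
    ("chat", "cat"): -0.08,
}

probs = renormalise(example_scores)
for pair, p in sorted(probs.items()):
    print(pair, round(p, 4))
```

The values for each source phrase now sum to 1, but they are only
probabilities relative to the candidates in the table, not a probability
that the phrase pair is correct.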
> >> 
> >> Remember that this is best thought of as a set of scores, weighted
> >> such that the relative proportions of each model are balanced.
> >> 
> >> Miles
> >> 
> >> On 20 September 2011 16:07, Burger, John D. <[email protected]> wrote:
> >>> Taylor Rose wrote:
> >>> 
> >>>> I am looking at pruning phrase tables for the experiment I'm working on.
> >>>> I'm not sure if it would be a good idea to include the 'penalty' metric
> >>>> when calculating probability. It is my understanding that multiplying 4
> >>>> or 5 of the metrics from the phrase table would result in a probability
> >>>> of the phrase being correct. Is this a good understanding or am I
> >>>> missing something?
> >>> 
> >>> I don't think this is correct.  At runtime all the features from the 
> >>> phrase table and a number of other features, some only available during 
> >>> decoding, are combined in an inner product with a weight vector to score 
> >>> partial translations.  I believe it's fair to say that at no point is 
> >>> there an explicit modeling of "a probability of the phrase being 
> >>> correct", at least not in isolation from the partially translated 
> >>> sentence.  This is not to say you couldn't model this yourself, of course.
> >>> 
> >>> - John Burger
> >>> MITRE
> >>> _______________________________________________
> >>> Moses-support mailing list
> >>> [email protected]
> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>> 
> >>> 
> >> 
> >> 
> >> 

