Thanks for the information, Kevin. Where would I find these feature
weights? I've found files in Moses that I suspect might be the weights
but they're not labeled and the file/directory names don't really help
either.
--
Taylor Rose
Machine Translation Intern
Language Intelligence
IRC: Handle: trose
Server: freenode
On Tue, 2011-09-20 at 23:32 -0400, Kevin Gimpel wrote:
> Hey Taylor,
> Sounds like you are trying to come up with a simple heuristic for
> scoring phrase table entries for purposes of pruning. Many choices are
> possible here, so it's good to check the literature as folks mentioned
> above. But as far as I know there's no single optimal answer for this.
> Typically researchers try a few things and use the approach that gives
> the best results on the task at hand. But while there's no single
> correct answer, here are some suggestions:
> If you have trained weights for the features, you should definitely
> use those weights (as Miles suggested). So this would involve
> computing the dot product of the features and weights as follows:
> score(f, e) = \theta_1 * log(p(e | f)) + \theta_2 * log(lex(e | f))
>             + \theta_3 * log(p(f | e)) + \theta_4 * log(lex(f | e))
> where the thetas are the learned weights for each of the phrase table
> features.
> Note that the phrase table typically stores the feature values as
> probabilities, and Moses takes logs internally before computing the
> dot product. So you should take logs yourself before multiplying by
> the feature weights.
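> A minimal sketch of that score in Python (the weight values below are
> purely illustrative; your tuned weights would come from your Moses
> configuration, e.g. moses.ini):
>
>     import math
>
>     # theta_1..theta_4: learned weights for the four phrase-table features
>     weights = [0.2, 0.2, 0.3, 0.3]   # illustrative values, not Moses defaults
>
>     def score(features):
>         # features: [p(e|f), lex(e|f), p(f|e), lex(f|e)] as probabilities
>         return sum(w * math.log(p) for w, p in zip(weights, features))
>
>     # the four probabilities of one phrase-table entry
>     print(score([0.5, 0.4, 0.25, 0.1]))
>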
> If you don't have feature weights, using uniform weights is
> reasonable.
> And regarding your original question above: since the phrase penalty
> feature has the same value for all phrase pairs, it shouldn't affect
> pruning, right?
> HTH,
> Kevin
>
> On Tue, Sep 20, 2011 at 4:21 PM, Lane Schwartz <[email protected]>
> wrote:
> Taylor,
>
>         If you don't have a background in NLP or CL (or even if you do),
>         I highly recommend taking a look at Philipp's book "Statistical
>         Machine Translation"
>
>         I hope this doesn't come across as RTFM. That's not what I mean. :)
>
> Cheers,
> Lane
>
>
> On Tue, Sep 20, 2011 at 3:45 PM, Taylor Rose
> <[email protected]> wrote:
> > What would happen if I just multiplied the Direct Phrase Translation
> > probability φ(e|f) by the Direct Lexical weight Lex(e|f)? That seems
> > like it would work? Sorry if I'm asking dumb questions. I come from
> > the computational side of computational linguistics. I'm learning as
> > fast as I can.
> > --
> > Taylor Rose
> > Machine Translation Intern
> > Language Intelligence
> > IRC: Handle: trose
> > Server: freenode
> >
> >
> > On Tue, 2011-09-20 at 12:11 -0400, Burger, John D. wrote:
> >> Taylor Rose wrote:
> >>
> >> > So what exactly can I infer from the metrics in the phrase table?
> >> > I want to be able to compare phrases to each other. From my
> >> > experience, multiplying them and sorting by that number has given
> >> > me more accurate phrases... Obviously calling that metric
> >> > "probability" is wrong. My question is: What is that metric best
> >> > indicative of?
> >>
> >> That product has no principled interpretation that I can think of.
> >> Phrase pairs with high values on all four features will obviously
> >> have high value products, but that's only interesting because all
> >> the features happen to be roughly monotonic in phrase quality. If
> >> you wanted a more principled way to rank the phrases, I'd just use
> >> the MERT weights for those features, and combine them with a dot
> >> product.
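> >>
> >> As a rough sketch in Python (assuming the usual "|||"-separated
> >> phrase-table lines and illustrative weight values), ranking and
> >> keeping only the top N pairs per source phrase by that dot product
> >> could look like this:
> >>
> >>     import math
> >>     from collections import defaultdict
> >>
> >>     weights = [0.2, 0.2, 0.3, 0.3]   # illustrative MERT weights
> >>
> >>     def dot_score(probs):
> >>         # probs: the four phrase-table probabilities for one pair
> >>         return sum(w * math.log(p) for w, p in zip(weights, probs))
> >>
> >>     by_source = defaultdict(list)
> >>     with open("phrase-table") as f:
> >>         for line in f:
> >>             src, tgt, scores = line.split(" ||| ")[:3]
> >>             probs = [float(x) for x in scores.split()[:4]]
> >>             by_source[src].append((dot_score(probs), line))
> >>
> >>     keep = 30   # keep the N highest-scoring pairs per source phrase
> >>     for src, entries in by_source.items():
> >>         for _, line in sorted(entries, reverse=True)[:keep]:
> >>             print(line, end="")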
> >>
> >> Pre-filtering the phrase table is something lots of people have
> >> looked at, and there are many approaches to this. I like this paper:
> >>
> >> Improving Translation Quality by Discarding Most of the Phrasetable
> >> Johnson, John Howard; Martin, Joel; Foster, George; Kuhn, Roland
> >> http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=shwart&index=an&req=5763542
> >>
> >> - JB
> >>
> >> > On Tue, 2011-09-20 at 16:14 +0100, Miles Osborne wrote:
> >> >> exactly, the only correct way to get real probabilities out would
> >> >> be to compute the normalising constant and renormalise the dot
> >> >> products for each phrase pair.
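> >> >>
> >> >> A minimal sketch of that renormalisation in Python (a softmax over
> >> >> the dot-product scores of all phrase pairs that share one source
> >> >> phrase; the numbers below are made up):
> >> >>
> >> >>     import math
> >> >>
> >> >>     def renormalise(scores):
> >> >>         # scores: unnormalised log-linear scores for one source phrase
> >> >>         z = sum(math.exp(s) for s in scores)   # normalising constant
> >> >>         return [math.exp(s) / z for s in scores]
> >> >>
> >> >>     print(renormalise([-1.2, -2.5, -4.0]))   # probabilities summing to 1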
> >> >>
> >> >> remember that this is best thought of as a set of scores, weighted
> >> >> such that the relative proportions of each model are balanced
> >> >>
> >> >> Miles
> >> >>
> >> >> On 20 September 2011 16:07, Burger, John D. <[email protected]> wrote:
> >> >>> Taylor Rose wrote:
> >> >>>
> >> >>>> I am looking at pruning phrase tables for the experiment I'm
> >> >>>> working on. I'm not sure if it would be a good idea to include
> >> >>>> the 'penalty' metric when calculating probability. It is my
> >> >>>> understanding that multiplying 4 or 5 of the metrics from the
> >> >>>> phrase table would result in a probability of the phrase being
> >> >>>> correct. Is this a good understanding or am I missing something?
> >> >>>
> >> >>> I don't think this is correct. At runtime all the features from
> >> >>> the phrase table and a number of other features, some only
> >> >>> available during decoding, are combined in an inner product with
> >> >>> a weight vector to score partial translations. I believe it's
> >> >>> fair to say that at no point is there an explicit modeling of "a
> >> >>> probability of the phrase being correct", at least not in
> >> >>> isolation from the partially translated sentence. This is not to
> >> >>> say you couldn't model this yourself, of course.
> >> >>>
> >> >>> - John Burger
> >> >>> MITRE
>
> --
> When a place gets crowded enough to require ID's, social collapse is
> not far away. It is time to go elsewhere. The best thing about space
> travel is that it made it possible to go elsewhere.
> -- R.A. Heinlein, "Time Enough For Love"
>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support