Thanks Hieu and Christophe,

My intention behind the question is to understand how things actually work now. I'm not trying to steer how it works one way or another. Also, I would expect that tokens in the forced translations that are OOV in the parallel corpus and/or LM corpus would cause unpredictable randomized output. So, let's not worry about those cases.

That said, let's look at a hypothetical example. Suppose we're translating EN-ES with a hypothetical SMT model whose phrase table, distortion table and language model all cover the input tokens (no OOVs). The normal and correct input/output would be:

% echo 'the fat black cat sleeps' | moses -f moses.ini
el gran gato negro duerme

Then, we want to force the translation to this incorrect output:

% echo 'the <xml translation="gran negro gato">fat black cat</xml> sleeps' | moses -f moses.ini -xml-input exclusive
el gran negro gato duerme

Is there a chance that the distortion model could override the forced/incorrect translation and cause the first/correct output? Would adding the XML attribute prob="1.00000" force the intended translation regardless of the distortion table and language model?
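For anyone who wants to reproduce the experiment, here is a minimal sketch of how the marked-up input line could be generated. Python is an assumption here, and the helper name `mark_forced` is hypothetical; only the `translation` and `prob` attributes come from the example above. The sketch only builds the string. It says nothing about whether the decoder lets the distortion model or language model override the forced span.

```python
def mark_forced(source_span, translation, prob=None, tag="xml"):
    """Wrap a source span in Moses-style XML markup (hypothetical helper).

    No escaping is attempted: a double quote in the translation would break
    the attribute, so it is rejected instead of guessed at.
    """
    if '"' in translation:
        raise ValueError("translation must not contain a double quote")
    attrs = 'translation="%s"' % translation
    if prob is not None:
        # Same five-decimal formatting as prob="1.00000" in the question.
        attrs += ' prob="%.5f"' % prob
    return "<%s %s>%s</%s>" % (tag, attrs, source_span, tag)


line = "the " + mark_forced("fat black cat", "gran negro gato", prob=1.0) + " sleeps"
print(line)
```

The printed line can then be piped to `moses -f moses.ini -xml-input exclusive` exactly as in the command above.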

Tom


On 05/01/2014 09:37 PM, Hieu Hoang wrote:
I'm not sure.

Ideally, IMO, the reordering model should be used even if the translation comes from XML. The reordering model just assigns a score to the translation, like any other feature function, e.g. the LM or the word penalty.
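To make the "just another feature function" point concrete, here is a toy sketch of a weighted sum over feature scores. All feature names, weights and values are invented for illustration, and this is not Moses code. It only shows that a reordering score is one term among several in the total.

```python
def model_score(features, weights):
    """Weighted sum of feature scores (a toy log-linear model)."""
    return sum(weights[name] * value for name, value in features.items())


# Invented weights; a real system would tune these.
weights = {"tm": 1.0, "lm": 0.5, "reordering": 0.3, "word_penalty": -1.0}

# Invented scores for the two word orders from the example in this thread.
gato_negro = {"tm": -2.0, "lm": -6.0, "reordering": -0.5, "word_penalty": 5}
negro_gato = {"tm": -2.0, "lm": -7.5, "reordering": -2.0, "word_penalty": 5}

for name, feats in [("gato negro", gato_negro), ("negro gato", negro_gato)]:
    print(name, model_score(feats, weights))
```

In this toy setup the reordering term lowers the score of the swapped order but does not remove it from consideration. Whether Moses actually computes that term for an XML-supplied phrase is the open question.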

However, there might be an optimization where the reordering model is cached with the phrase table, so that if a rule is used multiple times, the reordering model only needs to be looked up once. That optimization might have forgotten about XML, OOVs, etc.

Please let me know what you find out, and whether it's important to you to have it one way or the other.


On 1 May 2014 11:10, Christophe Servan <[email protected]> wrote:

    As far as I understand, if the phrase table is ignored, the
    reordering model is ignored too.
    Maybe someone like Hieu can answer this specific point more precisely.


    2014-05-01 11:44 GMT+02:00 Tom Hoar <[email protected]>:

        Yes, the link's descriptions are good explanations of the
        differences between "exclusive", "inclusive", "constraint",
        "ignore" and "pass-through." All descriptions, however, refer
        to the "phrase table" (t-table), which to my understanding
        does not include the distortion table. For example,
        "exclusive" says, "Any phrases from the phrase table that
        overlap with that span are ignored." There is no information
        about the effects of the reordering/distortion table.
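The quoted description of "exclusive" can be illustrated with a toy sketch. This models only the documented phrase-table behavior, under the assumption that "inclusive" lets XML options compete with phrase-table options. The function names and data layout are hypothetical, it is not Moses code, and it deliberately leaves out the reordering table, which is the part the documentation does not cover.

```python
def overlaps(a, b):
    """True if two half-open token spans [start, end) share a position."""
    return a[0] < b[1] and b[0] < a[1]


def candidate_options(table_options, xml_options, mode):
    """Toy model of which translation options are available per mode."""
    if mode == "inclusive":
        # XML options compete with the phrase-table options.
        return table_options + xml_options
    if mode == "exclusive":
        # Phrase-table options overlapping an XML span are ignored.
        kept = [o for o in table_options
                if not any(overlaps(o["span"], x["span"]) for x in xml_options)]
        return kept + xml_options
    raise ValueError("mode not modeled in this sketch: %r" % mode)


# Source: the(0) fat(1) black(2) cat(3) sleeps(4); entries are invented.
table = [
    {"span": (0, 1), "target": "el"},
    {"span": (1, 2), "target": "gran"},
    {"span": (2, 4), "target": "gato negro"},
    {"span": (4, 5), "target": "duerme"},
]
xml = [{"span": (1, 4), "target": "gran negro gato"}]

for mode in ("inclusive", "exclusive"):
    print(mode, [o["target"] for o in candidate_options(table, xml, mode)])
```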




        On 05/01/2014 02:33 PM, Christophe Servan wrote:
        Hi Tom,
        As far as I understand, there are different ways to use it, which
        are explained very well here:
        http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc11
        Now, answering your questions:

            I have two questions about how the -xml-input entries in
            the markup tag work.

             1. Are the entries applied before the distortion table
                with the distortion table affecting the result, or
                are the entries applied as a net effect after the
                distortion table?

        To me, it is another kind of decoding process, especially
        when the switch is used with the "exclusive" option. It
        seems to completely bypass the decoding process that uses the
        phrase table and the distortion model.
        If you simply want to add a new translation hypothesis that
        does not already exist in your phrase table, use the
        "inclusive" option instead of "exclusive". Both processes
        will then be used at the same time.
        But you will have no guarantee that the translation
        hypothesis proposed with the XML markup is the one
        chosen by the decoder.
        As far as I know, the probability you give in the XML tag
        corresponds jointly to all the features and weights associated
        with the hypothesis.

             2. Do the entries override or supplement the weightings
                in the loaded SMT model's t-table/distortion table
                combination?

        As I said, as far as I know, the "exclusive" mode simply
        overrides the phrase table and the distortion model. If you
        still want to use them, you can use the "inclusive" mode, for
        example.

        Best,

        Christophe


        _______________________________________________
        Moses-support mailing list
        [email protected]  <mailto:[email protected]>
        http://mailman.mit.edu/mailman/listinfo/moses-support






--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu


