I am not clear on the syntax of filter-model-given-input.pl:
target-dir = where we want the filtered PT (phrase table) to go?
moses.ini = if it is not in the above directory the script does accept it
input.txt = what is it in the case where I just want to adjust the MinScore?


On 31/08/2015 16:54, Philipp Koehn wrote:
Hi,

the filter script filters an existing phrase table. With the EMS settings, it would build another phrase table.

Don't worry about the reordering table. It will have excess entries, but they will not be used. If you really care, you can use the script scripts/training/remove-orphan-phrase-pairs-from-reordering-table.perl
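
To illustrate what "excess entries" means here, a minimal Python sketch of the idea follows. It is not the implementation of remove-orphan-phrase-pairs-from-reordering-table.perl; it only assumes that both tables are text files whose first two "|||"-separated fields are the source and target phrase. The function names and the toy entries are made up for the example.

```python
def pair_key(line):
    """Return (source, target) from a '|||'-separated table line."""
    fields = [f.strip() for f in line.split("|||")]
    return fields[0], fields[1]


def remove_orphans(phrase_lines, reordering_lines):
    """Keep only reordering entries whose phrase pair is still in the phrase table."""
    kept_pairs = {pair_key(line) for line in phrase_lines if line.strip()}
    return [
        line
        for line in reordering_lines
        if line.strip() and pair_key(line) in kept_pairs
    ]


# Toy data (invented): the second reordering entry has no phrase-table partner.
phrase_table = [
    "das haus ||| the house ||| 0.8 0.7 0.9 0.6",
]
reordering_table = [
    "das haus ||| the house ||| 0.2 0.2 0.6 0.2 0.2 0.6",
    "das haus ||| house the ||| 0.1 0.1 0.8 0.1 0.1 0.8",
]

filtered = remove_orphans(phrase_table, reordering_table)
for line in filtered:
    print(line)
```

The orphaned entry is harmless to the decoder either way, since it is never looked up; removing it only saves disk space and loading time.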

-phi

On Mon, Aug 31, 2015 at 10:50 AM, Vincent Nguyen <[email protected]> wrote:


    Thanks, I will try and post the results.
    Just to be clear:
    I can re-use the previous extract file.
    I have to rebuild the phrase table with the new min score (i.e. there
    is no way to just filter the previous one?).
    Do I have to rebuild the reordering table too?

    Vincent


    On 31/08/2015 16:44, Philipp Koehn wrote:
    Hi,

    0.0001 should have no impact on translation quality,
    0.001 will have some impact
    0.01 is probably a bit too drastic.

    But that's the range you should explore.

    -phi

    On Mon, Aug 31, 2015 at 10:33 AM, Vincent Nguyen <[email protected]> wrote:

        Is there any benchmark on which value has what impact?
        What should I start with as a test, 0.001?

        The standard value 0.0001 seems really low to me.
        Maybe I am not getting what this probability exactly refers to.



        where |FIELDn| is the position of the score (typically 2 for
        the direct phrase probability p(e|f), or 0 for the indirect
        phrase probability p(f|e)) and |THRESHOLD| the maximum
        probability allowed. A good setting is |2:0.0001|, which
        removes all rules, where the direct phrase translation
        probability is below 0.0001.
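
As a rough illustration of what a FIELD:THRESHOLD setting such as 2:0.0001 does, here is a minimal Python sketch. It is not the code of threshold-filter.perl or of the scorer; it assumes phrase-table lines of the form "source ||| target ||| scores", with the scores in the order implied by the excerpt above (index 0 = p(f|e), index 2 = p(e|f)), and it handles a single FIELD:THRESHOLD pair only. The function names and the toy entries are invented.

```python
def parse_min_score(setting):
    """Split a 'FIELD:THRESHOLD' string such as '2:0.0001' into (int, float)."""
    field, threshold = setting.split(":")
    return int(field), float(threshold)


def passes_min_score(line, field, threshold):
    """True if the score at index `field` is at least `threshold`."""
    score_column = line.split("|||")[2]
    scores = [float(s) for s in score_column.split()]
    return scores[field] >= threshold


# Toy entries (invented). Score order: p(f|e) lex(f|e) p(e|f) lex(e|f).
phrase_table = [
    "das haus ||| the house ||| 0.8 0.7 0.9 0.6",
    "das haus ||| a cottage ||| 0.01 0.002 0.00005 0.001",
]

field, threshold = parse_min_score("2:0.0001")
kept = [line for line in phrase_table if passes_min_score(line, field, threshold)]
for line in kept:
    print(line)
```

With this toy data only the first entry survives, because the second one has p(e|f) = 0.00005, which is below 0.0001. Raising the threshold to 0.001 or 0.01 removes more entries in the same way.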



        On 31/08/2015 16:14, Philipp Koehn wrote:
        Hi,

        I would suspect that with beam sizes <500 the bulk of the time is
        spent on translation option collection, not decoding. You could speed
        that up with tighter threshold pruning of the phrase table.

        See the script scripts/training/threshold-filter.perl or the setting
        score-settings = "--MinScore 2:0.0001"
        in EMS.

        -phi

        On Mon, Aug 31, 2015 at 3:03 AM, Vincent Nguyen <[email protected]> wrote:

            Hi,

            Here are some results for several values of the cube pruning
            pop limit:

            (pop limit / decoding time for 3000 sentences / BLEU score)

            5000 - 15m45 - 29.59
            1000 - 4m27 - 29.59
            500 - 3m35 - 29.59
            200 - 3m15 - 29.51
            100 - 3m00 - 29.40

            Therefore I took 400 - 3m19 - 29.58
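
To make the trade-off in these numbers easier to read, the short Python sketch below converts each timing into segments per second for the 3000-sentence test set. The figures are copied from the list above; the helper names are made up for the example.

```python
SENTENCES = 3000

# (pop limit, decoding time, BLEU) as reported above.
results = [
    (5000, "15m45", 29.59),
    (1000, "4m27", 29.59),
    (500, "3m35", 29.59),
    (400, "3m19", 29.58),
    (200, "3m15", 29.51),
    (100, "3m00", 29.40),
]


def to_seconds(duration):
    """Convert a string like '3m19' into a number of seconds."""
    minutes, seconds = duration.split("m")
    return int(minutes) * 60 + int(seconds)


def throughput(duration):
    """Segments per second for the 3000-sentence test set."""
    return SENTENCES / to_seconds(duration)


best_bleu = max(bleu for _, _, bleu in results)
for pop_limit, duration, bleu in results:
    print(
        f"pop limit {pop_limit:>4}: {throughput(duration):5.2f} seg/s, "
        f"BLEU {bleu:.2f} ({bleu - best_bleu:+.2f} vs. best)"
    )
```

For the chosen pop limit of 400 this gives 3000 / 199 s, i.e. roughly 15 segments per second, against about 3 segments per second at a pop limit of 5000.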

            If I am not mistaken, the default value for Moses is 1000
            [read in the doc], but in the EMS it is 5000 right now,
            which makes the experiments take so long.
            I suggest changing the EMS default value.

            Is there a way to also use a cube pruning limit in the
            decoder at tuning time?

            Now with this optimized setting I get a rate of about 15
            segments per second on average.
            What is the reason that online tools like Google / Bing
            are so much faster?
            It's not a machine issue, is it?


            Cheers
            Vincent

            _______________________________________________
            Moses-support mailing list
            [email protected]
            http://mailman.mit.edu/mailman/listinfo/moses-support






