Hi Hieu

That's exactly why I took to pre-pruning the phrase table, as I mentioned on Friday. I had something like 750,000 translations of the most common word, and it took half an hour to get the first sentence translated.

cheers - Barry

On 05/10/15 15:48, Hieu Hoang wrote:
what pt implementation did you use, and had it been pre-pruned so that there's a limit on the number of target phrases for a particular source phrase? i.e. so you don't have 10,000 entries for 'the'.
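
As a rough illustration of the kind of pre-pruning being asked about, here is a minimal sketch, not the filter Moses itself ships. It assumes a text phrase table with `source ||| target ||| scores` lines, that all entries for one source phrase are adjacent, and that the third score (index 2) is the one to rank by. The function names, the limit of 20, and the toy entries are made up for illustration.

```python
import heapq
from itertools import groupby


def source_of(line):
    # First field of a 'source ||| target ||| scores ...' line.
    return line.split("|||", 1)[0].strip()


def score_of(line, score_index):
    # Third field holds the space-separated scores.
    return float(line.split("|||")[2].split()[score_index])


def prune_phrase_table(lines, limit=20, score_index=2):
    """Yield at most `limit` entries per source phrase, best score first.

    Assumes entries sharing a source phrase are adjacent in `lines`,
    so only one source phrase is held in memory at a time.
    """
    for _, group in groupby(lines, key=source_of):
        yield from heapq.nlargest(
            limit, group, key=lambda l: score_of(l, score_index)
        )


if __name__ == "__main__":
    toy_table = [
        "the ||| le ||| 0.1 0.1 0.5 0.1",
        "the ||| la ||| 0.1 0.1 0.3 0.1",
        "the ||| les ||| 0.1 0.1 0.15 0.1",
        "house ||| maison ||| 0.1 0.1 0.9 0.1",
    ]
    for entry in prune_phrase_table(toy_table, limit=2):
        print(entry)
```

With a limit like this, a very frequent source word keeps only its top-scoring translations instead of thousands of entries.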

I've been digging around multithreading in the last few weeks. I've noticed that the compact pt is VERY bad at handling unpruned pt.
                        Cores
                        1     5     10    15    20    25
Unpruned  compact pt    143   42    32    38    52    62
          probing pt    245   58    33    25    24    21
Pruned    compact pt    119   24    15    10    10    10
          probing pt    117   25    25    10    10    10



Hieu Hoang
http://www.hoang.co.uk/hieu

On 5 October 2015 at 15:15, Michael Denkowski wrote:

    Hi all,

    Like some other Moses users, I noticed diminishing returns from
    running Moses with several threads.  To work around this, I added
    a script to run multiple single-threaded instances of moses
    instead of one multi-threaded instance.  In practice, this sped
    things up by about 2.5x for 16 cpus and using memory mapped models
    still allowed everything to fit into memory.

    If anyone else is interested in using this, you can prefix a moses
    command with scripts/generic/multi_moses.py.  To use multiple
    instances in mert-moses.pl, specify
    --multi-moses and control the number of parallel instances with
    --decoder-flags='-threads N'.
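
The general idea of running several single-threaded instances instead of one multi-threaded one can be sketched as below. This is not the code of multi_moses.py, only an illustration under stated assumptions: the command reads sentences on stdin and writes one output line per input line, and the input is split into contiguous chunks so the outputs can be concatenated in order. The function names are hypothetical, and the demo uses a stand-in command rather than a real decoder.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def split_chunks(lines, n):
    # Contiguous chunks, so concatenating outputs preserves input order.
    size = max(1, -(-len(lines) // n))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def run_instance(command, chunk):
    # One independent process per chunk; raises if the command fails.
    result = subprocess.run(
        command, input="".join(chunk),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def run_parallel(command, lines, instances):
    chunks = split_chunks(lines, instances)
    # Threads only wait on the child processes; map() keeps chunk order.
    with ThreadPoolExecutor(max_workers=instances) as pool:
        outputs = pool.map(lambda c: run_instance(command, c), chunks)
        return "".join(outputs)


if __name__ == "__main__":
    # Stand-in for a decoder: upper-cases each input line.
    stand_in = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_parallel(stand_in, ["a\n", "b\n", "c\n", "d\n", "e\n"], 2),
          end="")
```

Each instance loads its own copy of the command's state, which is why memory-mapped models matter for fitting everything into memory.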

    Below is a benchmark on WMT fr-en data (2M training sentences,
    400M words mono, suffix array PT, compact reordering, 5-gram
    KenLM) testing default stack decoding vs cube pruning without and
    with the parallelization script (+multi):

    ---
    1cpu   sent/sec
    stack      1.04
    cube       2.10
    ---
    16cpu  sent/sec
    stack      7.63
    +multi    12.20
    cube       7.63
    +multi    18.18
    ---
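
To relate the table to the "about 2.5x" figure, the ratios can be worked out directly from the numbers above. The short script below only does that arithmetic. The dictionary and function names are illustrative, and nothing beyond the six reported sent/sec values is assumed.

```python
# Sentences per second, copied from the benchmark table.
ONE_CPU = {"stack": 1.04, "cube": 2.10}
SIXTEEN_CPU_THREADS = {"stack": 7.63, "cube": 7.63}
SIXTEEN_CPU_MULTI = {"stack": 12.20, "cube": 18.18}


def ratio(new, old):
    # Speedup of `new` over `old`, rounded to two decimals.
    return round(new / old, 2)


def summarize():
    rows = {}
    for search in ONE_CPU:
        rows[search] = {
            "threads_vs_1cpu": ratio(SIXTEEN_CPU_THREADS[search],
                                     ONE_CPU[search]),
            "multi_vs_1cpu": ratio(SIXTEEN_CPU_MULTI[search],
                                   ONE_CPU[search]),
            "multi_vs_threads": ratio(SIXTEEN_CPU_MULTI[search],
                                      SIXTEEN_CPU_THREADS[search]),
        }
    return rows


if __name__ == "__main__":
    for search, row in summarize().items():
        print(search, row)
```

On these numbers, +multi over plain 16-thread decoding comes to roughly 1.6x for stack decoding and roughly 2.4x for cube pruning, so the cube pruning row is the one closest to the quoted figure.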

    --Michael

    _______________________________________________
    Moses-support mailing list
    [email protected]
    http://mailman.mit.edu/mailman/listinfo/moses-support



