Very bad unpruned and with multithreading! :)

Is this with the nonblockpt branch? I am slowly running out of ideas
about what might be causing this. Frequent vector reallocation?

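The scaling in Hieu's table quoted below is easier to see as speedup
relative to the single-core run. A minimal sketch, assuming the figures
are decoding times where lower is better (the thread does not state the
unit); the helper names `speedups` and `best_core_count` are made up for
illustration:

```python
# Figures copied from the table quoted below. The unit is not stated in the
# thread, so treating them as times (lower is better) is an assumption.
CORES = [1, 5, 10, 15, 20, 25]
TIMES = {
    ("unpruned", "compact pt"): [143, 42, 32, 38, 52, 62],
    ("unpruned", "probing pt"): [245, 58, 33, 25, 24, 21],
    ("pruned", "compact pt"): [119, 24, 15, 10, 10, 10],
    ("pruned", "probing pt"): [117, 25, 25, 10, 10, 10],
}


def speedups(times):
    """Speedup of each run relative to the single-core run."""
    base = times[0]
    return [base / t for t in times]


def best_core_count(times, cores=CORES):
    """Core count with the lowest time (the first one wins on ties)."""
    return cores[times.index(min(times))]


if __name__ == "__main__":
    for (pruning, pt), times in TIMES.items():
        row = "  ".join(f"{s:5.2f}" for s in speedups(times))
        best = best_core_count(times)
        print(f"{pruning:9s} {pt:10s} {row}  best at {best} cores")
```

Under that assumption, the unpruned compact pt is fastest at 10 cores and
gets slower beyond that, while the unpruned probing pt keeps improving up
to 25 cores.
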

On 05.10.2015 16:48, Hieu Hoang wrote:
> Which pt implementation did you use, and had it been pre-pruned so that 
> there's a limit on how many target phrases there are for a particular 
> source phrase? I.e. it doesn't have 10,000 entries for 'the'.
>
> I've been digging around multithreading in the last few weeks. I've 
> noticed that the compact pt is VERY bad at handling an unpruned pt.
>
>                         Cores
>                         1     5     10    15    20    25
> Unpruned   compact pt   143   42    32    38    52    62
>            probing pt   245   58    33    25    24    21
> Pruned     compact pt   119   24    15    10    10    10
>            probing pt   117   25    25    10    10    10
>
>
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 5 October 2015 at 15:15, Michael Denkowski <[email protected]> wrote:
>
>     Hi all,
>
>     Like some other Moses users, I noticed diminishing returns from
>     running Moses with several threads.  To work around this, I added
>     a script to run multiple single-threaded instances of moses
>     instead of one multi-threaded instance.  In practice, this sped
>     things up by about 2.5x on 16 CPUs, and using memory-mapped models
>     still allowed everything to fit into memory.
>
>     If anyone else is interested in using this, you can prefix a moses
>     command with scripts/generic/multi_moses.py.  To use multiple
>     instances in mert-moses.pl, specify --multi-moses and control the
>     number of parallel instances with --decoder-flags='-threads N'.
>
>     Below is a benchmark on WMT fr-en data (2M training sentences,
>     400M words mono, suffix array PT, compact reordering, 5-gram
>     KenLM) testing default stack decoding vs cube pruning without and
>     with the parallelization script (+multi):
>
>     ---
>     1cpu   sent/sec
>     stack      1.04
>     cube       2.10
>     ---
>     16cpu  sent/sec
>     stack      7.63
>     +multi    12.20
>     cube       7.63
>     +multi    18.18
>     ---
>
>     --Michael
>
>     _______________________________________________
>     Moses-support mailing list
>     [email protected]
>     http://mailman.mit.edu/mailman/listinfo/moses-support

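For anyone curious what "multiple single-threaded instances instead of
one multi-threaded instance" looks like in outline, here is a rough
sketch. It is not the code in scripts/generic/multi_moses.py; the
function names (`split_chunks`, `run_instance`, `run_parallel`) are
hypothetical, and the demo uses a small stand-in command that uppercases
its input rather than a real moses invocation. It assumes the command
reads one sentence per line on stdin and writes one line per sentence on
stdout:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def split_chunks(lines, n):
    """Split lines into at most n contiguous, near-equal chunks."""
    n = max(1, min(n, len(lines)))
    size, extra = divmod(len(lines), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks


def run_instance(cmd, chunk):
    """Feed one chunk to one process over stdin and return its output lines."""
    result = subprocess.run(
        cmd, input="".join(chunk), capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines(keepends=True)


def run_parallel(cmd, lines, instances):
    """Run `instances` copies of cmd side by side, keeping the input order."""
    chunks = split_chunks(lines, instances)
    # Threads are enough here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # map() yields results in the order of the chunks, not of completion.
        outputs = list(pool.map(lambda c: run_instance(cmd, c), chunks))
    return [line for out in outputs for line in out]


if __name__ == "__main__":
    # Stand-in for a single-threaded decoder command (an assumption).
    demo_cmd = [
        sys.executable,
        "-c",
        "import sys\nfor l in sys.stdin:\n    sys.stdout.write(l.upper())",
    ]
    sentences = ["ceci est un test\n", "bonjour\n", "merci\n"]
    sys.stdout.writelines(run_parallel(demo_cmd, sentences, 2))
```

Because each chunk is contiguous and the outputs are joined in chunk
order, the result lines up with the input. Each instance loads its own
copy of the models, which is why memory-mapped models matter for fitting
everything into memory.
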