Hi Hieu and all,

I just checked in a bug fix for the multi_moses.py script.  I forgot to
override the number of threads for each moses command, so if [threads] were
specified in the moses.ini, the multi-moses runs were cheating by running a
bunch of multi-threaded instances.  If threads were only being specified on
the command line, the script was correctly stripping the flag so everything
should be good.  I finished a benchmark on my system with an unpruned
compact PT (with the fixed script) and got the following:

16 threads 5.38 sent/sec
16 procs  13.51 sent/sec

This definitely used a lot more memory though.  Based on some very rough
estimates looking at free system memory, the memory mapped suffix array PT
went from 2G to 6G with 16 processes while the compact PT went from 3G to
37G.  For cases where everything fits into memory, I've seen significant
speedup from multi-process decoding.
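
As a rough sanity check on the figures above, the arithmetic can be sketched
in Python. It assumes the "from" memory figures correspond to a single
instance and that every additional process adds about the same amount, which
the free-memory estimates do not confirm. The helper name is hypothetical.

```python
# Figures copied from the benchmark and memory estimates above.
THREADS_RATE = 5.38   # sent/sec, one instance with 16 threads
PROCS_RATE = 13.51    # sent/sec, 16 single-threaded instances

def extra_gb_per_process(single_gb, multi_gb, n_procs):
    """Average memory added by each process beyond the first, in GB.

    Assumption: growth is linear in the number of processes.
    """
    return (multi_gb - single_gb) / (n_procs - 1)

speedup = PROCS_RATE / THREADS_RATE
suffix_array_gb = extra_gb_per_process(2, 6, 16)
compact_gb = extra_gb_per_process(3, 37, 16)

print(f"speedup from multi-process decoding: {speedup:.2f}x")
print(f"suffix array PT: about {suffix_array_gb:.2f}G per extra process")
print(f"compact PT:      about {compact_gb:.2f}G per extra process")
```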

For cases where things don't fit into memory, the multi-moses script could
be extended to start as many multi-threaded instances as will fit into RAM
and farm out sentences in a way that keeps all of the CPUs busy.  I know
Marcin has mentioned using GNU parallel for this.
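
A minimal sketch of that extension, under stated assumptions: plan_instances
and decode_sharded are hypothetical helpers, not part of multi_moses.py; the
per-instance memory figure has to be measured by the user; and the decoder
command is assumed to write exactly one output line per input line. It uses a
static round-robin split rather than a work queue, so it only approximates
keeping all of the CPUs busy.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def plan_instances(total_cpus, free_ram_gb, ram_per_instance_gb):
    """Return (instances, threads_per_instance) that fit into free RAM.

    ram_per_instance_gb is an assumption the caller must supply.
    """
    fit = int(free_ram_gb // ram_per_instance_gb)
    instances = max(1, min(total_cpus, fit))
    threads = max(1, total_cpus // instances)
    return instances, threads

def decode_sharded(cmd, sentences, instances):
    """Run `instances` copies of cmd in parallel and restore input order.

    cmd is an argument list for any line-in/line-out decoder command.
    """
    # Round-robin split: shard i gets sentences i, i + instances, ...
    shards = [sentences[i::instances] for i in range(instances)]

    def run(shard):
        if not shard:
            return []
        proc = subprocess.run(cmd, input="\n".join(shard) + "\n",
                              capture_output=True, text=True, check=True)
        out = proc.stdout.splitlines()
        if len(out) != len(shard):
            raise ValueError("expected one output line per input line")
        return out

    # Threads only wait on the child processes, so the GIL is not a limit.
    with ThreadPoolExecutor(max_workers=instances) as pool:
        outputs = list(pool.map(run, shards))

    # Interleave the shard outputs back into the original sentence order.
    merged = [None] * len(sentences)
    for i, out in enumerate(outputs):
        merged[i::instances] = out
    return merged
```

In practice cmd would be the moses command line with its thread count set to
the second value returned by plan_instances, e.g. 16 CPUs with 16G free and
an assumed 3G per instance gives 5 instances with 3 threads each.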

Best,
Michael

On Tue, Oct 6, 2015 at 4:16 PM, Hieu Hoang <[email protected]> wrote:

> I've just run some comparisons between the multi-threaded decoder and the
> multi_moses.py script. It's good stuff.
>
> It makes me seriously wonder whether we should abandon multi-threading
> and go all out for the multi-process approach.
>
> There are some advantages to multi-threading, e.g. where model files are
> loaded into memory rather than memory mapped. But there are disadvantages
> too: it's more difficult to maintain and there's about a 10% overhead.
>
> What do people think?
>
> Phrase-based:
>
>           1          5          10         15         20         25         30
>
> 32  Baseline (Compact pt)
>     real  4m37.000s  1m15.391s  0m51.217s  0m48.287s  0m50.719s  0m52.027s  0m53.045s
>     user  4m21.544s  5m28.597s  6m38.227s  8m0.975s   8m21.122s  8m3.195s   8m4.663s
>     sys   0m15.451s  0m34.669s  0m53.867s  1m10.515s  1m20.746s  1m24.368s  1m23.677s
>
> 34  (32) + multi_moses
>     real  4m49.474s  1m17.867s  0m43.096s  0m31.999s  0m26.497s  0m26.296s  killed
>     user  4m33.580s  4m40.486s  4m56.749s  5m6.692s   5m43.845s  7m34.617s
>     sys   0m15.957s  0m32.347s  0m51.016s  1m11.106s  1m44.115s  2m21.263s
>
> 38  Baseline (Probing pt)
>     real  4m46.254s  1m16.637s  0m49.711s  0m48.389s  0m49.144s  0m51.676s  0m52.472s
>     user  4m30.596s  5m32.500s  6m23.706s  7m40.791s  7m51.946s  7m52.892s  7m53.569s
>     sys   0m15.624s  0m36.169s  0m49.433s  1m6.812s   1m9.614s   1m13.108s  1m12.644s
>
> 39  (38) + multi moses
>     real  4m43.882s  1m17.849s  0m34.245s  0m31.318s  0m28.054s  0m24.120s  0m22.520s
>     user  4m29.212s  4m47.693s  5m5.750s   5m33.573s  6m18.847s  7m19.642s  8m38.013s
>     sys   0m15.835s  0m25.398s  0m36.716s  0m41.349s  0m48.494s  1m0.843s   1m13.215s
>
> Hiero:
>
> 3   6/10 baseline
>     real  5m33.011s   1m28.935s   0m59.470s   1m0.315s    0m55.619s   0m57.347s   0m59.191s   1m2.786s
>     user  4m53.187s   6m23.521s   8m17.170s   12m48.303s  14m45.954s  17m58.109s  20m22.891s  21m13.605s
>     sys   0m39.696s   0m51.519s   1m3.788s    1m22.125s   1m58.718s   2m51.249s   4m4.807s    4m37.691s
>
> 4   (3) + multi_moses
>     real  -           1m27.215s   0m40.495s   0m36.206s   0m28.623s   0m26.631s   0m25.817s   0m25.401s
>     user  -           5m4.819s    5m42.070s   5m35.132s   6m46.001s   7m38.151s   9m6.500s    10m32.739s
>     sys   -           0m38.039s   0m45.753s   0m44.117s   0m52.285s   0m56.655s   1m6.749s    1m16.935s
>
> On 05/10/2015 16:05, Michael Denkowski wrote:
>
> Hi Philipp,
>
> Unfortunately I don't have a precise measurement.  If anyone knows of a
> good way to benchmark a process tree with lots of memory mapping the same
> files, I would be glad to run it.
>
> --Michael
>
> On Mon, Oct 5, 2015 at 10:26 AM, Philipp Koehn <[email protected]> wrote:
>
>> Hi,
>>
>> great - that will be very useful.
>>
>> Since you just ran the comparison - do you have any numbers on "still
>> allowed everything to fit into memory", i.e., how much more memory is used
>> by running parallel instances?
>>
>> -phi
>>
>> On Mon, Oct 5, 2015 at 10:15 AM, Michael Denkowski <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Like some other Moses users, I noticed diminishing returns from running
>>> Moses with several threads.  To work around this, I added a script to run
>>> multiple single-threaded instances of moses instead of one multi-threaded
>>> instance.  In practice, this sped things up by about 2.5x for 16 cpus and
>>> using memory mapped models still allowed everything to fit into memory.
>>>
>>> If anyone else is interested in using this, you can prefix a moses
>>> command with scripts/generic/multi_moses.py.  To use multiple instances in
>>> mert-moses.pl, specify --multi-moses and control the number of parallel
>>> instances with --decoder-flags='-threads N'.
>>>
>>> Below is a benchmark on WMT fr-en data (2M training sentences, 400M
>>> words mono, suffix array PT, compact reordering, 5-gram KenLM) testing
>>> default stack decoding vs cube pruning without and with the parallelization
>>> script (+multi):
>>>
>>> ---
>>> 1cpu   sent/sec
>>> stack      1.04
>>> cube       2.10
>>> ---
>>> 16cpu  sent/sec
>>> stack      7.63
>>> +multi    12.20
>>> cube       7.63
>>> +multi    18.18
>>> ---
>>>
>>> --Michael
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>
>
>
> --
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
>
