Re: [Moses-support] Need help for parallelisation in mosesserver

Tom Hoar Wed, 28 Dec 2016 18:42:46 -0800

Hi Shubham,

As far as I can tell, Mosesserver is beneficial when you want or need to parallelize per-segment pre- and post-processing on a different machine from the Mosesserver process. However, most (but not all) pre- and post-processing steps impose minimal computational overhead. They can typically process an entire batch of sentences (a document, or even multiple documents) in one thread in a fraction of the time it takes to run Moses.

So, I'm not sure what you intend to gain by running Mosesserver. It all seems like a lot of unnecessary work. Why not encapsulate the standard Moses executable in one daemonized top-level process? Include your pre- and post-processing toolchains as sub-processes within the same daemon and use standard pipes to/from Moses. Then use sockets (or some other mechanism) to pump docs in/out of the daemon. You have one consistent interface that runs the same regardless of whether your batch has one sentence or 300. Moses' per-sentence multiprocessing kicks in automatically.
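A minimal sketch of the pipe part of that idea, in Python. It assumes the decoder reads one sentence per line on stdin, writes one translation per line on stdout in the same order, and flushes its output; check that against your own Moses build. The class name PipedDecoder and the stand-in command are hypothetical and only there so the sketch runs without a model. In a real setup you would pass your own decoder command line instead, and put the tokeniser and detokeniser around translate_batch.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for the decoder so the sketch runs anywhere:
# a child process that "translates" each input line by upper-casing it.
STAND_IN_COMMAND = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]


class PipedDecoder:
    """Keep one long-running decoder process and talk to it over pipes."""

    def __init__(self, command):
        # The process is started once, so any model loading is paid once.
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            encoding="utf-8",
            bufsize=1,
        )

    def translate_batch(self, sentences):
        """Send a batch (1 or 300 sentences) and read back one line each."""

        def feed():
            for sentence in sentences:
                self.proc.stdin.write(sentence + "\n")
            self.proc.stdin.flush()

        # Write from a helper thread so a large batch cannot deadlock
        # on a full pipe while we are waiting to read the results.
        writer = threading.Thread(target=feed)
        writer.start()
        results = [self.proc.stdout.readline().rstrip("\n") for _ in sentences]
        writer.join()
        return results

    def close(self):
        # Closing stdin signals end of input; then wait for the child to exit.
        self.proc.stdin.close()
        self.proc.wait()
```

The same object serves every batch, so the caller sees one interface whatever the batch size. The socket layer that feeds documents in and out of the daemon is left out here.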

Re "then it takes around 10 seconds to translate..." For a 10-token sentence?! That's extremely excessive. The link below shows one of our customer's experiments with his EN-RU Slate Desktop for Windows system (you'll need to register/log in to see it, but registration is free). Slate Desktop has a Moses kernel running natively on Windows (i.e. no CYGWIN). It uses a technique similar to the one I described above. The customer's experiments average 1.5 seconds per sentence for up to 40 tokens.


   http://support.pttools.net/support/discussions/topics/6000042166

These times include pre-/post-processing and MT connector overhead between memoQ and his engine. The SMT model uses the slower (now legacy) binarized phrase/reordering tables and binarized KenLM. The newer compact tables would run faster, but there were complications making them work on Windows. The CAT tool feeds sentence-by-sentence, so Moses is effectively running single-threaded. Linux performance is comparable, maybe a little (5%) faster. You're running 6-7 times slower! I expect your performance bottleneck lies somewhere else.


Tom



On 12/29/2016 4:51 AM, [email protected] wrote:

Date: Thu, 29 Dec 2016 00:53:41 +0530
From: Shubham Khandelwal<[email protected]>
Subject: [Moses-support] Need help for parallelisation in mosesserver
To: moses-support<[email protected]>

Hello,

Since mosesserver accepts only one sentence at a time, I am creating
another component in front of mosesserver to handle tokenisation, casing
and splitting, and to take care of parallelisation.

Here is my procedure; please let me know whether I am heading in the
right direction:
*---*
*Suppose I have 5 different sentences (as a paragraph) to translate
at once (fr-en). I will first start mosesserver on 5 different ports,
then run tokenisation, casing and splitting in parallel and pass the
5 sentences to those ports, and finally concatenate the output after
recasing and detokenising in parallel.*
*--*
Let me know whether this is correct. If not, please suggest a
better solution.

I also have one more question. If a sentence is composed of around
10 words and I pass it for translation as follows:
-> ~/mosesdecoder/bin/mosesserver -f moses.ini  -threads 16  -b 0.000000001

then it takes around 10 seconds to translate. To make it faster, I could
split the single sentence into several groups of words and translate them
on different ports, but I don't think that is a good idea: translating the
pieces separately can give a different meaning than translating the whole
sentence on a single port.
So basically, my question is how to split better in such cases while still
allowing for parallelisation?

-- Yours Sincerely, Shubham Khandelwal

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
