Hi Li

You're absolutely right, mgiza has gotten slower than giza++! I have a build of mgiza from 2 years ago which is about 2x faster than giza++ on 3 cores, but the current version is about 2x slower.

I'm currently rolling back to find the offending commit. I'll get back to you when I find it.

These are the timings:
*CURRENT MGIZA*
1. 25722.74user 904.54system 1:26:41elapsed 511%CPU (0avgtext+0avgdata 1906128maxresident)k
2. 24095.06user 978.64system 1:20:57elapsed 516%CPU (0avgtext+0avgdata 1906176maxresident)k

*GIZA++*
4902.41user 21.95system 43:54.45elapsed 186%CPU (0avgtext+0avgdata 1906144maxresident)k


*OLD MGIZA*
6576.71user 570.62system 24:09.90elapsed 492%CPU (0avgtext+0avgdata 1906144maxresident)k
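
As a rough check on these numbers, the wall-clock ratios can be worked out with a small shell sketch. The helper names `to_seconds` and `ratio` are made up for illustration and are not part of Moses, GIZA++ or MGIZA; the elapsed values are copied from the timings above, and the sketch assumes the h:mm:ss / m:ss.ss layout that GNU time prints.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing "elapsed" fields from GNU time output.
export LC_ALL=C  # keep "." as the decimal separator in awk output

# Convert an elapsed field (h:mm:ss or m:ss.ss) to seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN {
    n = split(t, p, ":")
    s = 0
    for (i = 1; i <= n; i++) s = s * 60 + p[i]
    printf "%.2f\n", s
  }'
}

# Print the first elapsed time divided by the second.
ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", a / b }'
}

ratio 1:26:41 43:54.45    # current MGIZA run 1 vs GIZA++  -> 1.97
ratio 1:20:57 43:54.45    # current MGIZA run 2 vs GIZA++  -> 1.84
ratio 43:54.45 24:09.90   # GIZA++ vs old MGIZA            -> 1.82
```

On these figures the current MGIZA takes roughly 1.8x to 2x the wall-clock time of GIZA++, while the old MGIZA finished about 1.8x faster than GIZA++.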


On 17/01/15 08:41, Li Xiang wrote:
Hi,

GIZA:
${mosesScript}/training/train-model.perl \
  --external-bin-dir "${binDir}" \
  --root-dir "${trainDir}"  \
  --corpus train \
  --f src \
  --e ref \
  --alignment grow-diag-final-and \
  --parallel \
  --first-step 1 \
  --last-step 3
MGIZA:

${mosesScript}/training/train-model.perl \
  --external-bin-dir "${binDir}" \
  --root-dir "${trainDir}"  \
  --corpus train \
  --f src \
  --e ref \
  --alignment grow-diag-final-and \
  --parallel \
  --first-step 1 \
  --last-step 3 \
  --mgiza --mgiza-cpus 3


On 17 January 2015, at 16:39, Hieu Hoang <[email protected] <mailto:[email protected]>> wrote:

OK, can you tell me what commands you ran for giza++ and mgiza?

On 17 January 2015 at 08:29, Li Xiang <[email protected] <mailto:[email protected]>> wrote:

    Hi Hieu,

    I am sending you a 5K training set for evaluating the
    performance. I get a similar result on this data: mgiza is
    slower than giza.


    On 17 January 2015, at 00:34, Hieu Hoang <[email protected]
    <mailto:[email protected]>> wrote:

    can you provide the training corpus so I can verify your results?

    On 16 January 2015 at 15:53, Li Xiang <[email protected]
    <mailto:[email protected]>> wrote:

        Hi all,

        I trained the alignment model on the same data with the same
        parameters using GIZA and MGIZA respectively. The training
        corpus contains 200K sentences. My server has an Intel
        i7-4790K CPU, which has 4 cores with 2 threads per core.
        GIZA takes 2905 seconds, but MGIZA with 3 threads takes 5259
        seconds. I expected MGIZA to be much faster than GIZA, but I
        got the opposite result. I do not know whether the cause is
        the way I compiled it or something else.

        Does anyone have relevant experience? Thanks.

        The following is the training command for MGIZA. The
        training data is the FBIS zh-en data, but I cannot publish
        it because of copyright.


        ${mosesScript}/training/train-model.perl \
         --external-bin-dir "${binDir}" \
         --root-dir "${trainDir}"  \
         --corpus train \
         --f src \
         --e ref \
         --alignment grow-diag-final-and \
         --parallel \
         --first-step 1 \
         --last-step 3 \
         --mgiza --mgiza-cpus 3
        _______________________________________________
        Moses-support mailing list
        [email protected] <mailto:[email protected]>
        http://mailman.mit.edu/mailman/listinfo/moses-support




--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu



