Hi James,
Since train-model.perl fails at step 4 with the MGIZA binaries you
build on your AWS machine, but succeeds when you copy MGIZA binaries
that you built on your local Ubuntu 16.04 machine, do the build logs
show a missing dependency?
My next question: why don't you just use the binaries that work? It
seems like the AWS machine's Ubuntu distro is missing dependencies and
the MGIZA++ build failed. Check those build logs.
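A build log with no errors does not rule out a shared library that is missing at run time. As a rough sketch (the function name missing_libs is my own, and the binary path in the usage comment is an assumption taken from the cp command quoted below), you can filter ldd output for libraries the dynamic linker cannot resolve:

```shell
# Print the names of shared libraries that the dynamic linker cannot resolve.
# Reads `ldd` output on stdin, so it can be tried without a real binary.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on the AWS machine (path is an assumption based on this thread):
#   ldd ../../mosesdecoder/bin/mgiza | missing_libs
```

Empty output means every library was found. Run it on both machines and compare.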
If you want to troubleshoot deeper, you need to backtrack from step 4.
The train-model.perl step 4 uses the output from step 3, i.e. the word
alignment file. Check if that word alignment file is corrupted.
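A quick sanity check of that file could look like the sketch below. It assumes the alignment file holds one line per sentence pair, each a space-separated list of "i-j" index pairs. The function names and the file names in the usage comment are illustrative; the real file name depends on your --alignment setting.

```shell
# Count lines that are not a space-separated list of "i-j" index pairs.
# Empty lines are accepted, since a sentence pair may have no alignment points.
count_bad_alignment_lines() {
    awk '!/^([0-9]+-[0-9]+)?( [0-9]+-[0-9]+)* *$/ { bad++ } END { print bad + 0 }' "$1"
}

# Succeed if two files have the same number of lines.
same_line_count() {
    [ "$(awk 'END { print NR }' "$1")" = "$(awk 'END { print NR }' "$2")" ]
}

# Usage (file names are assumptions):
#   count_bad_alignment_lines model/aligned.grow-diag-final-and
#   same_line_count model/aligned.grow-diag-final-and data.en
```

A non-zero count, or a line count that differs from the training corpus, would be consistent with the 'Argument "anna" isn't numeric' message in your log, where a word appears in place of an index pair.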
Then check step 3. Its inputs are the GIZA alignment files output in
step 2. This step uses the symal binary executable. Make sure you're
using the Moses version of symal, not the one in the MGIZA library.
http://article.gmane.org/gmane.comp.nlp.moses.user/11544
http://moses-support.mit.narkive.com/KpKC2TQn/which-symal
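One way to tell which symal you have is to compare the two files byte for byte. This sketch assumes the MGIZA build left its own symal in mgiza/mgizapp/bin and that "cp bin/*" may have copied it over the one in mosesdecoder/bin. The function name same_binary is my own.

```shell
# Succeed if two files are byte-for-byte identical.
same_binary() {
    cmp -s "$1" "$2"
}

# Usage (paths are assumptions based on the commands quoted in this thread):
#   if same_binary mgiza/mgizapp/bin/symal mosesdecoder/bin/symal; then
#       echo "mosesdecoder/bin/symal is the MGIZA copy"
#   fi
```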
Backtracking to step 2, check the logs for the messages below. Each
one should cause the mgiza executable to terminate, but it doesn't. The
parallel forks in train-model.perl mask the failure, processing
continues, and you get ambiguous failures downstream.
  ERROR: A SOURCE or TARGET sentence has a zero-length sentence.
  ERROR! DUPLICATED ENTRY
  WARNING: The following sentence pair has source/target sentence length ration more than
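To search a captured training log for these messages, something like the following should do. It is a sketch: the function name and the log file name in the usage comment are hypothetical, and it assumes you redirected the train-model.perl output to a file.

```shell
# Print every log line (with its line number) that contains one of the
# messages quoted above. "ration" is matched exactly as it appears there.
find_mgiza_errors() {
    grep -n -E 'zero-length sentence|DUPLICATED ENTRY|length ration more than' "$1"
}

# Usage (training.log is a hypothetical file name):
#   find_mgiza_errors training.log || echo "none of the messages found"
```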
There are rarely errors in Step 1, but if you are experiencing a compile
error on AWS, those MGIZA binaries in step 1 could be the cause.
Also, the C++ binary executables are not the only things that change
when you use the alternate build. If you also copied the
merge_alignment.py, this could be a problem in train-model.perl step 2.
Make sure the AWS build has this in the right place and that it runs on
the AWS distro's Python interpreter.
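A minimal check that the script's interpreter exists on the AWS machine could look like this. The function name shebang_resolves is my own, and the path in the usage comment is an assumption taken from the cp command quoted below.

```shell
# Succeed if the interpreter named on the script's "#!" line can be found.
# Handles both "#!/path/to/interp" and "#!/usr/bin/env interp".
shebang_resolves() {
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#!'*) ;;
        *) return 1 ;;   # no shebang line at all
    esac
    # Strip "#!" and split the remainder into words.
    set -- ${first_line#'#!'}
    if [ "${1##*/}" = "env" ]; then
        command -v "${2:-}" >/dev/null 2>&1
    else
        [ -x "${1:-}" ]
    fi
}

# Usage (path is an assumption based on this thread):
#   shebang_resolves ../../mosesdecoder/bin/merge_alignment.py \
#       || echo "interpreter for merge_alignment.py not found"
```

This only confirms that the interpreter exists. It does not confirm that the script is compatible with that interpreter's Python version.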
Tom
On 7/31/2018 11:01 PM, [email protected] wrote:
Date: Tue, 31 Jul 2018 11:41:01 +0100
From: James Baker <[email protected]>
Subject: [Moses-support] Issues running MGiza on AWS machine
To: [email protected]
Hi,
I'm having some peculiar issues with MGiza++. Using MGiza and Moses, I've
successfully built some translation models on my Ubuntu 16.04 desktop
machine. I'd now like to do the same thing, but on a machine hosted in AWS.
I'm using the same operating system, and as far as I can tell all my
versions are identical. The build of MGiza++ runs fine, reports no errors,
and produces output the same as on my desktop machine. However, when I try
to build the models, I get a whole load of errors and the resultant models
are empty (64 bytes for the reordering model, 0 bytes for the translation
model - the language model builds fine).
The first "errors" I can see in the log seem to occur on stage 4 of the
Moses training script (train-model.perl):
(4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58 UTC 2018
(/opt/model-builder/training/data.ru,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
!Argument "anna" isn't numeric in numeric ge (>=) at
/opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
line 112, <A> line 1.
Use of uninitialized value $ei in numeric ge (>=) at
/opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
line 112, <A> line 1.
Use of uninitialized value $ei in hash element at
/opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
line 118, <A> line 1.
Use of uninitialized value $ei in array element at
/opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
line 121, <A> line 1.
Use of uninitialized value $ei in array element at
/opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
line 123, <A> line 1.
...
There are a large number of errors of that nature, and following those
errors there are additional errors but I suspect these are caused by the
fact that this stage is failing.
It's possible that there are earlier problems, but I'm not really sure what
to be looking for in the logs (for instance - there are some lines warning
about alignments in Model2 being 0 - is that an issue?).
If I replace the MGiza binaries built on the AWS machine with the binaries
built on my desktop, it runs fine - so I know it's an issue with MGiza and
presumably something to do with my build. The commands I'm running to build
and install are as follows:
git clone https://github.com/moses-smt/mgiza.git
cd mgiza/mgizapp
cmake .
make
make install
cp bin/* ../../mosesdecoder/bin
cp scripts/merge_alignment.py ../../mosesdecoder/bin
As I mentioned previously, these commands work fine on my desktop machine
which should be a very similar (if not identical) set up.
Does anyone have any ideas as to what might be causing the problem (or,
more importantly, what I can do to fix it)?
Thanks in advance,
James
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support