Thanks Tom - using the Moses version of symal rather than the MGiza version fixed it (although I'm still not sure why it should be different to the one I built on my desktop). I hadn't realised the two versions were different, as the instructions on the Moses website say you should copy all binaries from MGiza into the Moses directory: http://www.statmt.org/moses/?n=Moses.ExternalTools#ntoc3
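In case it helps anyone else: a minimal sketch, assuming Python 3 is available, for checking whether the symal in the Moses bin directory is byte-for-byte the same file as the one MGiza built (i.e. whether the `cp bin/*` step overwrote it). The function names and the paths in the usage note are only illustrative. Identical digests mean the same file was copied; different digests only show that the files differ, not why.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files have identical contents (compared by digest)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, `same_file_contents("mgiza/mgizapp/bin/symal", "mosesdecoder/bin/symal")` (hypothetical paths, adjust to your layout) returning True would suggest the Moses copy was replaced by the MGiza one.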
Thanks for your help.

James

On Wed, 1 Aug 2018 at 02:14, Tom Hoar <[email protected]> wrote:

> Hi James,
>
> Since train-model.perl fails at step 4 with the MGIZA binaries you
> built on your AWS machine, but succeeds when you copy MGIZA binaries that
> you built on your local Ubuntu 16.04 machine, do the build logs show a
> missing dependency?
>
> My next question: why don't you just use the binaries that work? It seems
> like the AWS machine's Ubuntu distro is missing dependencies and the
> MGIZA++ build failed. Check those build logs.
>
> If you want to troubleshoot deeper, you need to backtrack from step 4. The
> train-model.perl step 4 uses the output from step 3, i.e. the word
> alignment file. Check if that word alignment file is corrupted.
>
> Then check step 3; its inputs are the GIZA alignment files output in step
> 2. This step uses the symal binary executable. Make sure you're using the
> Moses version of symal, not the one in the MGIZA library.
> http://article.gmane.org/gmane.comp.nlp.moses.user/11544
> http://moses-support.mit.narkive.com/KpKC2TQn/which-symal
>
> Backtracking to step 2, log lines with the following text messages should
> cause the mgiza executable to terminate, but they don't. The parallel forks
> in train-model.perl mask the failure, processing continues and you
> experience ambiguous failures downstream.
>
> ERROR: A SOURCE or TARGET sentence has a zero-length sentence.
> ERROR! DUPLICATED ENTRY
> WARNING: The following sentence pair has source/target sentence length
> ration more than
>
> There are rarely errors in Step 1, but if you are experiencing a compile
> error on AWS, those MGIZA binaries in step 1 could be the cause.
>
> Also, the C++ binary executables are not the only things that change when
> you use the alternate build. If you also copied the merge_alignment.py,
> this could be a problem in train-model.perl step 2.
> Make sure the AWS build has this in the right place and that it runs on
> the AWS distro's Python interpreter.
>
> Tom
>
>
>
> On 7/31/2018 11:01 PM, [email protected] wrote:
>
> Date: Tue, 31 Jul 2018 11:41:01 +0100
> From: James Baker <[email protected]>
> Subject: [Moses-support] Issues running MGiza on AWS machine
> To: [email protected]
> Message-ID:
> <CAOa=L2woDwDETeeq7RAs0LxmnbAA_Q7qH=kf8apah7ivbk0...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> I'm having some peculiar issues with MGiza++. Using MGiza and Moses, I've
> successfully built some translation models on my Ubuntu 16.04 desktop
> machine. I'd now like to do the same thing, but on a machine hosted in AWS.
>
> I'm using the same operating system, and as far as I can tell all my
> versions are identical. The build of MGiza++ runs fine, reports no errors,
> and produces output the same as on my desktop machine. However, when I try
> to build the models, I get a whole load of errors and the resultant models
> are empty (64 bytes for the reordering model, 0 bytes for the translation
> model - the language model builds fine).
>
> The first "errors" I can see in the log seem to occur on stage 4 of the
> Moses training script (train-model.perl):
>
> (4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58 UTC 2018
> (/opt/model-builder/training/data.ru
> ,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
> !Argument "anna" isn't numeric in numeric ge (>=) at
> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> line 112, <A> line 1.
> Use of uninitialized value $ei in numeric ge (>=) at
> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> line 112, <A> line 1.
> Use of uninitialized value $ei in hash element at
> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> line 118, <A> line 1.
> Use of uninitialized value $ei in array element at
> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> line 121, <A> line 1.
> Use of uninitialized value $ei in array element at
> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> line 123, <A> line 1.
> ...
>
> There are a large number of errors of that nature, and following those
> errors there are additional errors but I suspect these are caused by the
> fact that this stage is failing.
>
> It's possible that there are earlier problems, but I'm not really sure what
> to be looking for in the logs (for instance - there are some lines warning
> about alignments in Model2 being 0 - is that an issue?).
>
> If I replace the MGiza binaries built on the AWS machine with the binaries
> built on my desktop, it runs fine - so I know it's an issue with MGiza and
> presumably something to do with my build. The commands I'm running to build
> and install are as follows
>
> git clone https://github.com/moses-smt/mgiza.git
> cd mgiza/mgizapp
> cmake .
> make
> make install
> cp bin/* ../../mosesdecoder/bin
> cp scripts/merge_alignment.py ../../mosesdecoder/bin
>
> As I mentioned previously, these commands work fine on my desktop machine
> which should be a very similar (if not identical) set up.
>
> Does anyone have any ideas as to what might be causing the problem (or,
> more importantly, what I can do to fix it)?
>
> Thanks in advance,
> James
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
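For anyone else hitting the same step 4 errors, Tom's suggestion to check whether the word alignment file from step 3 is corrupted can be sketched roughly as below. This is only a sketch, under the assumption that each line of the symmetrised alignment file holds whitespace-separated `i-j` index pairs (an empty line meaning no alignment points); a token such as "anna" would then be exactly the kind of non-numeric content the Perl warnings complain about. The function name and the path in the usage note are illustrative, not part of Moses.

```python
import re

# Assumed format: each token is "<source index>-<target index>", e.g. "0-0".
_PAIR = re.compile(r"\d+-\d+")


def find_bad_alignment_lines(lines, limit=10):
    """Return up to `limit` (line number, text) tuples for malformed lines.

    `lines` is any iterable of strings, such as an open file handle.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        tokens = line.split()
        # A line is malformed if any token is not an "i-j" pair of integers.
        if any(_PAIR.fullmatch(token) is None for token in tokens):
            bad.append((number, line.rstrip("\n")))
            if len(bad) >= limit:
                break
    return bad
```

It could be run against the alignment file in the training model directory, for example `find_bad_alignment_lines(open("model/aligned.grow-diag-final", encoding="utf-8", errors="replace"))` (the file name depends on the alignment heuristic used, so treat that path as a guess). An empty result would suggest the problem lies elsewhere; any hits point back at step 3 and the symal binary.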
