I'm out of ideas. I've used AWS and Azure many times, so it should work.
Hieu Hoang
Sent while bumping into things

On Wed, 1 Aug 2018, 20:01 James Baker, <[email protected]> wrote:

> Alas, nothing erroneous that I can see in the logs (using
> ./train-model.perl > output.log 2>&1), and neither the memory usage nor the
> used disk space went over 10% during the training.
>
> James
>
> On Wed, 1 Aug 2018 at 08:56, Hieu Hoang <[email protected]> wrote:
>
>> redirect stdout and stderr into a file and grep for 'error'
>>
>> that usually turns up something
>>
>> Hieu Hoang
>> http://statmt.org/hieu
>>
>> On 1 August 2018 at 17:38, James Baker <[email protected]> wrote:
>>
>>> Thanks Hieu,
>>>
>>> I'll give that a go this morning and keep an eye on the disk space and
>>> RAM, although I would be surprised if that was the problem (I've got <3GB
>>> of training data, 64GB of RAM, and 100GB of disk space). It also wouldn't
>>> explain why binaries built on a different machine work, but binaries built
>>> on the same machine don't.
>>>
>>> Any other ideas for things I should be checking?
>>>
>>> Cheers,
>>> James
>>>
>>> On Wed, 1 Aug 2018 at 03:03, Hieu Hoang <[email protected]> wrote:
>>>
>>>> it's difficult to tell but I would say the mgiza executables aren't the
>>>> problem. It's probably to do with running out of disk space or memory.
>>>>
>>>> the snt2cooc executable in mgiza uses a lot of memory so may have been
>>>> killed by the OS. The phrase table creation requires a lot of disk space to
>>>> sort intermediate files.
>>>>
>>>> I would monitor those 2 things
>>>>
>>>> Hieu Hoang
>>>> http://statmt.org/hieu
>>>>
>>>> On 31 July 2018 at 20:41, James Baker <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm having some peculiar issues with MGiza++. Using MGiza and Moses,
>>>>> I've successfully built some translation models on my Ubuntu 16.04 desktop
>>>>> machine. I'd now like to do the same thing, but on a machine hosted in
>>>>> AWS.
>>>>>
>>>>> I'm using the same operating system, and as far as I can tell all my
>>>>> versions are identical. The build of MGiza++ runs fine, reports no errors,
>>>>> and produces output the same as on my desktop machine. However, when I try
>>>>> to build the models, I get a whole load of errors and the resultant models
>>>>> are empty (64 bytes for the reordering model, 0 bytes for the translation
>>>>> model - the language model builds fine).
>>>>>
>>>>> The first "errors" I can see in the log seem to occur on stage 4 of
>>>>> the Moses training script (train-model.perl):
>>>>>
>>>>> (4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58
>>>>> UTC 2018
>>>>> (/opt/model-builder/training/data.ru
>>>>> ,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
>>>>> !Argument "anna" isn't numeric in numeric ge (>=) at
>>>>> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
>>>>> line 112, <A> line 1.
>>>>> Use of uninitialized value $ei in numeric ge (>=) at
>>>>> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
>>>>> line 112, <A> line 1.
>>>>> Use of uninitialized value $ei in hash element at
>>>>> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
>>>>> line 118, <A> line 1.
>>>>> Use of uninitialized value $ei in array element at
>>>>> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
>>>>> line 121, <A> line 1.
>>>>> Use of uninitialized value $ei in array element at
>>>>> /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm
>>>>> line 123, <A> line 1.
>>>>> ...
>>>>>
>>>>> There are a large number of errors of that nature, and following those
>>>>> errors there are additional errors but I suspect these are caused by the
>>>>> fact that this stage is failing.
>>>>>
>>>>> It's possible that there are earlier problems, but I'm not really sure
>>>>> what to be looking for in the logs (for instance - there are some lines
>>>>> warning about alignments in Model2 being 0 - is that an issue?).
>>>>>
>>>>> If I replace the MGiza binaries built on the AWS machine with the
>>>>> binaries built on my desktop, it runs fine - so I know it's an issue with
>>>>> MGiza and presumably something to do with my build. The commands I'm
>>>>> running to build and install are as follows
>>>>>
>>>>> git clone https://github.com/moses-smt/mgiza.git
>>>>> cd mgiza/mgizapp
>>>>> cmake .
>>>>> make
>>>>> make install
>>>>> cp bin/* ../../mosesdecoder/bin
>>>>> cp scripts/merge_alignment.py ../../mosesdecoder/bin
>>>>>
>>>>> As I mentioned previously, these commands work fine on my desktop
>>>>> machine which should be a very similar (if not identical) set up.
>>>>>
>>>>> Does anyone have any ideas as to what might be causing the problem
>>>>> (or, more importantly, what I can do to fix it)?
>>>>>
>>>>> Thanks in advance,
>>>>> James
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
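
For anyone following the thread, the log check suggested above (redirect stdout and stderr into a file, then grep for 'error') could be sketched roughly as below. This is only an illustration: the helper name scan_log and the list of patterns are assumptions, not part of Moses or MGiza, and the patterns are simply chosen to also catch Perl warnings like the ones quoted in the original message.

```shell
#!/bin/sh
# Hypothetical helper (not part of Moses or MGiza): print the first few
# suspicious lines of a training log, with their line numbers.
# The pattern list is an assumption; adjust it to whatever your log contains.
scan_log() {
    log_file=$1
    max_lines=${2:-20}   # show at most this many matches (default 20)
    grep -n -i -E "error|killed|isn't numeric|uninitialized value" "$log_file" \
        | head -n "$max_lines"
}

# Example usage, with the log file name used earlier in the thread:
#   ./train-model.perl > output.log 2>&1
#   scan_log output.log
```

The earliest match is usually the most useful one, since later warnings are often knock-on effects of the first failing stage, which is why the sketch prints line numbers and truncates the output.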
