Re: [Moses-support] Issues running MGiza on AWS machine

Hieu Hoang Wed, 01 Aug 2018 05:55:48 -0700

I'm out of ideas.

I've used AWS and Azure many times, so it should work.


Hieu Hoang
Sent while bumping into things

On Wed, 1 Aug 2018, 20:01 James Baker, <[email protected]> wrote:

> Alas, nothing erroneous that I can see in the logs (using
> ./train-model.perl > output.log 2>&1), and neither the memory usage nor the
> used disk space went over 10% during the training.
>
> James
>
> On Wed, 1 Aug 2018 at 08:56, Hieu Hoang <[email protected]> wrote:
>
>> Redirect stdout and stderr into a file and grep for 'error'.
>>
>> That usually turns up something.
>>
>> Hieu Hoang
>> http://statmt.org/hieu
>>
>> On 1 August 2018 at 17:38, James Baker <[email protected]> wrote:
>>
>>> Thanks Hieu,
>>>
>>> I'll give that a go this morning and keep an eye on the disk space and
>>> RAM, although I would be surprised if that was the problem (I've got <3GB
>>> of training data, 64GB of RAM, and 100GB of disk space). It also wouldn't
>>> explain why binaries built on a different machine work, but binaries built
>>> on the same machine don't.
>>>
>>> Any other ideas for things I should be checking?
>>>
>>> Cheers,
>>> James
>>>
>>> On Wed, 1 Aug 2018 at 03:03, Hieu Hoang <[email protected]> wrote:
>>>
>>>> It's difficult to tell, but I would say the mgiza executables aren't the
>>>> problem. It's probably to do with running out of disk space or memory.
>>>>
>>>> The snt2cooc executable in mgiza uses a lot of memory, so it may have been
>>>> killed by the OS. The phrase table creation requires a lot of disk space to
>>>> sort intermediate files.
>>>>
>>>> I would monitor those 2 things.
>>>>
>>>> Hieu Hoang
>>>> http://statmt.org/hieu
>>>>
>>>> On 31 July 2018 at 20:41, James Baker <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm having some peculiar issues with MGiza++. Using MGiza and Moses,
>>>>> I've successfully built some translation models on my Ubuntu 16.04 desktop
>>>>> machine. I'd now like to do the same thing, but on a machine hosted in AWS.
>>>>>
>>>>> I'm using the same operating system, and as far as I can tell all my
>>>>> versions are identical. The build of MGiza++ runs fine, reports no errors,
>>>>> and produces output the same as on my desktop machine. However, when I try
>>>>> to build the models, I get a whole load of errors and the resultant models
>>>>> are empty (64 bytes for the reordering model, 0 bytes for the translation
>>>>> model - the language model builds fine).
>>>>>
>>>>> The first "errors" I can see in the log seem to occur on stage 4 of
>>>>> the Moses training script (train-model.perl):
>>>>>
>>>>>    (4) generate lexical translation table 0-0 @ Tue Jul 31 10:22:58 UTC 2018
>>>>>    (/opt/model-builder/training/data.ru,/opt/model-builder/training/data.en,/opt/model-builder/training/model/lex)
>>>>>    !Argument "anna" isn't numeric in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112, <A> line 1.
>>>>>    Use of uninitialized value $ei in numeric ge (>=) at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 112, <A> line 1.
>>>>>    Use of uninitialized value $ei in hash element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 118, <A> line 1.
>>>>>    Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 121, <A> line 1.
>>>>>    Use of uninitialized value $ei in array element at /opt/model-builder/mosesdecoder/scripts/training/LexicalTranslationModel.pm line 123, <A> line 1.
>>>>>    ...
>>>>>
>>>>> There are a large number of errors of that nature. Further errors follow
>>>>> them, but I suspect those are caused by the fact that this stage is
>>>>> failing.
>>>>>
>>>>> It's possible that there are earlier problems, but I'm not really sure
>>>>> what to be looking for in the logs (for instance - there are some lines
>>>>> warning about alignments in Model2 being 0 - is that an issue?).
>>>>>
>>>>> If I replace the MGiza binaries built on the AWS machine with the
>>>>> binaries built on my desktop, it runs fine - so I know it's an issue with
>>>>> MGiza and presumably something to do with my build. The commands I'm
>>>>> running to build and install are as follows:
>>>>>
>>>>>    git clone https://github.com/moses-smt/mgiza.git
>>>>>    cd mgiza/mgizapp
>>>>>    cmake .
>>>>>    make
>>>>>    make install
>>>>>    cp bin/* ../../mosesdecoder/bin
>>>>>    cp scripts/merge_alignment.py ../../mosesdecoder/bin
>>>>>
>>>>> As I mentioned previously, these commands work fine on my desktop
>>>>> machine, which should be a very similar (if not identical) setup.
>>>>>
>>>>> Does anyone have any ideas as to what might be causing the problem
>>>>> (or, more importantly, what I can do to fix it)?
>>>>>
>>>>> Thanks in advance,
>>>>> James
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>>>>
>>>>
>>
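
A note on the grep suggested earlier in the thread: the messages quoted from stage 4 are Perl warnings ("isn't numeric", "Use of uninitialized value"), and none of them contains the word "error", so a plain grep for 'error' would pass over them. The sketch below widens the search. The function name scan_log and the pattern list are illustrative assumptions, not part of Moses or MGiza.

```shell
#!/bin/sh
# scan_log FILE: print (with line numbers) every line of a training log that
# mentions a common failure word or one of the Perl warnings seen in this thread.
# The match is case-insensitive, so "ERROR", "Error" and "error" are all caught.
# NOTE: the pattern list is an assumption; extend it to suit your own logs.
scan_log() {
    grep -n -i -E 'error|killed|uninitialized value|isn.t numeric' "$1"
}

# Typical use, with the redirection James describes:
#   ./train-model.perl > output.log 2>&1
#   scan_log output.log | head -n 20
```

If a process was killed by the OS for using too much memory, the kernel log usually records it; something like `dmesg | grep -i -E 'out of memory|killed process'` may show that, although reading the kernel log can require root on some systems.
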

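The first warning in the quoted log is raised while reading line 1 of the handle <A>, and the offending token is a word ("anna") where a number was expected. Assuming <A> is the word-alignment file that stage 4 reads, and that this file should contain only tokens of the form 0-0 1-2, the warning suggests the alignment file holds sentence text rather than alignment points. A minimal check under those assumptions, with check_alignment as a hypothetical helper name:

```shell
#!/bin/sh
# check_alignment FILE: succeed only if every whitespace-separated token in FILE
# looks like "SRC-TGT", both sides being non-negative integers (e.g. "0-0 1-2").
# Reports the first bad token on each offending line and stops after 5 such lines.
# ASSUMPTION: this is the format the lexical-table stage expects from its
# alignment input; empty lines are accepted.
check_alignment() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i !~ /^[0-9]+-[0-9]+$/) {
                    printf "line %d: bad token \"%s\"\n", NR, $i
                    bad++
                    break
                }
            }
        }
        bad >= 5 { exit }
        END { exit (bad > 0) }
    ' "$1"
}
```

Run it against whichever alignment file the training run wrote into its model directory (the exact filename depends on the training options and is not given in the thread). A non-zero exit status means the problem starts before stage 4, in the alignment step.
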

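Since binaries built on the desktop work and binaries built on the AWS machine do not, it may help to record the build environment on both hosts and diff the results. The sketch below is one way to do that; describe_build is a hypothetical helper, and the choice of tools to report (gcc, g++, cmake) is an assumption about what the MGiza build uses on these machines.

```shell
#!/bin/sh
# describe_build DIR: print facts that commonly differ between two build hosts:
# CPU architecture, compiler and cmake versions, and the shared libraries that
# each executable in DIR resolves at run time.
describe_build() {
    echo "arch: $(uname -m)"
    for tool in gcc g++ cmake; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $("$tool" --version | head -n 1)"
        else
            echo "$tool: not found"
        fi
    done
    for bin in "$1"/*; do
        # Skip anything that is not a regular executable file.
        [ -f "$bin" ] && [ -x "$bin" ] || continue
        echo "== $(basename "$bin")"
        # Strip load addresses, which change from run to run and would
        # make the diff noisy.
        ldd "$bin" 2>/dev/null | sed 's/ (0x[0-9a-f]*)$//'
    done
}
```

For example, run `describe_build mgiza/mgizapp/bin > build-aws.txt` on the AWS machine and the equivalent on the desktop, then compare the two files with `diff`. Any difference in compiler version or linked libraries is a candidate explanation, not a confirmed cause.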