Hi, all.
My experience (quite old, but anyway) was that e.g. more than two
generation steps were already too many. The issue is not the number of
sentences in your test corpus, so splitting the test corpus will
probably not help. The problem is simply too many translation options
and too many (and large) probability tables.
Binarizing the tables should ease the memory problems, but too many
translation options inevitably lead to 1) long processing time, as
all intermediate steps have to be considered, and 2) more search errors, as
the various stacks will not be able to hold varied-enough partial
hypotheses. (My impression is that the more models we have, the flatter
the final space of hypotheses is, but I'd like experienced statisticians
to correct me on this; Miles?)
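If you do want to push more sentences through anyway, the usual pruning
knobs may help. A rough sketch, assuming your moses build supports the
standard -ttable-limit and -s (stack size) switches (check moses -help),
with illustrative file names:

./moses -f moses.ini -ttable-limit 10 -s 50 \
  < test.in > test.out
# -ttable-limit 10: keep at most 10 translation options per source phrase
# -s 50: histogram-prune each decoding stack to 50 partial hypotheses

Tighter limits of course mean even more pruning, so treat these numbers
as starting points only.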
As a hint for Joerg, here is how to run MERT training on Sun Grid Engine
(other clusters are not supported by our scripts). It's very advisable
to have all paths in your moses.ini absolute (see
training/absolutize_moses_model.pl; a rough sketch of that step follows
after the command), and expect many random shell bugs before you get
this actually running:
$SCRIPTS_ROOTDIR/training/mert-moses.pl \
  --working-dir=mert-tuning \
  `pwd`/tuning.in \
  `pwd`/tuning.ref. \
  `pwd`/moses \
  `pwd`/moses.abs.ini \
  --jobs=10 --queue-flags=' -p -200 -cwd -S /bin/bash ' \
  --decoder-flags="-dl 6 " \
  || (echo 'Exiting'; exit 1) || exit 1
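The absolutizing step mentioned above looks roughly like this (a sketch
only; check the script itself for the exact usage):

$SCRIPTS_ROOTDIR/training/absolutize_moses_model.pl `pwd`/moses.ini \
  > moses.abs.ini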
And here is how to run moses in parallel, without MERT:
$SCRIPTS_ROOTDIR/generic/moses-parallel.pl --jobs=10 \
  --queue-parameters=' -p -200 -cwd -S /bin/bash ' \
  -decoder ./moses \
  -input-file ./evaluation.in \
  -config ./filtered-for-eval-opt/moses.ini \
  > evaluation.opt.out \
  || (echo 'Exiting'; exit 1) || exit 1
As said above, don't expect this to solve the memory troubles of
complex scenarios. I'd suggest manually searching for a single sentence
short enough to get through...
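For example, something along these lines (file names reused from above,
and the length cutoff is arbitrary):

# take the first test sentence of at most 6 tokens and try just that one
awk 'NF <= 6' evaluation.in | head -n 1 > one.sentence.in
./moses -f ./filtered-for-eval-opt/moses.ini < one.sentence.in

If even that runs out of memory, the model is simply too big for the
machine.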
Ondrej.
J.Tiedemann wrote:
> Do you do this for tuning weights? I saw something about running mert
> in parallel on a cluster. Is that what you're going to do?
> How exactly does this work? Can I easily use this option on a Linux
> cluster?
>
> Otherwise, would it be any good to try tuning sequentially with
> several small development sets? Has anyone tried that?
>
> jörg
>
>
> On Mon, 25 Feb 2008 09:52:30 +0100
> Alexandre Allauzen <[EMAIL PROTECTED]> wrote:
>> Hello, we have the same problem, and we will try to split the dev set
>> into subparts to run moses in parallel.
>> Alexandre.
>> J.Tiedemann wrote:
>>> Hello Moses users and developers,
>>>
>>>
>>> I'm facing problems with memory requirements and decoding speed when
>>> running a factored model on Europarl data. I trained a model with
>>> lemma and POS factors with about 1 million sentence pairs but
>>> running
>>> moses always fails after some sentences because of memory allocation
>>> errors (terminate called after throwing an instance of
>>> 'std::bad_alloc')
>>>
>>> I use 3 translation factors and 2 generation factors together with
>>> lexicalized reordering models. I already tried to reduce memory
>>> usage
>>> by compiling phrase and reordering tables to binary formats and by
>>> switching to IRSTLM with binary LMs. I also added
>>> '[use-persistent-cache] 0' to my config file but still moses
>>> allocates
>>> between 2 and 4GB of internal memory and after about 20 test
>>> sentences
>>> the process crashes. This also means that I cannot run mert on any
>>> tuning data. Anyway, the decoding also becomes so slow that tuning
>>> would probably not be feasible for my data (one sentence takes
>>> between
>>> 200 and 2000 seconds to translate).
>>>
>>> I'm just wondering what other moses users experienced with factored
>>> models and what I should expect when training on rather large data.
>>> Is
>>> there any other trick I could try to get at least a result back for
>>> my
>>> test set? Do I just need more memory? By the way, filtering the
>>> phrase
>>> tables according to input data didn't work for me either (still too
>>> big to fit into memory). What are the limits and what are the system
>>> requirements?
>>>
>>> I also wonder if the cache can be controlled somehow to get a
>>> reasonable decoding speed without running out of memory so quickly.
>>> With caching switched on I cannot even run more than a couple of
>>> sentences.
>>>
>>> Using the latest release improved the situation a little bit but I
>>> still run out of memory. Any help would be greatly appreciated. I'm
>>> just curious to see the results with a factorized model compared to
>>> the baseline approach with plain text only.
>>>
>>> cheers,
>>>
>>> Jörg
>>>
>>
>> --
>> Alexandre Allauzen
>> Univ. Paris XI, LIMSI-CNRS
>> Tel : 01.69.85.80.64 (80.88)
>> Bur : 114 LIMSI Bat. 508
>> [EMAIL PROTECTED]
>>
>
>
--
Ondrej Bojar (mailto:[EMAIL PROTECTED] / [EMAIL PROTECTED])
http://www.cuni.cz/~obo
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support