Re: [Moses-support] Deploying large models

liling tan Mon, 11 Dec 2017 03:56:35 -0800

Thank you, Hieu, for the Moses2 tips!

BTW, is Moses2 backwards compatible with models trained with the old Moses?


Regards,
Liling

On Mon, Dec 11, 2017 at 7:39 PM, Hieu Hoang <[email protected]> wrote:

> If you want fast decoding with more than 16 threads, use Moses2.
>    http://www.statmt.org/moses/?n=Site.Moses2
>
> Hieu Hoang
> http://moses-smt.org/
>
>
> On 11 December 2017 at 09:20, liling tan <[email protected]> wrote:
>
>> Dear Moses community/developers,
>>
>> I have a question on how to handle large models created using Moses.
>>
>> I've a vanilla phrase-based model with
>>
>>    - PhraseDictionary num-features=4 input-factor=0 output-factor=0
>>    - LexicalReordering num-features=6 input-factor=0 output-factor=0
>>    - KENLM order=5 factor=0
>>
>> The size of the model is:
>>
>>    - compressed phrase table is 5.4GB,
>>    - compressed reordering table is 1.9GB and
>>    - quantized LM is 600MB
>>
>>
>> I'm running on a single 56-core machine with 256GB RAM. Whenever I'm
>> decoding, I use the -threads 56 parameter.
>>
>> It takes a really long time to load the tables, and after loading, decoding
>> breaks inconsistently at different lines. I notice that the RAM goes
>> into swap before it breaks.
>>
>> I've tried the compact phrase table and got a
>>
>>    - 3.2GB .minphr
>>    - 1.5GB .minlexr
>>
>> And the same kind of random breakage happens when RAM goes into swap
>> after loading the phrase-table.
>>
>> Strangely, it still manages to decode ~500K sentences before it breaks.
>>
>> Then I tried the on-disk phrase table, which is around 37GB
>> uncompressed. Using the on-disk PT didn't cause breakage, but the decoding
>> time increased significantly: it can now only decode 15K sentences in an
>> hour.
>>
>> The setup is a little different from the usual one with a
>> train/dev/test split. Currently, my task is to decode the training set.
>> I've tried filtering the table against the training set with
>> filter-model-given-input.pl, but the size of the compressed table didn't
>> decrease much.
>>
>> The entire training set is made up of 5M sentence pairs and it's taking
>> 3+ days just to decode ~1.5M sentences with ondisk PT.
>>
>>
>> My questions are:
>>
>>  - Are there best practices with regard to deploying large Moses models?
>>  - Why does the 5+GB phrase table take up > 250GB RAM when decoding?
>>  - How else should I filter/compress the phrase table?
>>  - Is it normal to decode only ~500K sentences a day given the machine
>> specs and the model size?
>>
>> I understand that I could split the training set in two, train 2
>> models and then cross-decode, but if the training size is 10M sentence
>> pairs, we'll face the same issues.
>>
>> Thank you for reading the long post and thank you in advance for any
>> answers, discussions and enlightenment on this issue =)
>>
>> Regards,
>> Liling
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
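
The throughput figures quoted above (15K sentences per hour with the on-disk
phrase table, 5M sentence pairs in total) can be turned into a rough
wall-clock estimate. The sketch below is only back-of-the-envelope
arithmetic: it assumes a constant decoding rate and ignores model loading
time, and the function and constant names are hypothetical, not part of Moses.

```python
# Rough decoding-time estimate from the figures quoted in this thread.
# Assumption: the decoding rate stays constant and loading time is ignored.
TOTAL_SENTENCES = 5_000_000   # size of the training set to decode
ONDISK_PER_HOUR = 15_000      # reported throughput with the on-disk PT


def days_to_decode(total_sentences: int, sentences_per_hour: float) -> float:
    """Return the wall-clock days needed at a constant decoding rate."""
    return total_sentences / (sentences_per_hour * 24)


if __name__ == "__main__":
    per_day = ONDISK_PER_HOUR * 24
    print(f"{per_day:,} sentences per day")
    days = days_to_decode(TOTAL_SENTENCES, ONDISK_PER_HOUR)
    print(f"{days:.1f} days for the full set")
```

At the reported hourly rate this works out to roughly two weeks for the full
5M sentences, which is why the on-disk table alone is not a practical answer
here.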
