Do you have comparison figures for Moses vs Moses2? I never managed to get
reliable info for more than 32 cores.

config/moses.ini files would be good too

Hieu Hoang
http://moses-smt.org/


On 13 December 2017 at 06:10, liling tan <[email protected]> wrote:

> Ah, that's why the phrase-table is exploding... I've never decoded more
> than 100K sentences before =)
>
> binarize4moses2.perl is awesome! Let me see how much speed-up I get with
> Moses2 and pruned tables.
>
> Thank you Hieu and Barry!
>
>
>
>
> On Tue, Dec 12, 2017 at 6:38 PM, Hieu Hoang <[email protected]> wrote:
>
>> Barry is correct: having 750,000 translations for '.' severely degrades
>> speed.
>>
>> I had forgotten about the script I created:
>>    scripts/generic/binarize4moses2.perl
>> which takes in the phrase table and lexicalized reordering model, prunes
>> them, and runs addLexROtoPT. That is basically everything you need to do
>> to create a fast model for Moses2.
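>>
>> Off the top of my head the invocation is something like the sketch below;
>> the option names are from memory and may well be wrong, so treat them as
>> assumptions and run the script without arguments to see its real usage:
>>
>>    # NOTE: the option names below are assumptions, not the documented
>>    # interface; check the script's own usage message before copying this
>>    perl scripts/generic/binarize4moses2.perl \
>>        --phrase-table model/phrase-table.gz \
>>        --lex-ro model/reordering-table.gz \
>>        --output-dir model/moses2-bin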
>>
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 12 December 2017 at 09:16, Barry Haddow <[email protected]>
>> wrote:
>>
>>> Hi Liling
>>>
>>> The short answer is that you need to prune/filter your phrase table
>>> prior to creating the compact phrase table. I don't mean "filter model
>>> given input", because that won't make much difference if you have a very
>>> large input; I mean getting rid of rare translations which won't be used
>>> anyway.
>>>
>>> The compact phrase table does not do pruning; it ends up being done in
>>> memory, so if you have 750,000 translations of the full stop in your model,
>>> then they all get loaded into memory before Moses selects the top 20.
>>>
>>> You can use prunePhraseTable from Moses (which bizarrely needs to load a
>>> phrase table in order to parse the config file, last time I looked). You
>>> could also apply Johnson / entropic pruning, whatever works for you,
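>>>
>>> If you just want something quick, a rough shell sketch of top-20 pruning on
>>> the plain-text phrase table is below. I'm assuming the usual four-score
>>> layout where the third score is the direct phrase probability p(e|f), and
>>> the paths are placeholders:
>>>
>>>    # decorate each line with "source <TAB> p(e|f)", sort by source phrase
>>>    # and descending score, keep the first 20 entries per source phrase,
>>>    # then strip the decoration again
>>>    zcat model/phrase-table.gz \
>>>      | awk -F' \\|\\|\\| ' '{ split($3, sc, " "); print $1 "\t" sc[3] "\t" $0 }' \
>>>      | sort -t $'\t' -k1,1 -k2,2gr \
>>>      | awk -F'\t' '$1 != prev { n = 0; prev = $1 } ++n <= 20 { print $3 }' \
>>>      | gzip > model/phrase-table.pruned.gz
>>>
>>> You would then build the compact table from the pruned file with
>>> processPhraseTableMin as usual.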
>>>
>>> cheers - Barry
>>>
>>>
>>> On 11/12/17 09:20, liling tan wrote:
>>>
>>> Dear Moses community/developers,
>>>
>>> I have a question on how to handle large models created using moses.
>>>
>>> I have a vanilla phrase-based model with:
>>>
>>>    - PhraseDictionary num-features=4 input-factor=0 output-factor=0
>>>    - LexicalReordering num-features=6 input-factor=0 output-factor=0
>>>    - KENLM order=5 factor=0
>>>
>>> The model sizes are:
>>>
>>>    - compressed phrase table is 5.4GB,
>>>    - compressed reordering table is 1.9GB and
>>>    - quantized LM is 600MB
>>>
>>>
>>> I'm running on a single 56-core machine with 256GB RAM. Whenever I'm
>>> decoding I use the -threads 56 parameter.
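>>>
>>> For reference, the decode command is roughly the following (paths are
>>> placeholders for my actual files):
>>>
>>>    # plain phrase-based decoding, one thread per core
>>>    ~/mosesdecoder/bin/moses -f model/moses.ini -threads 56 \
>>>        < data/train.src > output/train.hyp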
>>>
>>> It takes really long to load the table, and after loading, decoding breaks
>>> inconsistently at different lines. I notice that the RAM goes into swap
>>> before it breaks.
>>>
>>> I've tried the compact phrase table and get a
>>>
>>>    - 3.2GB .minphr
>>>    - 1.5GB .minlexr
>>>
>>> The same kind of random breakage happens when the RAM goes into swap
>>> after loading the phrase table.
>>>
>>> Strangely, it still manages to decode ~500K sentences before it breaks.
>>>
>>> Then I tried the on-disk phrase table, which is around 37GB uncompressed.
>>> Using the on-disk PT didn't cause breakage, but the decoding time increased
>>> significantly; now it can only decode 15K sentences in an hour.
>>>
>>> The setup is a little different from the normal one where we have a
>>> train/dev/test split: currently, my task is to decode the training set.
>>> I've tried filtering the table against the training set with
>>> filter-model-given-input.pl, but the size of the compressed table didn't
>>> really decrease much.
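>>>
>>> For reference, I ran the filtering roughly like this (if I remember the
>>> argument order correctly: output directory, then the moses.ini, then the
>>> input text; paths are placeholders):
>>>
>>>    perl scripts/training/filter-model-given-input.pl \
>>>        filtered-train model/moses.ini data/train.src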
>>>
>>> The entire training set is made up of 5M sentence pairs, and it's taking
>>> 3+ days just to decode ~1.5M sentences with the on-disk PT.
>>>
>>>
>>> My questions are:
>>>
>>>  - Are there best practices for deploying large Moses models?
>>>  - Why does the 5+GB phrase table take up > 250GB RAM when decoding?
>>>  - How else should I filter/compress the phrase table?
>>>  - Is it normal to decode only ~500K sentences a day given the machine
>>> specs and the model size?
>>>
>>> I understand that I could split the training set in two, train two models,
>>> and then cross-decode, but if the training size is 10M sentence pairs,
>>> we'll face the same issues.
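>>>
>>> Roughly what I have in mind (placeholder file names, GNU split):
>>>
>>>    # split the parallel data into two halves, line-wise (do the same for
>>>    # the target side so the halves stay aligned)
>>>    split -n l/2 -d data/train.src data/train.src.part
>>>    split -n l/2 -d data/train.trg data/train.trg.part
>>>    # train model A on half 00 and model B on half 01, then decode each
>>>    # half's source with the model trained on the other half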
>>>
>>> Thank you for reading the long post, and thanks in advance for any
>>> answers, discussions and enlightenment on this issue =)
>>>
>>> Regards,
>>> Liling
>>>
>>>
>>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
