Yes, but it only supports ProbingPT.

It's also best if you run the program
   addLexROtoPT
to merge the phrase table and the lexicalized reordering model. There is
no equivalent in Moses (v1).
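
Roughly like this (a sketch, not tested; the addLexROtoPT argument order
and the CreateProbingPT flags are assumptions, so check each binary's
usage message):

   # merge the lexicalized reordering scores into the phrase table
   ~/mosesdecoder/bin/addLexROtoPT phrase-table.gz reordering-table.gz \
       > phrase-table.withLexRO.gz

   # convert the merged table into a ProbingPT directory for Moses2
   ~/mosesdecoder/bin/CreateProbingPT --input-pt phrase-table.withLexRO.gz \
       --output-dir probing-pt --num-scores 4 --num-lex-scores 6

Then point the phrase-table line in moses.ini at the probing-pt directory.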

Hieu Hoang
http://moses-smt.org/


On 11 December 2017 at 11:51, liling tan <[email protected]> wrote:

> Thank you, Hieu, for the Moses2 tips!
>
> BTW, is Moses2 backwards compatible with models trained with the old Moses?
>
> Regards,
> Liling
>
> On Mon, Dec 11, 2017 at 7:39 PM, Hieu Hoang <[email protected]> wrote:
>
>> If you want fast decoding with more than 16 threads, use Moses2:
>>    http://www.statmt.org/moses/?n=Site.Moses2
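>>
>> e.g. something like this (assuming a standard build; moses2 should take
>> the same basic flags as the moses binary, but verify against your build):
>>
>>    ~/mosesdecoder/bin/moses2 -f moses.ini -threads 56 < in.txt > out.txt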
>>
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> On 11 December 2017 at 09:20, liling tan <[email protected]> wrote:
>>
>>> Dear Moses community/developers,
>>>
>>> I have a question on how to handle large models created with Moses.
>>>
>>> I have a vanilla phrase-based model with the following feature
>>> functions (fuller moses.ini lines are sketched after this list):
>>>
>>>    - PhraseDictionary num-features=4 input-factor=0 output-factor=0
>>>    - LexicalReordering num-features=6 input-factor=0 output-factor=0
>>>    - KENLM order=5 factor=0
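>>>
>>> Roughly these moses.ini lines (the name=, type= and path= values here
>>> are placeholders, and PhraseDictionaryMemory is just one concrete
>>> implementation):
>>>
>>>    [feature]
>>>    PhraseDictionaryMemory name=TranslationModel0 num-features=4 input-factor=0 output-factor=0 path=/path/to/phrase-table.gz
>>>    LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe input-factor=0 output-factor=0 path=/path/to/reordering-table
>>>    KENLM name=LM0 factor=0 order=5 path=/path/to/lm.bin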
>>>
>>> The sizes of the model components are:
>>>
>>>    - compressed phrase table is 5.4GB,
>>>    - compressed reordering table is 1.9GB and
>>>    - quantized LM is 600MB
>>>
>>>
>>> I'm running on a single 56-core machine with 256GB RAM. Whenever I
>>> decode, I use the -threads 56 parameter.
>>>
>>> It takes really long to load the table, and after loading, decoding
>>> breaks inconsistently at different lines. I notice that the RAM goes
>>> into swap before it breaks.
>>>
>>> I've tried the compact phrase table (built roughly as sketched below)
>>> and got:
>>>
>>>    - 3.2GB .minphr
>>>    - 1.5GB .minlexr
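>>>
>>> i.e. built with something like this (a sketch; the flags follow the
>>> compact phrase table tools' documented usage, so double-check them):
>>>
>>>    processPhraseTableMin -in phrase-table.gz -out phrase-table -nscores 4 -threads 56
>>>    processLexicalTableMin -in reordering-table.gz -out reordering-table -threads 56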
>>>
>>> And the same kind of random breakage happens when RAM goes into swap
>>> after loading the phrase-table.
>>>
>>> Strangely, it still manages to decode ~500K sentences before it breaks.
>>>
>>> Then I tried the on-disk phrase table (built as sketched below), which
>>> is around 37GB uncompressed. Using the on-disk PT didn't cause any
>>> breakage, but decoding time increased significantly: now it can only
>>> decode ~15K sentences an hour.
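>>>
>>> The build was along these lines (a sketch; the positional arguments,
>>> i.e. source factors, target factors, number of scores, ttable-limit and
>>> the score column to sort on, follow the example in the Moses docs and
>>> may need checking):
>>>
>>>    CreateOnDiskPt 1 1 4 100 2 phrase-table.gz phrase-table.ondisk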
>>>
>>> The setup is a little different from the normal train/dev/test split:
>>> my task is to decode the train set itself. I've tried filtering the
>>> table against the train set with filter-model-given-input.pl (invocation
>>> sketched below), but the size of the compressed table didn't really
>>> decrease much.
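>>>
>>> The filtering call was along these lines ($MOSES here is a placeholder
>>> for the mosesdecoder checkout):
>>>
>>>    perl $MOSES/scripts/training/filter-model-given-input.pl filtered-dir moses.ini trainset.src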
>>>
>>> The entire training set is made up of 5M sentence pairs, and it's
>>> taking 3+ days just to decode ~1.5M sentences with the on-disk PT.
>>>
>>>
>>> My questions are:
>>>
>>>  - Are there best practices for deploying large Moses models?
>>>  - Why does the 5+GB phrase table take up > 250GB of RAM when decoding?
>>>  - How else should I filter/compress the phrase table?
>>>  - Is it normal to decode only ~500K sentences a day given the machine
>>> specs and the model size?
>>>
>>> I understand that I could split the train set into two, train two
>>> models and then cross-decode, but if the training size is 10M sentence
>>> pairs, we'll face the same issues.
>>>
>>> Thank you for reading the long post, and thank you in advance for any
>>> answers, discussion and enlightenment on this issue =)
>>>
>>> Regards,
>>> Liling
>>>
