Moses1 was using the pruned ProbingPT created by binarize4moses2.perl =)
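For anyone reproducing this kind of comparison, the decoding itself is just the usual batch invocation reading one sentence per line on stdin. A sketch only, with placeholder paths and file names; -threads 50 comes from the figures below, and I'm assuming moses2 takes its config via -f like the classic moses binary:

  MOSES=~/mosesdecoder

  # Moses2 with the pruned ProbingPT config (placeholder paths)
  $MOSES/bin/moses2 -f ~/momo/moses2.ini -threads 50 \
      < input.tok > output.moses2 2> moses2.log

  # Classic Moses on the same input, for the words-per-hour comparison
  $MOSES/bin/moses -f ~/momo/moses.ini -threads 50 \
      < input.tok > output.moses1 2> moses1.log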
I think the speedup might be non-linear when compared against the pruned
phrase-table size; the larger the table, the bigger the speedup. But that
needs more rigorous testing to prove ;P

On Thu, Dec 14, 2017 at 7:37 PM, Hieu Hoang <[email protected]> wrote:

> cool, I was expecting only single-digit improvements. If the pt in Moses1
> hadn't been pruned, the speedup is a lot to do with the pruning, I think
>
> Hieu Hoang
> http://moses-smt.org/
>
> On 14 December 2017 at 07:41, liling tan <[email protected]> wrote:
>
>> With Moses2 and ProbingPT, I got 4M sentences (86M words) in 14 hours
>> with -threads 50 on 56 cores. So it's around 6M words per hour for
>> Moses2.
>>
>> With Moses1, ProbingPT and a gzipped lexical reordering table, but on
>> only 32K sentences, it was 280K words per hour with -threads 50 on 56
>> cores.
>>
>> Moses2 is 20x faster than Moses1 for my model!!
>>
>> For Moses1, my moses.ini:
>>
>> #########################
>> ### MOSES CONFIG FILE ###
>> #########################
>>
>> # input factors
>> [input-factors]
>> 0
>>
>> # mapping steps
>> [mapping]
>> 0 T 0
>>
>> [distortion-limit]
>> 6
>>
>> # feature functions
>> [feature]
>> UnknownWordPenalty
>> WordPenalty
>> PhrasePenalty
>> #PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ltan/momo/pt.gz input-factor=0 output-factor=0
>> ProbingPT name=TranslationModel0 num-features=4 path=/home/ltan/momo/momo-bin input-factor=0 output-factor=0
>> LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/ltan/momo/reordering-table.wbe-msd-bidirectional-fe.gz
>> #LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 property-index=0
>>
>> Distortion
>> KENLM name=LM0 factor=0 path=/home/ltan/momo/lm.ja.kenlm order=5
>>
>> On Thu, Dec 14, 2017 at 8:58 AM, liling tan <[email protected]> wrote:
>>
>>> I don't have a comparison between moses and moses2 yet. I'll give some
>>> moses numbers once the full dataset is decoded, and I can repeat the
>>> decoding for moses on the same machine.
>>>
>>> BTW, could the ProbingPT directory created by binarize4moses2.perl be
>>> used with the old Moses? Or would I have to re-prune the phrase table
>>> and then use PhraseDictionaryMemory and LexicalReordering separately?
>>>
>>> But I'm getting 4M sentences (86M words) in 14 hours on moses2 with
>>> -threads 50 on 56 cores.
>>>
>>> #########################
>>> ### MOSES CONFIG FILE ###
>>> #########################
>>>
>>> # input factors
>>> [input-factors]
>>> 0
>>>
>>> # mapping steps
>>> [mapping]
>>> 0 T 0
>>>
>>> [distortion-limit]
>>> 6
>>>
>>> # feature functions
>>> [feature]
>>> UnknownWordPenalty
>>> WordPenalty
>>> PhrasePenalty
>>> #PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ltan/momo/phrase-table.gz input-factor=0 output-factor=0
>>> ProbingPT name=TranslationModel0 num-features=4 path=/home/ltan/momo/momo-bin input-factor=0 output-factor=0
>>> #LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/ltan/momo/reordering-table.wbe-msd-bidirectional-fe.gz
>>> LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 property-index=0
>>>
>>> Distortion
>>> KENLM name=LM0 factor=0 path=/home/ltan/momo/lm.ja.kenlm order=5
>>>
>>> On Thu, Dec 14, 2017 at 3:52 AM, Hieu Hoang <[email protected]> wrote:
>>>
>>>> do you have comparison figures for moses vs moses2? I never managed to
>>>> get reliable info for more than 32 cores
>>>>
>>>> config/moses.ini files would be good too
>>>>
>>>> Hieu Hoang
>>>> http://moses-smt.org/
>>>>
>>>> On 13 December 2017 at 06:10, liling tan <[email protected]> wrote:
>>>>
>>>>> Ah, that's why the phrase-table is exploding... I've never decoded
>>>>> more than 100K sentences before =)
>>>>>
>>>>> binarize4moses2.perl is awesome! Let me see how much speedup I get
>>>>> with Moses2 and the pruned tables.
>>>>>
>>>>> Thank you Hieu and Barry!
>>>>>
>>>>> On Tue, Dec 12, 2017 at 6:38 PM, Hieu Hoang <[email protected]> wrote:
>>>>>
>>>>>> Barry is correct, having 750,000 translations for '.' severely
>>>>>> degrades speed.
>>>>>>
>>>>>> I had forgotten about the script I created:
>>>>>> scripts/generic/binarize4moses2.perl
>>>>>> which takes in the phrase table & lex reordering model, prunes them
>>>>>> and runs addLexROtoPT. Basically, everything you need to do to
>>>>>> create a fast model for Moses2.
>>>>>>
>>>>>> Hieu Hoang
>>>>>> http://moses-smt.org/
>>>>>>
>>>>>> On 12 December 2017 at 09:16, Barry Haddow <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Liling
>>>>>>>
>>>>>>> The short answer is that you need to prune/filter your phrase table
>>>>>>> prior to creating the compact phrase table. I don't mean "filter
>>>>>>> model given input", because that won't make much difference if you
>>>>>>> have a very large input; I mean getting rid of rare translations
>>>>>>> which won't be used anyway.
>>>>>>>
>>>>>>> The compact phrase table does not do pruning; it ends up being done
>>>>>>> in memory, so if you have 750,000 translations of the full stop in
>>>>>>> your model, they all get loaded into memory before Moses selects
>>>>>>> the top 20.
>>>>>>>
>>>>>>> You can use prunePhraseTable from Moses (which bizarrely needs to
>>>>>>> load a phrase table in order to parse the config file, last time I
>>>>>>> looked). You could also apply Johnson / entropic pruning, whatever
>>>>>>> works for you.
>>>>>>>
>>>>>>> cheers - Barry
>>>>>>>
>>>>>>> On 11/12/17 09:20, liling tan wrote:
>>>>>>>
>>>>>>> Dear Moses community/developers,
>>>>>>>
>>>>>>> I have a question on how to handle large models created using Moses.
>>>>>>>
>>>>>>> I have a vanilla phrase-based model with:
>>>>>>>
>>>>>>> - PhraseDictionary num-features=4 input-factor=0 output-factor=0
>>>>>>> - LexicalReordering num-features=6 input-factor=0 output-factor=0
>>>>>>> - KENLM order=5 factor=0
>>>>>>>
>>>>>>> The size of the model is:
>>>>>>>
>>>>>>> - compressed phrase table: 5.4GB
>>>>>>> - compressed reordering table: 1.9GB
>>>>>>> - quantized LM: 600MB
>>>>>>>
>>>>>>> I'm running on a single 56-core machine with 256GB RAM. Whenever I
>>>>>>> decode, I use the -threads 56 parameter.
>>>>>>>
>>>>>>> It takes really long to load the tables and, after loading, it
>>>>>>> breaks inconsistently at different lines when decoding; I notice
>>>>>>> that the RAM goes into swap before it breaks.
>>>>>>>
>>>>>>> I've tried the compact phrase table and get:
>>>>>>>
>>>>>>> - a 3.2GB .minphr
>>>>>>> - a 1.5GB .minlexr
>>>>>>>
>>>>>>> and the same kind of random breakage happens when the RAM goes into
>>>>>>> swap after loading the phrase table. Strangely, it still manages to
>>>>>>> decode ~500K sentences before it breaks.
>>>>>>>
>>>>>>> Then I tried the on-disk phrase table, which is around 37GB
>>>>>>> uncompressed. Using the on-disk PT didn't cause breakage, but
>>>>>>> decoding time increased significantly; now it can only decode 15K
>>>>>>> sentences in an hour.
>>>>>>>
>>>>>>> The setup is a little different from the normal train/dev/test
>>>>>>> split: currently my task is to decode the train set. I've tried
>>>>>>> filtering the table against the train set with
>>>>>>> filter-model-given-input.pl, but the size of the compressed table
>>>>>>> didn't really decrease much.
>>>>>>>
>>>>>>> The entire training set is made up of 5M sentence pairs, and it's
>>>>>>> taking 3+ days just to decode ~1.5M sentences with the on-disk PT.
>>>>>>>
>>>>>>> My questions are:
>>>>>>>
>>>>>>> - Are there best practices with regard to deploying large Moses
>>>>>>>   models?
>>>>>>> - Why does the 5+GB phrase table take up >250GB RAM when decoding?
>>>>>>> - How else should I filter/compress the phrase table?
>>>>>>> - Is it normal to decode only ~500K sentences a day, given the
>>>>>>>   machine specs and the model size?
>>>>>>>
>>>>>>> I understand that I could split the train set into two, train two
>>>>>>> models and cross-decode, but if the training size is 10M sentence
>>>>>>> pairs we'll face the same issue.
>>>>>>>
>>>>>>> Thank you for reading the long post, and thank you in advance for
>>>>>>> any answers, discussions and enlightenment on this issue =)
>>>>>>>
>>>>>>> Regards,
>>>>>>> Liling
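A concrete illustration of Barry's point about getting rid of rare translations before binarizing: prunePhraseTable or Johnson / entropic pruning are the proper tools, but even a crude "keep the N most probable targets per source phrase" filter shows the idea. This is only a rough sketch; the paths, N=20 and the choice of ranking by the direct phrase probability p(e|f) are my assumptions, not anything from the thread. It assumes the usual 4-score text phrase table (src ||| tgt ||| p(f|e) lex(f|e) p(e|f) lex(e|f) ||| align ||| counts), no tab characters anywhere in it, GNU sort and bash:

  N=20   # same cut-off as the top 20 Barry mentions (the default ttable-limit)

  zcat phrase-table.gz \
    | awk -F' [|][|][|] ' '{ split($3, s, " "); print $1 "\t" s[3] "\t" $0 }' \
    | LC_ALL=C sort -t$'\t' -k1,1 -k2,2gr \
    | awk -F'\t' -v n="$N" '$1 != prev { prev = $1; c = 0 } ++c <= n { print $3 }' \
    | gzip > phrase-table.pruned.gz

The first awk tags each line with its source phrase and p(e|f), the sort groups targets per source phrase with the most probable first, and the second awk keeps the top n of each group. If you are going the Moses2 route anyway, binarize4moses2.perl already does the pruning (and addLexROtoPT) for you, as Hieu notes above, so a hand-rolled filter like this is only worth it if you want to stay with the classic Moses toolchain.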
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
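For the archives: the "filter model given input" step mentioned in the thread is the standard script under scripts/training. A sketch with placeholder paths, assuming the moses.ini still points at the plain-text (gzipped) tables; as Barry notes, filtering against the full training set barely shrinks the table, since nearly every source phrase occurs in it:

  MOSES=~/mosesdecoder

  # writes filtered tables plus a matching moses.ini into ~/momo/filtered-for-train
  $MOSES/scripts/training/filter-model-given-input.pl \
      ~/momo/filtered-for-train \
      ~/momo/moses.ini \
      ~/momo/train.tok.src

  # then decode with ~/momo/filtered-for-train/moses.ini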
