You're right, it's highly non-linear and depends on lots of parameters.

Nice to know it's so much faster in this case.


On 14/12/17 13:50, liling tan wrote:
The Moses1 run was using the pruned ProbingPT created by binarize4moses2.perl =)

I think the speed-up might be non-linear when compared against the pruned phrase-table size; the larger the table, the greater the speed-up. But that needs more rigorous testing to prove ;P


On Thu, Dec 14, 2017 at 7:37 PM, Hieu Hoang <[email protected]> wrote:

    Cool, I was expecting only single-digit improvements. If the pt
    in Moses1 hadn't been pruned, the speedup is largely down to the
    pruning, I think.

    Hieu Hoang
    http://moses-smt.org/


    On 14 December 2017 at 07:41, liling tan <[email protected]> wrote:

        With Moses2 and ProbingPT, I decoded 4M sentences (86M words)
        in 14 hours with -threads 50 on 56 cores. So it's around 6M
        words per hour for Moses2.

        With Moses1, ProbingPT and a gzipped lexicalized reordering
        table, but with only 32K sentences, I got around 280K words
        per hour with -threads 50 on the same 56 cores.

        Moses2 is 20x faster than Moses1 for my model!!
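
        For reference, both runs were invoked along these lines (the
        binary and file paths here are illustrative, not the actual
        ones):

            bin/moses2 -f moses2.ini -threads 50 < input.txt > output.txt
            bin/moses -f moses.ini -threads 50 < input.txt > output.txt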

        For Moses1, my moses.ini:


        #########################
        ### MOSES CONFIG FILE ###
        #########################

        # input factors
        [input-factors]
        0

        # mapping steps
        [mapping]
        0 T 0

        [distortion-limit]
        6

        # feature functions
        [feature]
        UnknownWordPenalty
        WordPenalty
        PhrasePenalty
        #PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ltan/momo/pt.gz input-factor=0 output-factor=0
        ProbingPT name=TranslationModel0 num-features=4 path=/home/ltan/momo/momo-bin input-factor=0 output-factor=0
        LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/ltan/momo/reordering-table.wbe-msd-bidirectional-fe.gz
        #LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 property-index=0

        Distortion
        KENLM name=LM0 factor=0 path=/home/ltan/momo/lm.ja.kenlm order=5



        On Thu, Dec 14, 2017 at 8:58 AM, liling tan <[email protected]> wrote:

            I don't have a comparison between moses vs moses2. I'll
            give some moses numbers once the full dataset is decoded.
            And I can repeat the decoding for moses on the same machine.

            BTW, the ProbingPT directory created by
            binarize4moses2.perl, could it be used with the old Moses?
            Or would I have to re-prune the phrase-table and then use
            PhraseDictionaryMemory and LexicalReordering separately?

            But I'm getting 4M sentences (86M words) in 14 hours on
            moses2 with -threads 50 on 56 cores.


            #########################
            ### MOSES CONFIG FILE ###
            #########################

            # input factors
            [input-factors]
            0

            # mapping steps
            [mapping]
            0 T 0

            [distortion-limit]
            6

            # feature functions
            [feature]
            UnknownWordPenalty
            WordPenalty
            PhrasePenalty
            #PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ltan/momo/phrase-table.gz input-factor=0 output-factor=0
            ProbingPT name=TranslationModel0 num-features=4 path=/home/ltan/momo/momo-bin input-factor=0 output-factor=0
            #LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/ltan/momo/reordering-table.wbe-msd-bidirectional-fe.gz
            LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 property-index=0

            Distortion
            KENLM name=LM0 factor=0 path=/home/ltan/momo/lm.ja.kenlm order=5


            On Thu, Dec 14, 2017 at 3:52 AM, Hieu Hoang <[email protected]> wrote:

                do you have comparison figures for moses vs moses2? I
                never managed to get reliable info for more than 32
                cores.

                config/moses.ini files would be good too

                Hieu Hoang
                http://moses-smt.org/


                On 13 December 2017 at 06:10, liling tan <[email protected]> wrote:

                    Ah, that's why the phrase-table is exploding...
                    I've never decoded more than 100K sentences before =)

                    binarize4moses2.perl is awesome! Let me see how
                    much speed-up I get with Moses2 and pruned tables.

                    Thank you Hieu and Barry!




                    On Tue, Dec 12, 2017 at 6:38 PM, Hieu Hoang <[email protected]> wrote:

                        Barry is correct, having 750,000 translations
                        for '.' severely degrades speed.

                        I had forgotten about the script I created,
                        scripts/generic/binarize4moses2.perl, which
                        takes in the phrase table & lex reordering
                        model, prunes them, and runs addLexROtoPT.
                        Basically, it does everything you need to
                        create a fast model for Moses2.
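
                        A sketch of the invocation (the option names
                        shown here are illustrative and may not match
                        the script exactly; run it without arguments
                        to see the actual usage):

                            scripts/generic/binarize4moses2.perl \
                                --phrase-table phrase-table.gz \
                                --lex-ro reordering-table.wbe-msd-bidirectional-fe.gz \
                                --output-dir momo-bin

                        The output directory is then what the
                        ProbingPT line in moses.ini points at, with
                        the lex RO scores read via property-index=0.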

                        Hieu Hoang
                        http://moses-smt.org/


                        On 12 December 2017 at 09:16, Barry Haddow <[email protected]> wrote:

                            Hi Liling

                            The short answer is that you need to
                            prune/filter your phrase table prior to
                            creating the compact phrase table. I don't
                            mean "filter model given input", because
                            that won't make much difference if you
                            have a very large input; I mean getting
                            rid of rare translations which won't be
                            used anyway.

                            The compact phrase table does not do
                            pruning; it ends up being done in memory.
                            So if you have 750,000 translations of the
                            full stop in your model, they all get
                            loaded into memory before Moses selects
                            the top 20.
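
                            If you just want a blunt top-20 cut by
                            p(e|f), a rough shell pipeline like the
                            one below would do it. This is a sketch,
                            not the Moses tools: it assumes the
                            standard four-score table with p(e|f) as
                            the third score, no tabs inside phrases,
                            and GNU sort with enough temp space:

                                # rank translations per source phrase by p(e|f), keep the best 20
                                zcat phrase-table.gz \
                                  | awk -F' \\|\\|\\| ' '{split($3,s," "); print $1 "\t" s[3] "\t" $0}' \
                                  | sort -t$'\t' -k1,1 -k2,2gr \
                                  | awk -F'\t' '{if ($1 != prev) {prev = $1; n = 0} if (++n <= 20) print $3}' \
                                  | gzip > phrase-table.pruned.gz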

                            You can use prunePhraseTable from Moses
                            (which, bizarrely, needs to load a phrase
                            table in order to parse the config file,
                            last time I looked). You could also apply
                            Johnson / entropic pruning, whatever works
                            for you.

                            cheers - Barry


                            On 11/12/17 09:20, liling tan wrote:
                            Dear Moses community/developers,

                            I have a question on how to handle large
                            models created using Moses.

                            I've a vanilla phrase-based model with

                              * PhraseDictionary num-features=4 input-factor=0 output-factor=0
                              * LexicalReordering num-features=6 input-factor=0 output-factor=0
                              * KENLM order=5 factor=0

                            The size of the model is:

                              * compressed phrase table is 5.4GB,
                              * compressed reordering table is 1.9GB and
                              * quantized LM is 600MB


                            I'm running on a single 56-core machine
                            with 256GB RAM. Whenever I'm decoding, I
                            use the -threads 56 parameter.

                            It takes really long to load the table,
                            and after loading, it breaks
                            inconsistently at different lines when
                            decoding; I notice that the RAM goes into
                            swap before it breaks.

                            I've tried the compact phrase table and get:

                              * a 3.2GB .minphr
                              * a 1.5GB .minlexr

                            And the same kind of random breakage
                            happens when the RAM goes into swap after
                            loading the phrase-table. Strangely, it
                            still manages to decode ~500K sentences
                            before it breaks.

                            Then I tried the on-disk phrase table,
                            which is around 37GB uncompressed. Using
                            the on-disk PT didn't cause breakage, but
                            the decoding time increased significantly;
                            now it can only decode 15K sentences in an
                            hour.

                            The setup is a little different from the
                            normal train/dev/test split: currently, my
                            task is to decode the train set. I've
                            tried filtering the table against the
                            train set with filter-model-given-input.pl,
                            but the size of the compressed table
                            didn't really decrease much.
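
                            (For reference, the filter invocation
                            takes the output directory, the config,
                            and the input text; file names here are
                            illustrative:

                                scripts/training/filter-model-given-input.pl \
                                    filtered-dir moses.ini trainset.src

                            where trainset.src is the source side of
                            the train set.)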

                            The entire training set is made up of 5M
                            sentence pairs, and it's taking 3+ days
                            just to decode ~1.5M sentences with the
                            on-disk PT.


                            My questions are:

                             - Are there best practices with regard to deploying large Moses models?
                             - Why does the 5+GB phrase table take up >250GB RAM when decoding?
                             - How else should I filter/compress the phrase table?
                             - Is it normal to decode only ~500K sentences a day given the machine specs and the model size?

                            I understand that I could split the
                            training set into two, train two models,
                            and then cross-decode, but if the training
                            size is 10M sentence pairs, we'll face the
                            same issues.

                            Thank you for reading the long post, and
                            thank you in advance for any answers,
                            discussions and enlightenment on this
                            issue =)

                            Regards,
                            Liling


--
Hieu Hoang
http://moses-smt.org/

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
