Hi, having such long sentences should cause all kinds of problems with word alignment, so I am a bit puzzled that they still show up when pruning the phrase table.
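
One way to keep over-long sentences out of the pipeline is to drop them from the parallel corpus before training and before building the SALM index. Moses ships scripts/training/clean-corpus-n.perl for this purpose; the Python below is only a minimal sketch of the same idea, not that script. The 80-token cutoff and the function names are assumptions made for illustration; any cutoff at or below the 256-word limit reported by IndexSA would avoid that particular error.

```python
from typing import Iterable, Iterator, Tuple

MAX_TOKENS = 80  # assumed cutoff; SALM's reported hard limit is 256 words


def filter_long_pairs(
    pairs: Iterable[Tuple[str, str]], max_tokens: int = MAX_TOKENS
) -> Iterator[Tuple[str, str]]:
    """Yield only pairs where both sides have between 1 and max_tokens tokens."""
    for src, tgt in pairs:
        n_src = len(src.split())
        n_tgt = len(tgt.split())
        if 1 <= n_src <= max_tokens and 1 <= n_tgt <= max_tokens:
            yield src, tgt


def filter_files(src_in: str, tgt_in: str, src_out: str, tgt_out: str,
                 max_tokens: int = MAX_TOKENS) -> int:
    """Filter two line-aligned files in lockstep; return the number of pairs kept."""
    kept = 0
    with open(src_in, encoding="utf-8") as fs, \
         open(tgt_in, encoding="utf-8") as ft, \
         open(src_out, "w", encoding="utf-8") as out_s, \
         open(tgt_out, "w", encoding="utf-8") as out_t:
        # Both sides are dropped together so the files stay line-aligned.
        stripped = ((s.rstrip("\n"), t.rstrip("\n")) for s, t in zip(fs, ft))
        for src, tgt in filter_long_pairs(stripped, max_tokens):
            out_s.write(src + "\n")
            out_t.write(tgt + "\n")
            kept += 1
    return kept
```

For example, filter_files("opensub.train.it", "opensub.train.en", "clean.it", "clean.en") would write the filtered corpus to the two clean.* files; the ".en" side and the output names are hypothetical, since the thread only mentions opensub.train.it.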
A good way to prune the phrase table is to limit the length of phrases (max 5 does no harm, even max 4 is not a big deal), and reduce low probability phrase pairs ($MOSES/scripts/training/threshold-filter.perl).

Is the phrase table the biggest model or the language model? For the latter, there are several compression options.

-phi

On Thu, Dec 3, 2015 at 12:32 AM, Hegde, Sujay <[email protected]> wrote:
> HI Philipp,
>
> Thanks a lot.
>
> Actually it’s a VIRTUAL machine.
>
> Also we have compressed the models into .minphr and .minlexr
> but we couldn’t prune it as while pruning we got an error saying some of the
> sentences in the Corpus are too long and it cannot be pruned.
>
> We used pruning using SALM and get the following error:
>
> /mnt/hd1/git/salm/Bin/Linux/Index/IndexSA.O64 opensub.train.it
> Initialize vocabulary file: opensub.train.it.id_voc
> Loading existing vocabulary file: opensub.train.it.id_voc
> Total 100 word types loaded
> Max VocID=100
> Sentence 4152148 has more than 256 words. Can not handle such long sentence.
> Please cut it short first!
>
> Is there anything we could do about the above?
>
> Thanks and Regards,
> Sujay,
> Xerox Business Services, Bangalore, India
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Philipp Koehn
> Sent: 03 December 2015 03:13
> To: Hegde, Sujay
> Cc: [email protected]
> Subject: Re: [Moses-support] System requiremnts for Moses
>
> Hi,
>
> the machine you have is certainly sufficient even for large models.
>
> If you are running two language pairs in parallel and run into RAM problems,
> you may want to look into ways to compress the model files (phrase table,
> reordering table, language model) using either more efficient data
> structures (e.g., various KENLM options), or pruning the models.
>
> -phi
>
> On Tue, Dec 1, 2015 at 5:08 AM, Hegde, Sujay <[email protected]> wrote:
> > Dear Moses Admin,
> >
> > We are using Moses decoder for commercial environment.
> >
> > We have 132GB RAM, 1TB disk and quadcore Virtual Machine
> > with CentOs OS.
> >
> > We have 2 language pairs installed, and when running both
> > the models together the Translation hangs(Takes a LONG time).
> >
> > It is fine when we run only one language model.
> >
> > Is there any Specific System requirements needed for moses?
> >
> > Please let me know
> >
> > Thanks and Regards,
> > Sujay,
> > Xerox Business Services, Bangalore, India
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
