The command

    moses/bin/build_binary trie -a 22 -b 8 -q 8 lm.arpa lm.kenlm

will build a compressed, binarized model with quantization. You can run
moses/bin/build_binary lm.arpa without any other parameters to get size
estimates for different parameter settings. I would guess you will get a
binarized LM of roughly 20 to 30 GB, which is manageable (provided the size
you gave us is that of an uncompressed text file). You can also use lmplz to
build pruned models in the first place; these will be much smaller.

On 2014-11-24 15:11, Tom Hoar wrote:

> After binarizing such a large ARPA file with KenLM, you'll need to configure
> your moses.ini file to "lazily load the model using mmap." This involves
> using lmodel-file code "9" vs code "8." More details here:
> https://kheafield.com/code/kenlm/moses/ [2]
>
> Performance improves significantly if you store the binarized file on an SSD.
>
> On 11/24/2014 07:00 PM, Raj Dabre wrote:
>
> Hey Hoang, You should binarize the ARPA file. The readme of the LM tool
> (KenLM or IRSTLM or SRILM) will tell you how. Regards.
>
> On Mon, Nov 24, 2014 at 7:07 PM, Hoang Cuong <[email protected]> wrote:
>
> Hi all,
> I have trained an (unpruned) 5-gram language model on a large corpus of 5
> billion words, resulting in an ARPA-format file of roughly 300GB (is this a
> normal LM size for such a large monolingual corpus?). This is obviously too
> big for running an SMT system.
> I have read several papers whose systems use language models trained on
> similarly sized monolingual corpora. Could you give me some advice on how to
> handle this and make it feasible to run SMT systems?
> I appreciate your help a lot,
> Best,
>
> --
>
> Best Regards,
>
> Hoang Cuong
>
> SMTNerd
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support [1]
>
> --
>
> Raj Dabre.
> Research Student, Graduate School of Informatics,
> Kyoto University.
>
> CSE MTech, IITB., 2011-2014

Links:
------
[1] http://mailman.mit.edu/mailman/listinfo/moses-support
[2] https://kheafield.com/code/kenlm/moses/
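For reference, the steps discussed in this thread can be put together roughly as follows. This is only a sketch under assumptions, not a tested recipe: the paths in BUILD_BINARY, ARPA and OUT are placeholders, the lm_stanza helper is hypothetical, and the moses.ini line assumes the old-style [lmodel-file] syntax (type, factor, order, path) with factor 0. Check the KenLM page linked above for the syntax your Moses version expects.

```shell
#!/bin/sh
# Sketch of the workflow from this thread. All paths are placeholders.
BUILD_BINARY="${BUILD_BINARY:-moses/bin/build_binary}"
ARPA="${ARPA:-lm.arpa}"
OUT="${OUT:-lm.kenlm}"

# Hypothetical helper: print an old-style moses.ini entry for a KenLM model.
# Fields are <type> <factor> <order> <path>; type 9 requests lazy (mmap)
# loading, type 8 loads the model eagerly. Factor 0 is an assumption;
# order 5 matches the 5-gram model discussed in the thread.
lm_stanza() {
    printf '[lmodel-file]\n9 0 5 %s\n' "$1"
}

if [ -x "$BUILD_BINARY" ] && [ -f "$ARPA" ]; then
    case "${1:-estimate}" in
        estimate)
            # No output file given: build_binary only prints size estimates.
            "$BUILD_BINARY" "$ARPA"
            ;;
        build)
            # Quantized, compressed trie model (flags copied from the message).
            "$BUILD_BINARY" trie -a 22 -b 8 -q 8 "$ARPA" "$OUT"
            ;;
    esac
else
    echo "build_binary or $ARPA not found; skipping binarization" >&2
fi

# Entry to paste into moses.ini once the binary model exists.
lm_stanza "$OUT"
```

Running the script with no argument only asks for size estimates; passing "build" runs the actual binarization, so you can look at the estimates before committing disk space and time.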
