Hi,
Higher orders mean many more distinct n-grams, so I'm guessing you're
running low on RAM. Are you referring to estimating or querying?
To estimate such a model from data, you simply pass the -o option to
lmplz, as you already do. lmplz also lets you cap its memory usage.
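A typical invocation, assuming lmplz is on your PATH (corpus and
output file names are illustrative), might look like:

```shell
# Estimate an 8-gram model, capping lmplz at 8 GB of RAM (-S)
# and spilling intermediate sort files to a temp directory (-T).
lmplz -o 8 -S 8G -T /tmp <corpus.txt >model.arpa
```

-S also accepts a percentage of physical memory, e.g. -S 80%.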
To query models of order above 6, you will need to recompile with e.g.
--max-kenlm-order=8 .
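For a standalone KenLM checkout built with CMake, the equivalent is
the KENLM_MAX_ORDER cache variable (a sketch; paths assume you are in
the kenlm source directory):

```shell
# Rebuild KenLM with compile-time support for up to 8-gram models.
# KENLM_MAX_ORDER raises the maximum order baked in at compile time.
mkdir -p build && cd build
cmake .. -DKENLM_MAX_ORDER=8
make -j4
```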
Regarding compression, take a look at
http://kheafield.com/code/kenlm/structures/
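The trie data structure with quantization, built via build_binary, is
the usual way to shrink a model; a minimal sketch (bit widths are
illustrative, file names assumed):

```shell
# Convert the ARPA file to KenLM's compressed trie format:
#   -q 8  quantize probabilities to 8 bits
#   -b 8  quantize backoff weights to 8 bits
build_binary -q 8 -b 8 trie model.arpa model.binary
```

Quantization trades a small amount of accuracy for a much smaller
memory footprint.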
All that said, I doubt higher orders will buy you much on a 500 MB
data set.
Kenneth
On 05/18/2015 09:48 PM, koormoosh wrote:
> Hello,
>
> I wonder why it takes a lot of time to do language modelling with kenlm
> and srilm when n goes beyond 6 (even on a relatively small dataset: 500
> MB), and is there a way to actually do high-order (6,7,8-gram) language
> modelling with srilm and kenlm on a laptop (12GB RAM)? I assume there is
> a flag somewhere that I need to set when creating the arpa or binary
> file, or during the test (computing the perplexity etc...).
>
> Thanks,
> -K
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support