Hi all,
I have trained an unpruned 5-gram language model on a large corpus of 5
billion words, resulting in an ARPA-format file of roughly 300GB (is this a
normal LM size for such a large monolingual corpus?). This is obviously too
big for running an SMT system.
I have read several papers whose systems use language models trained on
monolingual corpora of similar size. Could you give me some advice on how to
handle this so that it becomes feasible to run an SMT system?
I would appreciate your help a lot.
Best,
-- 

Best Regards,
Hoang Cuong
SMTNerd
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support