Hi everyone,

I would like to build large LMs from the Google Web1T 5-gram corpus <http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13>. I used the goograms2ngrams.pl script from the IRSTLM toolkit to extract raw n-gram counts, but I don't know how to build an LM (e.g. an ARPA file) from those count files.
Has anyone dealt with this before? Any advice would be appreciated. Thanks in advance!

--
Cheers,
Vu
