Thanks for sharing! Looks like building my Moses system from scratch finally finished, so I'll be making some memory benchmarks today too.
Just so I understand, you ran separate MERT for each of your three cases? Then MERT randomness should explain the insignificant difference in BLEU between result 1 and result 3.

Kenneth

On 10/29/10 10:06, supp...@precisiontranslationtools.com wrote:
> Ken,
>
> Your new enhancements ROCK! Here are some numbers using rev 3675 and
> IRSTLM 5.50.01
>
> Machine: Core2Quad, 2.4 GHz, 4 GB RAM
> Data: EN-NL sample data, 37,500 segments (micro test sample)
> 3 gram LM, 3 gram tables (for fast testing)
>
> Train LM with SRILM &
> Train tables/tune/eval with
> Moses/SRILM
> multi-threading enabled: 75 minutes
> BLEU Score: 0.2531
>
> Train LM with IRSTLM
> Train tables/tune/eval with
> Moses/IRSTLM, binarized, memory-mapped,
> single thread: 195 minutes
> BLEU Score: 0.2496
>
> Train LM with IRSTLM (ARPA)
> Train tables/tune/eval with
> Moses/KenLM, binarized, memory-mapped,
> multi-threaded: 50 minutes
> BLEU Score: 0.2514
>
> On Wed, 27 Oct 2010 14:15:39 -0400, Kenneth Heafield <mo...@kheafield.com>
> wrote:
>> Revision 3671 introduces an updated version of kenlm. Queries are
>> faster now (no more string vocab lookups, state is kept so backoffs cost
>> less). The binary format has changed as a result; please rebuild your
>> binary files. Timing is forthcoming.
>>
>> Kenneth
>>
>> On 10/18/10 20:31, Kenneth Heafield wrote:
>>> Hi Moses,
>>>
>>> Introducing kenlm in Moses trunk. You no longer need to download a
>>> separate language model to use Moses; it's distributed with Moses and
>>> compiled in by default on UNIX. This is threadsafe language model
>>> inference code that returns the same probabilities as SRI (up to
>>> floating point rounding). It loads ARPA files in 2/3 the time SRI takes
>>> and uses less memory too. Using kenlm is simple: in your [lmodel-file]
>>> section, change the first digit to 8. For example,
>>>
>>> "0 0 2 foo.arpa" changes to "8 0 2 foo.arpa"
>>>
>>> For even faster loading, use the binary format:
>>>
>>> kenlm/build_binary foo.arpa foo.binary
>>>
>>> then simply provide the binary filename in your moses.ini, e.g.
>>> "8 0 2 foo.binary"; it auto-detects binary files using magic bytes at
>>> the beginning.
>>>
>>> The code is ready for use and provides correct results. Inference is
>>> slower than it should be due to inefficiencies in the Moses-side wrapper
>>> code (it does a vocab lookup for all 5 words every time). I'm working
>>> on it and once this is done I'll post some benchmarks against SRI and
>>> IRST. The binary format is subject to change, but contains a version
>>> number, so on the rare occasions when it changes, new versions will
>>> tell you to rebuild your binary files. Windows is currently not
>>> supported (it uses mmap), though I welcome contributions using #ifdef
>>> and CreateFileMapping.
>>>
>>> Have fun and let me know about your experiences with it.
>>>
>>> "Ken"
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
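
For anyone scripting the moses.ini change described in the quoted announcement (first field of each [lmodel-file] entry set to 8), here is a rough Python sketch. It is not part of Moses or kenlm; the function name `switch_lm_to_kenlm` is made up for illustration, and it assumes section headers start with "[" and comment lines start with "#".

```python
def switch_lm_to_kenlm(ini_text: str) -> str:
    """Return ini_text with the first field of every [lmodel-file] entry set to 8.

    Hypothetical helper, not part of Moses. Assumes "[...]" section headers
    and "#" comment lines; everything outside [lmodel-file] is left untouched.
    """
    out = []
    in_lm_section = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section begins; remember whether it is the LM section.
            in_lm_section = (stripped == "[lmodel-file]")
        elif in_lm_section and stripped and not stripped.startswith("#"):
            # Replace only the first field (the LM implementation code);
            # the remainder of the entry is kept exactly as written.
            parts = stripped.split(None, 1)
            parts[0] = "8"
            line = " ".join(parts)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    example = "[lmodel-file]\n0 0 2 foo.arpa\n\n[ttable-limit]\n20"
    print(switch_lm_to_kenlm(example))
```

The same entry works for a file produced by build_binary: only the filename in the last field changes, since the format is detected from the file itself.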