Hi Kenneth,
which number should I use in moses.ini, 8 or 9, if I build my LM with the parameters -q 8 -b 8?
I got this error when I ran moses:

In LanguageModelKen::Load: nGramOrder = 5 will be ignored.  Using whatever the file has.
terminate called after throwing an instance of 'lm::FormatLoadException'
  what():  File looks like it should be loaded with mmap, but the test values don't match.  Was it built on a different machine or with a different compiler?

I have the feeling that my moses version needs to be updated!

Thanks a lot
Marco

On Sat, Oct 8, 2011 at 1:02 PM, marco turchi <[email protected]> wrote:

> Thanks!
> I'm going to update my version.
>
> Cheers
> Marco
>
>
> On Sat, Oct 8, 2011 at 1:01 PM, Kenneth Heafield <[email protected]> wrote:
>
>> Fixed in revision 4314.  There's still an issue with some SRILM models
>> failing to build that I'll get to soon.
>>
>> On 10/08/11 11:52, marco turchi wrote:
>>
>> Hi,
>> thanks a lot for the answer.
>> Great, so I can use -m 2048 to build it. Do you think it is enough?
>>
>> Thanks again
>> Marco
>>
>> On Sat, Oct 8, 2011 at 12:46 PM, Kenneth Heafield <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> This looks like a bug in the trie implementation due to some recent
>>> changes I made for left state minimization.  I'll fix it soon.  A workaround
>>> is to pass a large -m option to build_binary.
>>>
>>> Sorry,
>>>
>>> Kenneth
>>>
>>>
>>> On 10/08/11 11:34, marco turchi wrote:
>>>
>>> Dear All,
>>> I'm trying to build a lm using a large dataset (> 11 M sentences). I have
>>> generated the Arpa format with irstlm and now I'd like to binarize it using
>>> kenlm.
>>>
>>> I have called the build_binary to estimate memory usage, and I got this
>>>
>>> Memory estimate:
>>> type    MB
>>> probing 16129 assuming -p 1.5
>>> trie     7462 without quantization
>>> trie     4361 assuming -q 8 -b 8 quantization
>>> trie     6440 assuming -a 22 array pointer compression
>>> trie     3339 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
>>>
>>> then I run the binarization in this way:
>>>
>>> /nfs/staging/turchmo/moses/kenlmNew/build_binary -i -t /tmp/ -q 8 -b 8 trie irstLM.ARPA.txt irstLanguageModel.binary.lm
>>>
>>> but I got this error:
>>>
>>> lm/search_trie.cc:409 in void lm::ngram::trie::<unnamed>::SanityCheckCounts(const std::vector<long unsigned int, std::allocator<long unsigned int> >&, const std::vector<long unsigned int, std::allocator<long unsigned int> >&) threw util::Exception'.
>>> Longest count should be constant but it changed from 289546423 to 289546405 Byte: 37297517525
>>>
>>> I have had a look into the mailing list, but I do not find any post with
>>> the same error.
>>>
>>> Any ideas?
>>>
>>> Thanks a lot
>>> Marco
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
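
For anyone comparing the options in the memory estimate quoted above, here is a minimal sketch that parses the table and lists the configurations that fit a given amount of RAM. It assumes the build_binary output looks exactly like the paste in this thread; the function names and the 6000 MB budget are hypothetical and only for illustration. As far as I understand, these figures describe the size of the finished binary model, not the -m sort buffer discussed above.

```python
import re

# Memory estimate exactly as pasted in the thread (values in MB).
ESTIMATE = """\
probing 16129 assuming -p 1.5
trie 7462 without quantization
trie 4361 assuming -q 8 -b 8 quantization
trie 6440 assuming -a 22 array pointer compression
trie 3339 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
"""

# One row: data structure name, size in MB, free-text note.
LINE = re.compile(r"^(probing|trie)\s+(\d+)\s+(.*)$")


def parse_estimate(text):
    """Return a list of (structure, megabytes, note) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append((match.group(1), int(match.group(2)), match.group(3)))
    return rows


def fits(rows, budget_mb):
    """Keep only the configurations whose estimate is within budget_mb."""
    return [row for row in rows if row[1] <= budget_mb]


rows = parse_estimate(ESTIMATE)
for structure, mb, note in fits(rows, 6000):  # 6000 MB is a hypothetical budget
    print(f"{structure:8s} {mb:6d} MB  {note}")
```

With the hypothetical 6000 MB budget, only the two quantized trie configurations remain, which matches the -q 8 -b 8 choice made in the original command.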
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
