Hi,

    In moses.ini, number 8 means prefault and number 9 means lazy mmap.  It's
a loading option and orthogonal to the data structure, so either one can be
used with a model built with -q 8 -b 8.
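
    To illustrate the difference between prefaulting and lazy mmap, here is a
minimal sketch of the general technique on Linux.  It is not KenLM's actual
loading code: the function name MapFile and its signature are made up for this
example, and MAP_POPULATE is a Linux-specific flag, so on other systems the
sketch falls back to a lazy mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not part of KenLM): map a whole file read-only.
// prefault=true asks the kernel to load the pages up front (MAP_POPULATE);
// prefault=false leaves pages to be faulted in lazily on first access.
// Returns nullptr on failure; on success *size_out receives the file size.
const void *MapFile(const char *path, bool prefault, std::size_t *size_out) {
  assert(path != nullptr && size_out != nullptr);
  int fd = open(path, O_RDONLY);
  if (fd == -1) return nullptr;
  struct stat st;
  if (fstat(fd, &st) == -1 || st.st_size == 0) {
    close(fd);
    return nullptr;
  }
  int flags = MAP_SHARED;
#ifdef MAP_POPULATE
  if (prefault) flags |= MAP_POPULATE;
#else
  (void)prefault;  // no prefault flag available here: always lazy
#endif
  const std::size_t size = static_cast<std::size_t>(st.st_size);
  void *mem = mmap(nullptr, size, PROT_READ, flags, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  if (mem == MAP_FAILED) return nullptr;
  *size_out = size;
  return mem;
}
```

    Either way the caller sees the same bytes; the choice only changes when
the pages are read from disk, which is why it is independent of the data
structure stored in the file.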

    Since the binary file is the in-memory representation, I do paranoid
checks to make sure your machine represents floats, 64-bit integers, and
such in the same way as the machine that built the file.  For example, a
32-bit build will have different alignment than a 64-bit build.  That check
is what is failing here.
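
    As a rough illustration of this kind of check, here is a minimal sketch.
It is not KenLM's real header: the struct SanityHeader, the tag kMagic, and
the functions WriteHeader and HeaderMatchesThisBuild are all hypothetical
names for this example.  The idea is to write a few values with known
contents in the builder's native layout, then verify on load that they read
back unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte format tag (9th byte is the string terminator).
static const char kMagic[9] = "sketchLM";

// Hypothetical header, written in the builder's native representation.
// Field offsets and padding depend on the compiler and ABI, which is
// exactly what the check is meant to detect.
struct SanityHeader {
  char magic[8];
  float one_float;               // always written as 1.0f
  std::uint16_t one_u16;         // always written as 1
  std::uint64_t one_u64;         // always written as 1
  std::uint64_t size_of_size_t;  // sizeof(std::size_t) on the builder
};

// Write the reference header into out, which must hold sizeof(SanityHeader).
void WriteHeader(unsigned char *out) {
  assert(out != nullptr);
  SanityHeader h;
  std::memset(&h, 0, sizeof(h));
  std::memcpy(h.magic, kMagic, sizeof(h.magic));
  h.one_float = 1.0f;
  h.one_u16 = 1;
  h.one_u64 = 1;
  h.size_of_size_t = sizeof(std::size_t);
  std::memcpy(out, &h, sizeof(h));
}

// Return true only if the bytes decode to the expected values on this build.
bool HeaderMatchesThisBuild(const unsigned char *bytes, std::size_t length) {
  assert(bytes != nullptr);
  if (length < sizeof(SanityHeader)) return false;
  SanityHeader got;
  std::memcpy(&got, bytes, sizeof(got));
  return std::memcmp(got.magic, kMagic, sizeof(got.magic)) == 0 &&
         got.one_float == 1.0f && got.one_u16 == 1 && got.one_u64 == 1 &&
         got.size_of_size_t == sizeof(std::size_t);
}
```

    A file written by a build with a different word size, byte order, or
struct layout would put these values at different offsets or in a different
encoding, so the comparison fails instead of letting the decoder read garbage.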

    Please try running build_binary and moses from the same build.  If that
doesn't work, please send me the first kilobyte of your binary file.

    Also, if you have Boost, can you run cd kenlm && make clean && ./test.sh
and report any test failures?

Kenneth

On 10/10/11 17:31, marco turchi wrote:
> Hi Kenneth,
> Which number shall I use in moses.ini, 8 or 9, if I build my LM with
> these parameters: -q 8 -b 8?
>
> I got this error when I ran moses:
> In LanguageModelKen::Load: nGramOrder = 5 will be ignored.  Using
> whatever the file has.
> terminate called after throwing an instance of 'lm::FormatLoadException'
>   what():  File looks like it should be loaded with mmap, but the test
> values don't match.  Was it built on a different machine or with a
> different compiler?
>
> I have the feeling that my moses version needs to be updated!
>
> Thanks a lot
> Marco
>
> On Sat, Oct 8, 2011 at 1:02 PM, marco turchi wrote:
>
>     Thanks!
>     I'm going to update my version.
>
>     Cheers
>     Marco
>
>
>     On Sat, Oct 8, 2011 at 1:01 PM, Kenneth Heafield wrote:
>
>         Fixed in revision 4314.  There's still an issue with some
>         SRILM models failing to build that I'll get to soon. 
>
>         On 10/08/11 11:52, marco turchi wrote:
>>         Hi,
>>         thanks a lot for the answer.
>>         Great, so I can use -m 2048 to build it. Do you think it is
>>         enough?
>>
>>         Thanks again
>>         Marco
>>
>>         On Sat, Oct 8, 2011 at 12:46 PM, Kenneth Heafield wrote:
>>
>>             Hi,
>>
>>                 This looks like a bug in the trie implementation due
>>             to some recent changes I made for left state
>>             minimization.  I'll fix it soon.  A workaround is to pass
>>             a large -m option to build_binary. 
>>
>>             Sorry,
>>
>>             Kenneth
>>
>>
>>             On 10/08/11 11:34, marco turchi wrote:
>>>             Dear All,
>>>             I'm trying to build an LM using a large dataset (> 11 M
>>>             sentences). I have generated the ARPA file with IRSTLM
>>>             and now I'd like to binarize it using KenLM.
>>>
>>>             I called build_binary to estimate memory usage and got
>>>             this:
>>>
>>>             Memory estimate:
>>>             type       MB
>>>             probing 16129 assuming -p 1.5
>>>             trie     7462 without quantization
>>>             trie     4361 assuming -q 8 -b 8 quantization
>>>             trie     6440 assuming -a 22 array pointer compression
>>>             trie     3339 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
>>>
>>>             Then I ran the binarization this way:
>>>
>>>             /nfs/staging/turchmo/moses/kenlmNew/build_binary -i -t /tmp/ -q 8 -b 8 trie irstLM.ARPA.txt irstLanguageModel.binary.lm
>>>
>>>             but I got this error:
>>>
>>>             lm/search_trie.cc:409 in void
>>>             lm::ngram::trie::<unnamed>::SanityCheckCounts(const
>>>             std::vector<long unsigned int, std::allocator<long
>>>             unsigned int> >&, const std::vector<long unsigned int,
>>>             std::allocator<long unsigned int> >&) threw
>>>             util::Exception'.
>>>             Longest count should be constant but it changed from
>>>             289546423 to 289546405 Byte: 37297517525
>>>
>>>             I have looked through the mailing list, but I did not
>>>             find any post with the same error.
>>>
>>>             Any ideas?
>>>
>>>             Thanks a lot
>>>             Marco
>>>
>>>
>>>             _______________________________________________
>>>             Moses-support mailing list
>>>             [email protected] <mailto:[email protected]>
>>>             http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>>
>>
>
>
>
