Re: [Moses-support] loading time for large LMs

Ondrej Bojar Mon, 11 Apr 2016 23:39:07 -0700

Random suggestion: isn't it waiting for stdin for some strange reason? ;-)

O.



On April 12, 2016 8:20:46 AM CEST, Hieu Hoang <[email protected]> wrote:
>I assume that it's on local disk rather than a network drive.
>
>Are you sure it's still in the loading stage, and that it's loading kenlm,
>rather than the pt or lexicalized reordering model etc?
>
>If there's a way to make the model files available for download or to give
>me access to your machine, I might be able to debug it.
>
>Hieu Hoang
>http://www.hoang.co.uk/hieu
>On 12 Apr 2016 08:41, "Jorg Tiedemann" <[email protected]> wrote:
>
>>
>> Unfortunately, load=read didn’t help. It’s been loading for 7 hours now
>> with no sign of decoding starting.
>> The disk is not terribly slow. cat worked without problems. I don’t know
>> what to do, but I think that I have to give up for now.
>> Am I the only one who is experiencing such slow loading times?
>>
>> Thanks again for your help!
>>
>> Jörg
>>
>> On 10 Apr 2016, at 22:27, Kenneth Heafield <[email protected]> wrote:
>>
>> With load=read:
>>
>> Acts like normal RAM as part of the Moses process.
>>
>> Supports huge pages via transparent huge pages, so it's slightly faster.
>>
>> Before loading, cat file >/dev/null will just put things into cache that
>> were going to be read more or less like cat anyway.
>>
>> After loading, cat file >/dev/null will hurt since there's the potential
>> to load the file into RAM twice and swap out bits of Moses.
>>
>> Memory is shared between threads, just not with the disk cache (ok
>> maybe, but only if they get huge pages support to work well) or other
>> processes that independently read the file.
>>
>> With load=populate:
>>
>> Loads upfront and maps the file into the process; the kernel seems to
>> evict these pages first.
>>
>> Before loading, cat file >/dev/null might help, but in theory
>> MAP_POPULATE should be doing much the same thing.
>>
>> After loading or during slow loading, cat file >/dev/null can help
>> because it forces the data back into RAM.  This is particularly useful
>> if the Moses process came under memory pressure after loading, which can
>> include heavy disk activity even if RAM isn't full.
>>
>> Memory is shared with all other processes that mmap.
>>
>> With load=lazy:
>>
>> Maps into the process with lazy loading (i.e. mmap without MAP_POPULATE).
>> Not recommended for decoding, but useful if you've got a 6 TB file and
>> want to send it a few thousand queries.
>>
>> cat will definitely help here at any time.
>>
>> Memory is shared with all other processes that mmap.
>>
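
The three load= modes described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not KenLM's actual code: the function names load_read, load_populate and load_lazy are made up here, the MAP_SHARED/PROT_READ flags are a guess at a reasonable read-only mapping, the 1 GB alignment used for huge pages is not reproduced, and MAP_POPULATE is only exposed by Python on Linux (3.10+), so the sketch falls back to a plain mmap elsewhere on Unix.

```python
import mmap
import os


def load_read(path):
    """Analogue of load=read: read() the file into anonymous memory."""
    size = os.path.getsize(path)
    buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by the file
    with open(path, "rb") as f, memoryview(buf) as view:
        done = 0
        while done < size:
            n = f.readinto(view[done:])  # copy file bytes into our own memory
            if not n:
                break
            done += n
    return buf


def load_populate(path):
    """Analogue of load=populate: file-backed mmap, prefaulted up front."""
    populate = getattr(mmap, "MAP_POPULATE", 0)  # 0 if the platform lacks it
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0,
                         flags=mmap.MAP_SHARED | populate,
                         prot=mmap.PROT_READ)


def load_lazy(path):
    """Analogue of load=lazy: file-backed mmap, pages faulted in on first use."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0,
                         flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ)
```

All three return an object that can be sliced like bytes. The difference lies in where the pages live: load_read holds a private copy that behaves like the rest of the process's memory, while the two mmap variants are backed by the page cache, which is why cat file >/dev/null interacts differently with each mode.
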
>> On 04/10/2016 06:50 PM, Jorg Tiedemann wrote:
>>
>> Thanks for the quick reply.
>> I will try the load option.
>>
>> Quick question: You said that the memory will not be shared across
>> processes with that option. Does that mean that it will load the LM for
>> each thread? That would mean a lot in my setup.
>>
>> By the way, I also did the cat >/dev/null thing but I didn’t have the
>> impression that this changed a lot. Does it really help and how much
>> would you usually gain? Thanks again!
>>
>>
>> Jörg
>>
>>
>> On 10 Apr 2016, at 12:55, Kenneth Heafield <[email protected]> wrote:
>>
>> Hi,
>>
>> I'm assuming you have enough RAM to fit everything.  The kernel seems
>> to preferentially evict mmapped pages as memory usage approaches full
>> (it doesn't have to be full).  To work around this, use
>>
>> load=read
>>
>> in your moses.ini line for the models.  REMOVE any "lazyken" argument,
>> which is deprecated and might override the load= argument.
>>
>> The effect of load=read is to malloc (ok, anonymous mmap which is how
>> malloc is implemented anyway) at a 1 GB aligned address (to optimize for
>> huge pages) and read() the file into that memory.  It will no longer
>> share across processes, but memory will have the same swappiness as the
>> rest of the Moses process.
>>
>> Lazy loading will only make things worse here.
>>
>> Kenneth
>>
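
The moses.ini change described above (add load=read, drop any lazyken argument) could be scripted roughly as follows. This is a minimal sketch: set_kenlm_load is a hypothetical helper, and the feature line in the example, including the name LM0 and the path /path/to/lm.binary, is a placeholder rather than anything taken from this thread.

```python
def set_kenlm_load(line, mode="read"):
    """Return a KENLM feature line with load=<mode> and no lazyken argument."""
    tokens = line.split()
    if not tokens or tokens[0] != "KENLM":
        return line  # not a KENLM feature line: leave it untouched
    # Drop the deprecated lazyken argument and any existing load= setting.
    kept = [t for t in tokens if not t.startswith(("lazyken=", "load="))]
    kept.append(f"load={mode}")
    return " ".join(kept)


# Placeholder feature line; name and path are illustrative only.
example = "KENLM name=LM0 factor=0 path=/path/to/lm.binary order=3 lazyken=1"
print(set_kenlm_load(example))
```

Rewritten KENLM lines come back without a trailing newline, so a caller processing a whole moses.ini would need to add it again when writing the file out.
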
>> On 04/10/2016 07:29 AM, Jorg Tiedemann wrote:
>>
>> Hi,
>>
>> I have a large language model from the common crawl data set and it
>> takes forever to load when running moses.
>> My model is a trigram kenlm binarized with quantization, trie structures
>> and pointer compression (-a 22 -q 8 -b 8).
>> The model is about 140GB and it takes hours to load (I’m still waiting).
>> I run on a machine with 256GB RAM ...
>>
>> I also tried lazy loading without success. Is this normal or am I doing
>> something wrong?
>> Thanks for your help!
>>
>> Jörg
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>

-- 
Ondrej Bojar (mailto:[email protected] / [email protected])
http://www.cuni.cz/~obo

