Re: [Moses-support] loading time for large LMs

Hieu Hoang Tue, 12 Apr 2016 00:59:28 -0700

There have been reports of problems when using memory mapping with distributed
file systems. I'm not sure whether that applies to every network FS. You should
definitely try it on local disk.
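
One way to try the local-disk suggestion on a cluster is to copy the model to node-local scratch space before starting the decoder. The following is only a minimal sketch of that idea: the function name `stage_to_local` and the directory layout are hypothetical and not part of Moses or KenLM.

```python
import os
import shutil


def stage_to_local(src_path, local_dir):
    """Copy src_path into local_dir unless an up-to-date copy is already there."""
    os.makedirs(local_dir, exist_ok=True)
    dst_path = os.path.join(local_dir, os.path.basename(src_path))
    src_stat = os.stat(src_path)
    if os.path.exists(dst_path):
        dst_stat = os.stat(dst_path)
        # Same size and at least as new as the source: reuse the existing copy.
        if (dst_stat.st_size == src_stat.st_size
                and dst_stat.st_mtime >= src_stat.st_mtime):
            return dst_path
    # Copy under a temporary name first so that an interrupted copy never
    # leaves a truncated file under the final name.
    tmp_path = "%s.%d.partial" % (dst_path, os.getpid())
    shutil.copy2(src_path, tmp_path)  # copy2 also preserves the mtime
    os.replace(tmp_path, dst_path)
    return dst_path
```

The returned path is what the model's path= entry would then have to point to on that node.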




Hieu Hoang
http://www.hoang.co.uk/hieu

On 12 April 2016 at 11:26, Jorg Tiedemann <[email protected]> wrote:

>
> No, it’s definitely not waiting for input … the same setup works for
> smaller models.
>
> I have the models on a work partition on our cluster.
> This is probably not good enough and I will try to move data to local tmp
> on the individual nodes before executing.
> Hopefully this helps. How would you do this if you want to distribute
> tuning?
>
> Thanks!
> Jörg
>
>
>
>
>
> On 12 Apr 2016, at 09:34, Ondrej Bojar <[email protected]> wrote:
>
> Random suggestion: isn't it waiting for stdin for some strange reason? ;-)
>
> O.
>
>
> On April 12, 2016 8:20:46 AM CEST, Hieu Hoang <[email protected]> wrote:
>
> I assume that it's on local disk rather than a network drive.
>
> Are you sure it's still in the loading stage, and that it's loading KenLM
> rather than the phrase table, lexicalized reordering model, etc.?
>
> If there's a way to make the model files available for download, or to give
> me access to your machine, I might be able to debug it.
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
> On 12 Apr 2016 08:41, "Jorg Tiedemann" <[email protected]> wrote:
>
>
> Unfortunately, load=read didn’t help. It’s been loading for 7 hours now
> and there is no sign of decoding starting.
> The disk is not terribly slow. cat worked without problems. I don’t know
> what to do, but I think that I have to give up for now.
> Am I the only one who is experiencing such slow loading times?
>
> Thanks again for your help!
>
> Jörg
>
>
>
>
>
> On 10 Apr 2016, at 22:27, Kenneth Heafield <[email protected]> wrote:
>
>
> With load=read:
>
> Acts like normal RAM as part of the Moses process.
>
> Supports huge pages via transparent huge pages, so it's slightly faster.
>
> Before loading, cat file >/dev/null will just put things into cache that
> were going to be read more or less like cat anyway.
>
> After loading, cat file >/dev/null will hurt since there's the potential
> to load the file into RAM twice and swap out bits of Moses.
>
> Memory is shared between threads, just not with the disk cache (ok
> maybe, but only if they get huge pages support to work well) or other
> processes that independently read the file.
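
The load=read behaviour described above (anonymous memory filled by read()) can be illustrated roughly as follows. This is a simplified sketch and not KenLM's actual code: the function name `load_by_read` is made up for illustration, and no attempt is made at the 1 GB alignment or huge-page handling mentioned elsewhere in the thread.

```python
import mmap
import os


def load_by_read(path, chunk_size=1 << 20):
    """Read a whole file into an anonymous memory region."""
    size = os.path.getsize(path)
    # fileno -1 requests an anonymous mapping: it is not backed by the file,
    # so its pages behave like ordinary process memory. A mapping cannot be
    # empty, hence the max().
    region = mmap.mmap(-1, max(size, 1))
    view = memoryview(region)
    offset = 0
    with open(path, "rb", buffering=0) as f:
        while offset < size:
            # read() the next chunk straight into the anonymous region
            n = f.readinto(view[offset:offset + chunk_size])
            if not n:
                raise IOError("file shrank while reading")
            offset += n
    view.release()
    return region, size
```

Because the data now lives in anonymous memory, a second process doing the same thing holds its own copy, which matches the "no longer shared across processes" caveat.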
>
> With load=populate:
>
> Load upfront, map it into the process; the kernel seems to evict it first.
>
> Before loading, cat file >/dev/null might help, but in theory
> MAP_POPULATE should be doing much the same thing.
>
> After loading or during slow loading, cat file >/dev/null can help
> because it forces the data back into RAM.  This is particularly useful
> if the Moses process came under memory pressure after loading, which can
> include heavy disk activity even if RAM isn't full.
>
> Memory is shared with all other processes that mmap.
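
The difference between a populated and a lazy mapping comes down to a single mmap flag. The sketch below shows this under some assumptions: `map_file` is a hypothetical helper, the `flags`/`prot` arguments are Unix-only, and `mmap.MAP_POPULATE` is only exposed by recent Python versions on Linux, so the code falls back to a lazy mapping when it is missing.

```python
import mmap
import os


def map_file(path, populate):
    """Map a file read-only; optionally ask the kernel to prefault it."""
    flags = mmap.MAP_SHARED
    if populate:
        # MAP_POPULATE is Linux-specific; without it the mapping is lazy and
        # pages are only read from disk when they are first touched.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 maps the whole file.
        return mmap.mmap(fd, 0, flags=flags, prot=mmap.PROT_READ)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
```

With populate=False this corresponds to the lazy case: the call returns immediately and the cost is paid on first access to each page.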
>
> With load=lazy:
>
> Map into the process with lazy loading (i.e. mmap without MAP_POPULATE).
>
> Not recommended for decoding, but useful if you've got a 6 TB file and
> want to send it a few thousand queries.
>
> cat will definitely help here at any time.
>
> Memory is shared with all other processes that mmap.
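
For completeness, the cat file >/dev/null trick mentioned for all three modes simply reads the file sequentially and throws the data away, which pulls it into the kernel's page cache. A rough equivalent, with the hypothetical name `warm_page_cache`:

```python
def warm_page_cache(path, chunk_size=1 << 24):
    """Read a file sequentially and discard the data, like cat file >/dev/null."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total  # number of bytes read
```

Whether this helps or hurts depends on the load mode, as described above: it is useful with mmap-based loading and counterproductive after load=read.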
>
> On 04/10/2016 06:50 PM, Jorg Tiedemann wrote:
>
> Thanks for the quick reply.
> I will try the load option.
>
> Quick question: You said that the memory will not be shared across
> processes with that option. Does that mean that it will load the LM for
> each thread? That would mean a lot in my setup.
>
> By the way, I also did the cat >/dev/null thing but I didn’t have the
> impression that this changed a lot. Does it really help and how much
> would you usually gain? Thanks again!
>
>
> Jörg
>
>
> On 10 Apr 2016, at 12:55, Kenneth Heafield <[email protected]> wrote:
>
> Hi,
>
> I'm assuming you have enough RAM to fit everything.  The kernel seems
> to preferentially evict mmapped pages as memory usage approaches full
> (it doesn't have to be full).  To work around this, use
>
> load=read
>
> in your moses.ini line for the models.  REMOVE any "lazyken" argument
> which is deprecated and might override the load= argument.
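
The edit described above (add load=read, drop lazyken) can be applied to a feature line mechanically. The helper below is a sketch only: `use_load_read` is a made-up name, and it assumes the KENLM line in moses.ini is a whitespace-separated list of key=value arguments after the feature name, with no spaces inside the path.

```python
def use_load_read(feature_line):
    """Return a KENLM feature line with lazyken removed and load=read set."""
    tokens = feature_line.split()
    if not tokens or tokens[0] != "KENLM":
        return feature_line  # leave other feature lines untouched
    # Drop any lazyken= argument and any existing load= argument.
    kept = [t for t in tokens[1:]
            if not t.startswith("lazyken=") and not t.startswith("load=")]
    return " ".join([tokens[0]] + kept + ["load=read"])
```

For an illustrative line such as `KENLM name=LM0 factor=0 path=/tmp/lm.binary order=3 lazyken=1` (the names and path here are placeholders), the helper returns the same line without lazyken=1 and with load=read appended.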
>
> The effect of load=read is to malloc (ok, anonymous mmap which is how
> malloc is implemented anyway) at a 1 GB aligned address (to optimize for
> huge pages) and read() the file into that memory.  It will no longer
> share across processes, but memory will have the same swappiness as the
> rest of the Moses process.
>
> Lazy loading will only make things worse here.
>
> Kenneth
>
> On 04/10/2016 07:29 AM, Jorg Tiedemann wrote:
>
> Hi,
>
> I have a large language model from the common crawl data set and it
> takes forever to load when running moses.
> My model is a trigram kenlm binarized with quantization, trie structures
> and pointer compression (-a 22 -q 8 -b 8).
> The model is about 140GB and it takes hours to load (I’m still waiting).
> I run on a machine with 256GB RAM ...
>
> I also tried lazy loading without success. Is this normal or am I doing
> something wrong?
> Thanks for your help!
>
> Jörg
>
>
>
>
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
> --
> Ondrej Bojar ([email protected] / [email protected])
> http://www.cuni.cz/~obo
>
>
>

