Philipp, it's actually a system for the WMT16 tuning task - we're working with the provided models and ini. We binarized the models to cut loading time, but otherwise the system is as delivered.
-Jeremy

> On Apr 14, 2016, at 5:04 PM, [email protected] wrote:
>
> Send Moses-support mailing list submissions to
> [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mailman.mit.edu/mailman/listinfo/moses-support
> or, via email, send a message with subject or body 'help' to
> [email protected]
>
> You can reach the person managing the list at
> [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
> Today's Topics:
>
>   1. Re: Empty nbest entry - any way to force a translation? (Philipp Koehn)
>   2. Re: loading time for large LMs (Philipp Koehn)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 14 Apr 2016 16:53:11 -0400
> From: Philipp Koehn <[email protected]>
> Subject: Re: [Moses-support] Empty nbest entry - any way to force a translation?
> To: Hieu Hoang <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Message-ID: <CAAFADDCibj=OFoO=6gxwazlrbb5x9x7vk6m8bjgyp4js-au...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> is there any way to track down why it does not produce a translation for
> the sentence? This really should not happen in the phrase-based model...
>
> -phi
>
> On Thu, Apr 14, 2016 at 10:05 AM, Hieu Hoang <[email protected]> wrote:
>
>> If you're decoding with the normal pb algorithm, there's an argument
>> [stack_diversity]
>> + a positive number
>> I'm not exactly sure what this does, but I think it makes sure that all
>> coverages have some hypotheses. This will reduce the risk of the decoder
>> creating only dead ends.
>>
>> On 14/04/2016 17:31, Hieu Hoang wrote:
>>> You can probably do it in Manager.cpp line 1583, but you'll need to
>>> fill out the columns for total scores and individual ff scores, which
>>> is going to be arbitrary.
>>>
>>> It's probably best if this is handled downstream.
>>>
>>> btw, why is it getting no entries? Are there walls/zones in your input?
>>> Is the pruning harsher than normal?
>>>
>>> On 13/04/2016 23:57, Jeremy Gwinnup wrote:
>>>> Hi,
>>>>
>>>> We're running a system which unfortunately produces no entries for a
>>>> sentence in an n-best list. Is there any way to force output of a
>>>> translation in the n-best list even if it's the original sentence?
>>>>
>>>> Thanks!
>>>> -Jeremy
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> [email protected]
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160414/9f4e8e19/attachment-0001.html
>
> ------------------------------
>
> Message: 2
> Date: Thu, 14 Apr 2016 17:04:16 -0400
> From: Philipp Koehn <[email protected]>
> Subject: Re: [Moses-support] loading time for large LMs
> To: Ondrej Bojar <[email protected]>
> Cc: Jorg Tiedemann <[email protected]>, "[email protected]" <[email protected]>
> Message-ID: <CAAFADDC6eMbEA-46TAbTCRE9foJM=krfyucf0tb3ajc9p-k...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> I recently added to experiment.perl an option to first copy all big model
> files to local disk before running the decoder.
>
> For this, you just need to set the parameter
>   cache-model = "/scratch/disk/path"
> in the [GENERAL] section.
>
> This works well in our GridEngine setup.
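[Editor's note: as a sketch only, the setting Philipp describes would sit in the [GENERAL] section of an experiment.perl (EMS) config file roughly like this; the scratch path is illustrative, not a real one.]

```ini
; EMS config fragment (sketch): copy big model files to node-local
; disk before the decoder starts
[GENERAL]
cache-model = "/scratch/disk/path"
```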
>
> -phi
>
> On Tue, Apr 12, 2016 at 9:03 AM, Ondrej Bojar <[email protected]> wrote:
>
>> Hi,
>>
>> back to your question on getting the files onto local disks where tuning
>> jobs will run: this was never easy with the current implementation, but in
>> fact, with multithreaded moses, the benefit of parallelizing across nodes
>> is vanishing.
>>
>> So I'd pass some queue parameters to force the job to land on one of a
>> very few nodes that will have the files already there.
>>
>> Also, we have all our temps cross-mounted, so what I sometimes do is
>> let the job run anywhere but take the data from the local temp of another
>> fixed machine. Yes, this wastes network bandwidth but relieves the flooded
>> (or incapable) main file server.
>>
>> Cheers, O.
>>
>> ----- Original Message -----
>>> From: "Jorg Tiedemann" <[email protected]>
>>> To: "Kenneth Heafield" <[email protected]>
>>> Cc: [email protected]
>>> Sent: Tuesday, 12 April, 2016 14:45:57
>>> Subject: Re: [Moses-support] loading time for large LMs
>>
>>> Well, this is on a shared login node and maybe not very representative
>>> for other nodes in the cluster.
>>> I can see if I can get a more representative figure,
>>> but it's quite busy on our cluster right now...
>>>
>>> All the best,
>>> Jörg
>>>
>>> Jörg Tiedemann
>>> [email protected]
>>>
>>>> On 12 Apr 2016, at 14:54, Kenneth Heafield <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Why is your system using 7 GB of swap out of 9 GB? Moses is only
>>>> taking 147 GB out of 252 GB physical. I smell other processes taking up
>>>> RAM, possibly those 5 stopped and 1 zombie.
>>>>
>>>> Kenneth
>>>>
>>>> On 04/12/2016 12:45 PM, Jorg Tiedemann wrote:
>>>>>
>>>>>> Did you remove all "lazyken" arguments from moses.ini?
>>>>>
>>>>> Yes, I did.
>>>>>
>>>>>> Is the network filesystem Lustre?
>>>>>> If so, mmap will perform terribly and
>>>>>> you should use load=read or (better) load=parallel_read, since reading
>>>>>> from Lustre is CPU-bound.
>>>>>
>>>>> Yes, I think so. The parallel_read option sounds interesting. Can it
>>>>> hurt for some setups, or could I use it as my standard?
>>>>>
>>>>>> Does the cluster management software/job scheduler/sysadmin impose a
>>>>>> resident memory limit?
>>>>>
>>>>> I don't really know. I don't think so, but I need to find out.
>>>>>
>>>>>> Can you copy-paste `top' when it's running slow, and the stderr at that
>>>>>> time?
>>>>>
>>>>> Here is the top of my top when running on my test node:
>>>>>
>>>>> top - 14:39:03 up 50 days, 5:47, 0 users, load average: 1.97, 2.09, 3.85
>>>>> Tasks: 814 total, 3 running, 805 sleeping, 5 stopped, 1 zombie
>>>>> Cpu(s): 6.9%us, 6.2%sy, 0.0%ni, 86.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>>>>> Mem:  264493500k total, 263614188k used, 879312k free, 68680k buffers
>>>>> Swap: 9775548k total, 7198920k used, 2576628k free, 69531796k cached
>>>>>
>>>>>   PID USER     PR NI VIRT RES  SHR S %CPU  %MEM TIME+    COMMAND
>>>>> 42528 tiedeman 20  0  147g 147g 800 R 100.0 58.4 31:25.01 moses
>>>>>
>>>>> stderr doesn't say anything new besides the message from starting the
>>>>> feature function loading:
>>>>>
>>>>> FeatureFunction: LM0 start: 16 end: 16
>>>>> line=KENLM load=parallel_read name=LM1 factor=0
>>>>> path=/homeappl/home/tiedeman/research/SMT/wmt16/fi-en/data/monolingual/cc.tok.3.en.trie.kenlm
>>>>> order=3
>>>>>
>>>>> I'm trying with /tmp/ now as well (it takes time to shuffle around the
>>>>> big files, though).
>>>>>
>>>>> Jörg
>>>>>
>>>>>> On 04/12/2016 08:26 AM, Jorg Tiedemann wrote:
>>>>>>>
>>>>>>> No, it's definitely not waiting for input - the same setup works for
>>>>>>> smaller models.
>>>>>>>
>>>>>>> I have the models on a work partition on our cluster.
>>>>>>> This is probably not good enough, and I will try to move the data to
>>>>>>> local tmp on the individual nodes before executing.
>>>>>>> Hopefully this helps. How would you do this if you want to distribute
>>>>>>> tuning?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Jörg
>>>>>>>
>>>>>>>> On 12 Apr 2016, at 09:34, Ondrej Bojar <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Random suggestion: isn't it waiting for stdin for some strange
>>>>>>>> reason? ;-)
>>>>>>>>
>>>>>>>> O.
>>>>>>>>
>>>>>>>> On April 12, 2016 8:20:46 AM CEST, Hieu Hoang <[email protected]> wrote:
>>>>>>>>> I assume that it's on a local disk rather than a network drive.
>>>>>>>>>
>>>>>>>>> Are you sure it's still in the loading stage, and that it's loading
>>>>>>>>> kenlm rather than the pt or lexicalized reordering model etc.?
>>>>>>>>>
>>>>>>>>> If there's a way to make the model files available for download, or
>>>>>>>>> to give me access to your machine, I might be able to debug it.
>>>>>>>>>
>>>>>>>>> Hieu Hoang
>>>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>>> On 12 Apr 2016 08:41, "Jorg Tiedemann" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Unfortunately, load=read didn't help. It's been loading for 7 hours
>>>>>>>>>> now with no sign of starting to decode.
>>>>>>>>>> The disk is not terribly slow - cat worked without problem. I don't
>>>>>>>>>> know what to do, but I think I have to give up for now.
>>>>>>>>>> Am I the only one who is experiencing such slow loading times?
>>>>>>>>>>
>>>>>>>>>> Thanks again for your help!
>>>>>>>>>>
>>>>>>>>>> Jörg
>>>>>>>>>>
>>>>>>>>>> On 10 Apr 2016, at 22:27, Kenneth Heafield <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> With load=read:
>>>>>>>>>>
>>>>>>>>>> The model acts like normal RAM, as part of the Moses process.
>>>>>>>>>>
>>>>>>>>>> It supports huge pages via transparent huge pages, so it's slightly
>>>>>>>>>> faster.
>>>>>>>>>>
>>>>>>>>>> Before loading, cat file >/dev/null will just put things into cache
>>>>>>>>>> that were going to be read more or less like cat anyway.
>>>>>>>>>>
>>>>>>>>>> After loading, cat file >/dev/null will hurt, since there's the
>>>>>>>>>> potential to load the file into RAM twice and swap out bits of Moses.
>>>>>>>>>>
>>>>>>>>>> Memory is shared between threads, just not with the disk cache (ok,
>>>>>>>>>> maybe, but only if they get huge pages support to work well) or
>>>>>>>>>> other processes that independently read the file.
>>>>>>>>>>
>>>>>>>>>> With load=populate:
>>>>>>>>>>
>>>>>>>>>> Load upfront, map it into the process; the kernel seems to evict it
>>>>>>>>>> first.
>>>>>>>>>>
>>>>>>>>>> Before loading, cat file >/dev/null might help, but in theory
>>>>>>>>>> MAP_POPULATE should be doing much the same thing.
>>>>>>>>>>
>>>>>>>>>> After loading, or during slow loading, cat file >/dev/null can help
>>>>>>>>>> because it forces the data back into RAM. This is particularly
>>>>>>>>>> useful if the Moses process came under memory pressure after
>>>>>>>>>> loading, which can include heavy disk activity even if RAM isn't full.
>>>>>>>>>>
>>>>>>>>>> Memory is shared with all other processes that mmap.
>>>>>>>>>>
>>>>>>>>>> With load=lazy:
>>>>>>>>>>
>>>>>>>>>> Map into the process with lazy loading (i.e. mmap without
>>>>>>>>>> MAP_POPULATE).
>>>>>>>>>> Not recommended for decoding, but useful if you've got a 6 TB file
>>>>>>>>>> and want to send it a few thousand queries.
>>>>>>>>>>
>>>>>>>>>> cat will definitely help here at any time.
>>>>>>>>>>
>>>>>>>>>> Memory is shared with all other processes that mmap.
>>>>>>>>>>
>>>>>>>>>> On 04/10/2016 06:50 PM, Jorg Tiedemann wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks for the quick reply.
>>>>>>>>>> I will try the load option.
>>>>>>>>>>
>>>>>>>>>> Quick question: you said that the memory will not be shared across
>>>>>>>>>> processes with that option. Does that mean that it will load the LM
>>>>>>>>>> for each thread? That would mean a lot in my setup.
>>>>>>>>>>
>>>>>>>>>> By the way, I also did the cat >/dev/null thing, but I didn't have
>>>>>>>>>> the impression that this changed a lot. Does it really help, and how
>>>>>>>>>> much would you usually gain? Thanks again!
>>>>>>>>>>
>>>>>>>>>> Jörg
>>>>>>>>>>
>>>>>>>>>> On 10 Apr 2016, at 12:55, Kenneth Heafield <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm assuming you have enough RAM to fit everything. The kernel seems
>>>>>>>>>> to preferentially evict mmapped pages as memory usage approaches
>>>>>>>>>> full (it doesn't have to be full). To work around this, use
>>>>>>>>>>
>>>>>>>>>> load=read
>>>>>>>>>>
>>>>>>>>>> in your moses.ini line for the models. REMOVE any "lazyken"
>>>>>>>>>> argument, which is deprecated and might override the load= argument.
>>>>>>>>>>
>>>>>>>>>> The effect of load=read is to malloc (ok, anonymous mmap, which is
>>>>>>>>>> how malloc is implemented anyway) at a 1 GB aligned address (to
>>>>>>>>>> optimize for huge pages) and read() the file into that memory. It
>>>>>>>>>> will no longer share across processes, but memory will have the same
>>>>>>>>>> swappiness as the rest of the Moses process.
>>>>>>>>>>
>>>>>>>>>> Lazy loading will only make things worse here.
>>>>>>>>>>
>>>>>>>>>> Kenneth
>>>>>>>>>>
>>>>>>>>>> On 04/10/2016 07:29 AM, Jorg Tiedemann wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have a large language model from the Common Crawl data set and it
>>>>>>>>>> takes forever to load when running moses.
>>>>>>>>>> My model is a trigram kenlm binarized with quantization, trie
>>>>>>>>>> structures and pointer compression (-a 22 -q 8 -b 8).
>>>>>>>>>> The model is about 140 GB and it takes hours to load (I'm still
>>>>>>>>>> waiting). I run on a machine with 256 GB RAM ...
>>>>>>>>>>
>>>>>>>>>> I also tried lazy loading without success. Is this normal, or am I
>>>>>>>>>> doing something wrong?
>>>>>>>>>> Thanks for your help!
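[Editor's note: concretely, Kenneth's suggestion amounts to a moses.ini feature line like the sketch below. The path and factor/order values are taken from the stderr excerpt quoted earlier in the thread; only load=read is the change being recommended, and the exact line should be adapted to your own config.]

```ini
; moses.ini fragment (sketch): KenLM feature line with load=read
[feature]
KENLM load=read name=LM1 factor=0 order=3 path=/homeappl/home/tiedeman/research/SMT/wmt16/fi-en/data/monolingual/cc.tok.3.en.trie.kenlm
```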
>>>>>>>>>>
>>>>>>>>>> Jörg
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Moses-support mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Moses-support mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ondrej Bojar (mailto:[email protected] / [email protected])
>>>>>>>> http://www.cuni.cz/~obo
>>
>> --
>> Ondrej Bojar (mailto:[email protected] / [email protected])
>> http://www.cuni.cz/~obo
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160414/f8a01bda/attachment.html
>
> ------------------------------
>
> End of Moses-support Digest, Vol 114, Issue 39
> **********************************************
