Re: [Moses-support] Problem training a portuguese/chinese translator - Part 2

Kenneth Heafield Tue, 30 Oct 2012 18:37:01 -0700

Hi,

Please see http://www.statmt.org/moses/?n=Moses.Optimize, which I just
updated.  Note that "memory-mapped kenlm" is the same thing as
"on-demand" in the language model section.


Kenneth

On 10/30/12 20:48, Nelson Simao wrote:
> Phi, I looked at the tuning/tmp.* directory and no new files were
> being produced; the latest date was 24 October, so I stopped the
> process and started it again. What are memory-mapped kenlm and on-disk
> translation tables?
>
> Hi Wilker!
> Sentences? I only know the word counts, so I have to find a way to
> count the sentences...
> And the set I'm using for training is the same one I use for tuning:
> 1/4 of my parallel corpus.
>
>
>
> 2012/10/30 Wilker Aziz <[email protected]>
>
>     Hi Nelson,
>
>     can you tell us how many sentences you have for the following?
>
>     a) parallel training set: this is used for phrase extraction (or
>     rule extraction in hierarchical models), here you want to have as
>     much data as you can as this is the set that will basically
>     determine how much bilingual knowledge your model has.
>
>     b) parallel tuning set: MERT iteratively optimizes the weights of
>     the translation model to maximize an evaluation metric (e.g. BLEU)
>     on held-out parallel data (the tuning set, which is disjoint from
>     the parallel training set). The tuning set usually has 1,000 to
>     2,000 sentences; if you use much more than that, MERT will take far
>     too long and you won't see significant gains.
>
>     Cheers,
>
>     Wilker.
>
>
>
>
>
>
>     On 29 October 2012 20:31, Nelson Simao <[email protected]> wrote:
>
>         Hi,
>           The Chinese corpus has 669424 words, and the Portuguese 678023
>         words.
>           The terminal shows the 'mert' command running.
>           It is using 87% of memory and half of the swap. It is running on
>         a small server at my college; I think it has 4 GB of swap and
>         2 GB of RAM.
>
>         I'm going to read that now. Thanks Philipp!
>
>
>
>
>         2012/10/29 Philipp Koehn <[email protected]>
>
>             Hi,
>
>             how big is your corpus in total (number of words)?
>             What step is it currently processing?
>             Is there excessive memory use / swapping / etc.?
>
>             There are various ways to speed things up by multi-threading
>             or other multi-core usage.
>             Check:
>             http://www.statmt.org/moses/?n=Moses.AdvancedFeatures
>
>             -phi
>
>             On Mon, Oct 29, 2012 at 12:01 PM, Nelson Simao
>             <[email protected]> wrote:
>              > Hi everyone!
>              >
>              > Now I'm having another problem with my translator. I trained
>              > it with just 1/4 of the corpus that I have here and tested
>              > it, but the translation results aren't as good as I
>              > expected. So now I'm trying to train with the whole corpus
>              > (because I think I will get better results), but the
>              > mert/moses commands have been running since 21 October,
>              > 8 days ago.
>              > I need to have the translator working properly as soon as
>              > possible, because it is part of a college assignment. Can
>              > someone help me with the training duration, and also give
>              > me some tips to get better results in the translation of
>              > pt->zh and zh->pt?
>              >
>              >
>              > Best regards!
>              > Nelson from Portugal.
>              >
>              > _______________________________________________
>              > Moses-support mailing list
>              > [email protected]
>              > http://mailman.mit.edu/mailman/listinfo/moses-support
>              >
>
>
>
>
>
>
>
>     --
>     Wilker Aziz
>     http://pers-www.wlv.ac.uk/~in1676/
>
>     PhD candidate at The Research Group in Computational Linguistics
>     Research Institute of Information and Language Processing (RIILP)
>     University of Wolverhampton
>     MB108
>     Stafford Street
>     WOLVERHAMPTON WV1 1LY
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
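
The two practical questions in the quoted thread (how to count sentences, and how to get a tuning set that is disjoint from the training set) can be sketched in Python. This is only a rough sketch under stated assumptions: the corpus is taken to be line-aligned with one sentence per line, and the function names, the default tune_size of 1000, and the seed are made up for illustration. None of this is part of Moses itself.

```python
import random


def count_sentences(lines):
    """Count sentences, assuming one sentence per non-empty line."""
    return sum(1 for line in lines if line.strip())


def split_train_tune(src_lines, tgt_lines, tune_size=1000, seed=0):
    """Split a line-aligned parallel corpus into disjoint train/tune sets.

    Returns (train, tune), each a list of (source, target) pairs.
    tune_size and seed are illustrative defaults, not Moses settings.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    if not 0 < tune_size < len(src_lines):
        raise ValueError("tune_size must be between 1 and corpus size - 1")

    # Pick the held-out sentence indices at random, reproducibly.
    indices = list(range(len(src_lines)))
    random.Random(seed).shuffle(indices)
    tune_idx = set(indices[:tune_size])

    train, tune = [], []
    for i, pair in enumerate(zip(src_lines, tgt_lines)):
        # Each index goes to exactly one set, so the sets cannot overlap.
        (tune if i in tune_idx else train).append(pair)
    return train, tune


# Tiny in-memory example standing in for the real corpus files.
src = [f"frase {i}" for i in range(10)]
tgt = [f"sentence {i}" for i in range(10)]
train, tune = split_train_tune(src, tgt, tune_size=3, seed=1)
print(count_sentences(src), len(train), len(tune))
```

For real files, the line lists could be read with something like open(path, encoding="utf-8").read().splitlines(), and the two halves of each pair written back out to separate training and tuning files before running the training and tuning steps.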
