Hi Nelson, can you tell us how many sentences you have for each of the following?

a) Parallel training set: this is used for phrase extraction (or rule extraction in hierarchical models). Here you want as much data as you can get, since this set basically determines how much bilingual knowledge your model has.

b) Parallel tuning set: MERT iteratively optimizes the translation model's weights towards maximizing an evaluation metric (e.g. BLEU) on held-out parallel data (the tuning set, which is disjoint from the parallel training set). The tuning set usually has somewhere between 1,000 and 2,000 sentences; if you use much more than that, MERT will take far too long and you won't really get significant gains.

Cheers,
Wilker.

On 29 October 2012 20:31, Nelson Simao <[email protected]> wrote:

> Hi,
> The Chinese corpus has 669424 words, and the Portuguese one 678023 words.
> The terminal is currently running the 'mert' command.
> It is using 87% of memory and half of the swap. It is running on a small
> server at my college; I think it has 4GB of swap and 2GB of RAM.
>
> I'm going to read that now. Thanks Philipp!
>
> 2012/10/29 Philipp Koehn <[email protected]>
>
>> Hi,
>>
>> how big is your corpus in total (number of words)?
>> What step is currently processing?
>> Is there excessive memory use / swapping / etc.?
>>
>> There are various ways to speed things up by multi-threading
>> or other multi-core usage.
>> Check:
>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures
>>
>> -phi
>>
>> On Mon, Oct 29, 2012 at 12:01 PM, Nelson Simao <[email protected]>
>> wrote:
>> > Hi everyone!
>> >
>> > Now I'm having another problem with my translator. I trained it with
>> > just 1/4 of the corpus that I have here and tested it, but the
>> > translation results aren't as good as I expected. So now I'm trying to
>> > train with the whole corpus (because I think I will get better
>> > results), but the mert/moses commands have been running since
>> > 21 October... 8 days ago.
>> > I need to have the translator working properly as soon as possible,
>> > because it is part of a college assignment. Can someone help me with
>> > the problem of the training duration, and also give me some tips to
>> > get better results in the translation of pt->zn and zn->pt?
>> >
>> > Best regards!
>> > Nelson from Portugal.
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected]
>> > http://mailman.mit.edu/mailman/listinfo/moses-support

--
Wilker Aziz
http://pers-www.wlv.ac.uk/~in1676/
PhD candidate at The Research Group in Computational Linguistics
Research Institute of Information and Language Processing (RIILP)
University of Wolverhampton
MB108 Stafford Street
WOLVERHAMPTON WV1 1LY
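
P.S. In case it helps, here is a minimal sketch of how the training/tuning split in (a) and (b) could be done, assuming your corpus is stored as two sentence-aligned lists (line i of the source is the translation of line i of the target). It is written in Python only for illustration; the function name `split_parallel` and the parameters `tune_size` and `seed` are made up for this example and are not part of Moses.

```python
import random


def split_parallel(src_lines, tgt_lines, tune_size=1500, seed=0):
    """Split aligned sentence pairs into disjoint training and tuning sets.

    src_lines and tgt_lines must be aligned line by line.
    tune_size is the number of sentence pairs held out for tuning
    (hypothetical default, chosen inside the 1,000-2,000 range).
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    if tune_size >= len(src_lines):
        raise ValueError("tuning set must be smaller than the whole corpus")

    # Pick the held-out sentence indices at random, reproducibly.
    indices = list(range(len(src_lines)))
    random.Random(seed).shuffle(indices)
    tune_idx = set(indices[:tune_size])

    # Each pair goes to exactly one set, so the two sets never overlap
    # by position, and the source/target alignment is preserved.
    train, tune = [], []
    for i, pair in enumerate(zip(src_lines, tgt_lines)):
        if i in tune_idx:
            tune.append(pair)
        else:
            train.append(pair)
    return train, tune
```

You would read your two corpus files into lists of lines, call the function, and write each returned set back out as a pair of files. The lengths of the two returned lists are then the sentence counts asked about above. Note that the split is disjoint by position only: if the corpus contains duplicate sentence pairs, you may want to deduplicate it first.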
