Hi Nelson,

Can you tell us how many sentences you have for each of the following?

a) parallel training set: this is used for phrase extraction (or rule
extraction in hierarchical models). Here you want as much data as you can
get, as this is the set that basically determines how much bilingual
knowledge your model has.

b) parallel tuning set: MERT iteratively optimizes the model's feature
weights towards maximizing an evaluation metric (e.g. BLEU) on held-out
parallel data (the tuning set, which is disjoint from the parallel training
set). The tuning set usually has somewhere from 1,000 to 2,000 sentences. If
you are using much more than that, your MERT will take way too long and you
won't really get significant gains.
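
In case it helps, here is a minimal Python sketch of how such a split could be
done. It assumes the corpus is held as two line-aligned lists (one sentence per
entry, same order on both sides). The function name split_parallel, the
default of 1,500 tuning sentences, and the seed are only illustrative choices
of mine, not anything that Moses provides.

```python
import random


def split_parallel(src_lines, tgt_lines, tune_size=1500, seed=13):
    """Split line-aligned sentence pairs into disjoint train and tuning sets.

    Hypothetical helper for illustration; not part of Moses.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    if not 0 <= tune_size <= len(src_lines):
        raise ValueError("tune_size must be between 0 and the corpus size")

    # Pick the tuning sentences at random, reproducibly for a given seed.
    indices = list(range(len(src_lines)))
    random.Random(seed).shuffle(indices)
    tune_idx = set(indices[:tune_size])

    # Every pair goes to exactly one of the two sets, so they never
    # share a line position; original corpus order is preserved.
    train, tune = [], []
    for i, pair in enumerate(zip(src_lines, tgt_lines)):
        (tune if i in tune_idx else train).append(pair)
    return train, tune


if __name__ == "__main__":
    # Toy data standing in for the real Portuguese and Chinese files.
    src = ["frase %d" % i for i in range(10)]
    tgt = ["sentence %d" % i for i in range(10)]
    train, tune = split_parallel(src, tgt, tune_size=3)
    print(len(train), "training pairs,", len(tune), "tuning pairs")
```

Note that the split is by line position, so if the same sentence pair occurs
more than once in the corpus, a copy could still end up on both sides. Removing
duplicates first avoids that.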

Cheers,

Wilker.

On 29 October 2012 20:31, Nelson Simao <[email protected]> wrote:

> Hi,
>  The Chinese corpus has 669424 words, and the Portuguese one has 678023
>  words. The 'mert' command is currently running in the terminal.
>  It is using 87% of memory and half of the swap. It is running on a small
> server at my college; I think it has 4 GB of swap and 2 GB of RAM.
>
> I'm going to read that now. Thanks Philipp!
>
>
>
>
> 2012/10/29 Philipp Koehn <[email protected]>
>
>> Hi,
>>
>> how big is your corpus in total (number of words)?
>> What step is currently processing?
>> Is there excessive memory use / swapping / etc.?
>>
>> There are various ways to speed things up by multi-threading
>> or other multi-core usage.
>> Check:
>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures
>>
>> -phi
>>
>> On Mon, Oct 29, 2012 at 12:01 PM, Nelson Simao <[email protected]>
>> wrote:
>> > Hi everyone!
>> >
>> > Now I'm having another problem with my translator. I trained it with just
>> > 1/4 of the corpus that I have here and tested it, but the translation
>> > results aren't as good as I expected. So now I'm trying to train with the
>> > whole corpus (because I think I will get better results), but the
>> > mert/moses commands have been running since 21 October, 8 days ago.
>> > I need to have the translator working properly as soon as possible,
>> > because it is part of a college assignment. Can someone help me with the
>> > problem of the training duration, and also give me some tips to get
>> > better results in the translation of pt->zn and zn->pt?
>> >
>> >
>> > Best regards!
>> > Nelson from Portugal.
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected]
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >
>>
>


-- 
Wilker Aziz
http://pers-www.wlv.ac.uk/~in1676/

PhD candidate at The Research Group in Computational Linguistics
Research Institute of Information and Language Processing (RIILP)
University of Wolverhampton
MB108
Stafford Street
WOLVERHAMPTON WV1 1LY