Hi Ulrich,
Sorry for sending my questions to you directly.  I will keep in mind to post
my future queries to moses-support.

Thanks a lot for the clarification. Let me experiment with MERT.

Thanks and regards,
sandipan

On 24 October 2014 01:51, Ulrich Germann <[email protected]> wrote:

> Hi Sandipan,
>
> first, please post Moses-related questions to [email protected], not
> individual contributors.
>
> second, the current seven features used by Mmsapt /
> PhraseDictionaryBitextSampling are the following (for details, see my
> recent paper on this phrase table implementation:
> https://www.researchgate.net/publication/267270863_Dynamic_Phrase_Tables_for_Machine_Translation_in_an_Interactive_Post-editing_Scenario
> ):
>
> THE STANDARD SET OF FEATURES MAY CHANGE AT ANY TIME, as this is still work
> in progress.
>
> - forward and backward lexically smoothed phrase scores (2 scores; same as
> standard features)
> - rarity penalty (1/(x+1)), where x is the number of phrase pair
> occurrences in the corpus/sample (1 score)
> - the lower bound on forward and backward phrase-level probabilities, with
> confidence level .99 (2 scores)
> - 2 provenance features (x/(x+1)), where x is the number of phrase pair
> occurrences in the (static) background and (dynamic) foreground corpus (2
> scores)
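
[Editor's note: a minimal sketch of the count-based feature formulas listed above. The rarity and provenance formulas are taken directly from the list; the lower-bound computation in Mmsapt is not specified here, so the Wilson score bound below is only an illustrative stand-in for "a lower confidence bound on a proportion at level .99", not the actual implementation.]

```python
import math

def rarity_penalty(x):
    """Rarity penalty 1/(x+1), where x is the number of
    phrase pair occurrences in the corpus/sample."""
    return 1.0 / (x + 1)

def provenance(x):
    """Provenance feature x/(x+1): approaches 1 as the
    phrase pair is seen more often in the given corpus."""
    return x / (x + 1.0)

def lower_bound(successes, trials, z=2.326):
    """One-sided lower confidence bound on a binomial proportion
    (Wilson score interval; z=2.326 corresponds to ~99% one-sided).
    Illustrative only -- Mmsapt's actual bound may be computed
    differently."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    center = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials
                           + z * z / (4 * trials * trials))
    return max(0.0, (center - margin) / denom)
```

Note how the lower bound rewards phrase pairs with more supporting evidence: at the same relative frequency, a pair seen 50/100 times gets a higher bound than one seen 5/10 times.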
>
> third, you need to retrain the feature weights with any of the standard
> techniques for good performance; I usually use MERT. The executable
> simulate-pe allows you to feed in references and word alignments one
> sentence at a time; there are additional parameters
> --spe-src, --spe-trg, --spe-aln to specify source, target, and alignment
> (symal output format). Source and target files are one sentence per line,
> tokenized. Michael Denkowski is currently in the process of integrating
> online tuning into Moses, but I'm not sure whether that's ready to be
> deployed yet.
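
[Editor's note: a hedged sketch of the simulate-pe invocation described above. The file names and the $MOSES path are placeholders; the --spe-* options are those named in the mail, but check your Moses build for the exact binary location and remaining decoder options.]

```shell
# Illustrative only -- paths are placeholders and option behaviour
# may differ across Moses versions.
#
# new-src.txt / new-trg.txt : one tokenized sentence per line
# new.aln                   : word alignments in symal output format
$MOSES/bin/simulate-pe \
    -f moses.ini \
    --spe-src new-src.txt \
    --spe-trg new-trg.txt \
    --spe-aln new.aln
```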
>
> Regards - Uli
>
>
>
> On Thu, Oct 23, 2014 at 1:47 AM, Sandipan Dandapat <
> [email protected]> wrote:
>
>> Dear Ulrich,
>> I got your reference from Prashanta Mathur. I am a postdoctoral
>> researcher in CNGL, DCU, and I am working with Moses incremental
>> retraining. It would be great if you could help me with a couple of questions:
>>
>> 1. I found there are 7 weights to define for PT0 (PT0 is the Mmsapt name)
>> i.e.
>>
>> Mmsapt name=PT0 output-factor=0 num-features=7
>> base=/home/sandipan/inc_retrain/MT_sys/En-Fr/dgt/50_i/mmsa_pt/train. L1=en
>> L2=fr
>> [weight]
>> PT0= 0.1 0.2 0.3 0.4 0.5 0.6 0.7
>>
>> num-features in the standard PBSMT model is 4, which does not work with
>> Mmsapt. What are these 7 weights? Can I use uniform weights for all 7
>> features, or how should I adjust them?
>>
>> 2. I found there is a significant difference in BLEU score between the
>> standard PBSMT model and the Mmsapt-based model. Is this because of the
>> weights I am using, or am I doing something wrong?
>>
>> It would be a great help if you could help me understand the above issues.
>> Thank you.
>>
>> Regards,
>> sandipan
>>
>>
>>
>
>
> --
> Ulrich Germann
> Research Associate
> School of Informatics
> University of Edinburgh
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
