Hi Ulrich,

Sorry for sending my questions to you directly. I will keep in mind to post
my future queries to moses-support.
Thanks a lot for the clarification. Let me play with MERT.

Thanks and regards,
sandipan

On 24 October 2014 01:51, Ulrich Germann <[email protected]> wrote:

> Hi Sandipan,
>
> First, please post Moses-related questions to [email protected],
> not to individual contributors.
>
> Second, the current seven features used by Mmsapt /
> PhraseDictionaryBitextSampling are listed below (for details, see my
> recent paper on this phrase table implementation:
> https://www.researchgate.net/publication/267270863_Dynamic_Phrase_Tables_for_Machine_Translation_in_an_Interactive_Post-editing_Scenario
> ).
>
> THE STANDARD SET OF FEATURES MAY CHANGE AT ANY TIME, as this is still
> work in progress.
>
> - forward and backward lexically smoothed phrase scores (2 scores; same
>   as the standard features)
> - rarity penalty 1/(x+1), where x is the number of phrase pair
>   occurrences in the corpus/sample (1 score)
> - lower bounds on the forward and backward phrase-level probabilities,
>   at confidence level .99 (2 scores)
> - 2 provenance features x/(x+1), where x is the number of phrase pair
>   occurrences in the (static) background and (dynamic) foreground
>   corpus, respectively (2 scores)
>
> Third, you need to retrain the feature weights for good performance. Any
> of the standard techniques will do; I usually use MERT. The executable
> simulate-pe allows you to feed in references and word alignments one
> sentence at a time; there are additional parameters --spe-src, --spe-trg,
> and --spe-aln to specify source, target, and alignment (symal output
> format). The source and target files are one sentence per line,
> tokenized. Michael Denkowski is currently in the process of integrating
> online tuning into Moses, but I'm not sure whether that's ready to be
> deployed yet.
>
> Regards - Uli
>
>
> On Thu, Oct 23, 2014 at 1:47 AM, Sandipan Dandapat <
> [email protected]> wrote:
>
>> Dear Ulrich,
>> I got your reference from Prashanta Mathur.
>> I am a postdoctoral researcher at CNGL, DCU, and I am working on Moses
>> incremental retraining. It would be great if you could help me with a
>> couple of doubts:
>>
>> 1. I found that there are 7 weights to define for PT0 (PT0 is the
>> Mmsapt name), i.e.
>>
>> Mmsapt name=PT0 output-factor=0 num-features=7
>> base=/home/sandipan/inc_retrain/MT_sys/En-Fr/dgt/50_i/mmsa_pt/train.
>> L1=en L2=fr
>> [weight]
>> PT0= 0.1 0.2 0.3 0.4 0.5 0.6 0.7
>>
>> num-features in a standard PBSMT model is 4, which does not work with
>> Mmsapt. What are these 7 weights? Can I use uniform weights for all 7
>> features, or how should I adjust them?
>>
>> 2. I found there is a significant difference in BLEU score between the
>> standard PBSMT model and the Mmsapt-based model. Is this because of the
>> weights I am using, or am I doing something wrong?
>>
>> It would be a great help if you could clarify the above issues.
>> Thank you.
>>
>> Regards,
>> sandipan
>
>
> --
> Ulrich Germann
> Research Associate
> School of Informatics
> University of Edinburgh
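[Note for the archive] For question 1 above: uniform weights are only a starting point and, per Uli's advice, should be re-tuned (e.g. with MERT). A moses.ini fragment with seven uniform Mmsapt weights might look like the sketch below; the base= path and names are copied from the example in the question, and the weight values are placeholders, not recommended settings:

```ini
# Illustrative fragment only -- re-tune these weights before use.
[feature]
Mmsapt name=PT0 output-factor=0 num-features=7 base=/home/sandipan/inc_retrain/MT_sys/En-Fr/dgt/50_i/mmsa_pt/train. L1=en L2=fr

[weight]
PT0= 0.1 0.1 0.1 0.1 0.1 0.1 0.1
```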
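[Note for the archive] The occurrence-count features Uli lists above can be sketched in a few lines of Python. The 1/(x+1) rarity penalty and x/(x+1) provenance formulas come directly from his message; the Wilson score lower bound used below for the .99-confidence feature is only an illustrative stand-in, since the email does not say which bound Mmsapt actually computes:

```python
import math

def rarity_penalty(x: int) -> float:
    """Rarity penalty 1/(x+1), where x is the number of phrase pair
    occurrences in the corpus/sample."""
    return 1.0 / (x + 1)

def provenance(x: int) -> float:
    """Provenance feature x/(x+1): approaches 1 as the phrase pair is
    seen more often in the respective (background or foreground) corpus."""
    return x / (x + 1.0)

def prob_lower_bound(successes: int, trials: int, z: float = 2.3263) -> float:
    """Lower confidence bound on a phrase-level probability, shown here
    as a Wilson score lower bound (z = 2.3263 for a one-sided 99% bound).
    This is an assumption for illustration; the actual bound used inside
    Mmsapt may be computed differently."""
    if trials == 0:
        return 0.0
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = p + z2 / (2.0 * trials)
    spread = z * math.sqrt(p * (1.0 - p) / trials
                           + z2 / (4.0 * trials * trials))
    return max(0.0, (centre - spread) / denom)
```

For example, a phrase pair seen once gets rarity_penalty(1) == 0.5, and an unseen pair gets provenance(0) == 0.0, so the provenance features reward pairs attested in the respective corpus.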
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
