Re: [Moses-support] decoding a confusion network using Moses' API
Wild guessing here: in TranslationTask::Run I see there are many alternatives for processing the sentence, like doLatticeMBR etc., not just running Manager::ProcessSentence(). Maybe one of these alternatives must be run for processing confusion networks?

cheers
Sylvain

On 26/04/12 15:53, Sylvain Raybaud wrote:
Hi Barry

Thanks for the tip, that sounds likely indeed. I'll try it again, but last time I ran the software through valgrind I got so many errors in external libs that I just gave up. In the meantime, here is the complete function that handles the decoding, in case someone sees something obviously wrong in it...

static void moses_translate_phonemes(manager_data_t * pool, translation_pair_t * pair)
{
    debug("starting");
    /* there is only one translation system for now */
    const TranslationSystem &system = StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
    const StaticData &staticData = StaticData::Instance();
    const vector<FactorType> &inputFactorOrder = staticData.GetInputFactorOrder();
    MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm, pair->source->phonemes,
                                         pool->mp_config->cn_width, pool->mp_config->cn_thresh,
                                         inputFactorOrder);
    Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(), &system);
    manager->ProcessSentence();
    const Hypothesis * hypo = manager->GetBestHypothesis();
    string hyp = moses_get_hyp(hypo);
    char * hyp_ret = (char*)malloc(strlen(hyp.c_str()) + 1);
    strcpy(hyp_ret, hyp.c_str());
    pair->translation_score = UntransformScore(hypo->GetScore());
    translation_pair_set_target(pair, hyp_ret, NULL);
    delete manager;
    delete cn;
}

cheers,
Sylvain

On 26/04/12 13:49, Barry Haddow wrote:
Hi Sylvain

I'm not familiar with this part of the code, but the strange score suggests that there's some uninitialised memory. You could try running it through valgrind; it might give some clues,

cheers - Barry

On Thursday 26 Apr 2012 12:24:11 Sylvain Raybaud wrote:
Hi all

I'm using the Moses API for decoding a confusion network.
The CN is created from the output of an ASR engine and a confusion matrix. More precisely (even though it's probably irrelevant to my problem), the ASR engine provides a string of phonemes (1-best) and the confusion matrix provides alternatives for each phoneme (the idea is described in Jiang et al., _Phonetic representation based speech translation_, MT Summit XIII, 2011).

When the CN is dumped into a file and I run moses -f moses.phonemes.cn.ini on it, everything is fine. But when I use the Moses API (loading the same configuration file), I get incomplete translations, like:

ASR output (French): nous font sont toujours chimistes plume rassembleront ch je trouve que le office de ce tout de suite
Phonetic representation: n u f on s on t t u ge u r ch i m i s t z p l y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d swa s swa t u d s h i t
Translation: of
Score: 903011968.00

Note that the transcription is poor (I haven't really tuned the ASR engine), but still, the translation ought to be more than just "of". Sometimes it's several words; I guess it's a phrase from the phrase table. The word generally seems to be the translation of a word in the source sentence. When I use moses on the command line to translate either the 1-best or the CN, I get a reasonable translation. When I use the API to translate the 1-best phonetic representation, I also get a reasonable translation.

I think the CN object is created correctly, because moses loads it and prints it prior to decoding (this is normal verbose behaviour). I also tried to create a PCN object, and got exactly the same results. So I guess the problem is either in how I tell moses to decode it or in how I extract the result from the Hypothesis object. But I'm clueless about what the problem is here, since the code works when I just translate a string. The translation score seems ridiculously high too. I'll give the corresponding code below.

Decoding and hypothesis extraction:
***
[...]
Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(), &system);
manager->ProcessSentence();
const Hypothesis* hypo = manager->GetBestHypothesis();
string hyp = moses_get_hyp(hypo);
[...]
pair->translation_score = UntransformScore(hypo->GetScore());
[...]

string moses_get_hyp(const Hypothesis* hypo)
{
    return hypo->GetTargetPhraseStringRep();
}

Creation of the CN:
***
/** new class derived from ConfusionNet, with a new method for directly creating a CN */
class MyConfusionNet : public ConfusionNet
{
public:
    void addCol(Column);
};

void MyConfusionNet::addCol(Column col)
{
    data.push_back(col);
}

/** create a column of the CN */
static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm, const char * ph, int
[Moses-support] Higher BLEU/METEOR score than usual for EN-DE
Hi all,

I'm running some experiments for my thesis and I've been told by a more experienced user that the BLEU/METEOR scores achieved by my MT engine were too good to be true. Since this is the very first MT engine I've ever built and I am not experienced with interpreting scores, I really don't know what to make of them. The first test set achieves a BLEU score of 0.6508 (v13). METEOR's final score is 0.7055 (v1.3, exact, stem, paraphrase). A second test set gives a slightly lower BLEU score of 0.6267 and a METEOR score of 0.6748.

Here are some basic facts about my system:
Decoding direction: EN-DE
Training corpus: 1.8 million sentences
Tuning runs: 5
Test sets: a) 2,000 sentences, b) 1,000 sentences (both in-domain)
LM type: trigram
TM type: unfactored

I'm now trying to figure out whether these scores are realistic at all, as various papers report far lower BLEU scores, e.g. Koehn and Hoang 2011. Any comments regarding the mentioned decoding direction and related scores will be much appreciated.

Best,
Daniel

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE
Hi Daniel

BLEU scores do vary according to the test set, but the scores you report are much higher than usual. The most likely explanation is that some of your test set is included in your training set,

cheers - Barry

On Thursday 26 April 2012 19:18:33 Daniel Schaut wrote:
[...]

--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173
Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE
On Thu 26 Apr 2012 at 20:18 +0200, Daniel Schaut wrote:
[...]

Did you try looking at the sentences? 1,000 is few enough to eyeball them. Have you tried the same system with a different corpus (e.g. Europarl)? Have you checked that your test set and your training set do not intersect? If the scores don't seem believable, then they probably aren't :)

Fran
Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE
I =think= I recall that pairwise BLEU scores for human translators are usually around 0.50, so anything much better than that is indeed suspect.

- JB

On Apr 26, 2012, at 14:18, Daniel Schaut wrote:
[...]
Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE
Very short sentences will give you high scores. Multiple references will also boost them.

Miles

On Apr 26, 2012 8:13 PM, John D Burger j...@mitre.org wrote:
[...]
Re: [Moses-support] Merging language models with IRSTLM..?
Hi,

we are currently working on a project that includes incremental training of LMs. Hence there are plans to introduce quick adaptation in IRSTLM, but not soon. The question is indeed how often you need to adapt the LM. If you are working with large news LMs, then it seems that adapting once a week is enough (you simply do not collect enough data in fewer days to significantly change the LM). If you want to continuously update the LM, you can also consider using external interpolation: you interpolate two distinct LMs, one fixed and one smaller that is continuously retrained (which should be fast to do), using the interpolate-lm command (see the manual).

Greetings,
Marcello

On Apr 22, 2012, at 9:12 PM, Pratyush Banerjee wrote:

Hi,

I have recently been trying to create incrementally adapted language models using IRSTLM. I have an in-domain data set on which the mixture adaptation weights are computed using the -lm=mix option, and I have a larger out-of-domain dataset from which I incrementally add data to create adapted LMs of different sizes. Currently, every time saveBIN is called, the entire lmtable is estimated and saved, which makes the process slow... Is there any functionality in IRSTLM to incrementally train/save adapted language models?

Secondly, given an existing adapted language model in ARPA format (old) and another small language model built on the incremental data (new), would it be safe to update the smoothed probabilities (f*) using the following formulas:

c_sum(wh) = c_old(wh) + c_new(wh)
f*(w|h) = f*_old(w|h) * (c_old(wh)/c_sum(wh)) + f*_new(w|h) * (c_new(wh)/c_sum(wh))

where the c_old and c_new counts are estimated from the ngram tables?

Thanks and Regards,
Pratyush