Hi Sylvain

I think ProcessSentence() is the right method to call. If you look at the moses 
server code then you'll see a less cluttered example of how to use the Moses API. 
It may be that your moses_get_hyp() is not back-tracking through the hypothesis 
correctly.
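
To illustrate what back-tracking means here, a minimal self-contained sketch follows. It does not use the Moses classes: `Hyp`, `phrase`, `prev` and `full_translation` are hypothetical stand-ins, on the assumption that each hypothesis stores only the target phrase it added plus a link to the hypothesis it extends. In Moses itself the walk would presumably go through the previous-hypothesis accessor on Hypothesis, so check the server code for the exact calls.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a decoder hypothesis (not the Moses class):
// each node holds only the target phrase added at that step, plus a
// link to the hypothesis it extends.
struct Hyp {
    std::string phrase;  // target words added by this hypothesis only
    const Hyp* prev;     // previous hypothesis, or nullptr at the start
};

// Walk back from the final hypothesis to the start, collecting the
// phrases, then join them in forward (left-to-right) order.
std::string full_translation(const Hyp* best) {
    std::vector<std::string> phrases;
    for (const Hyp* h = best; h != nullptr; h = h->prev) {
        if (!h->phrase.empty()) {
            phrases.push_back(h->phrase);
        }
    }
    std::string out;
    for (auto it = phrases.rbegin(); it != phrases.rend(); ++it) {
        if (!out.empty()) {
            out += ' ';
        }
        out += *it;
    }
    return out;
}
```

Reading only the phrase on the final hypothesis would return just the last fragment, which would be consistent with seeing a single word or a single phrase-table entry as the output.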

Note that you are calling UntransformScore(), which probably explains your odd 
translation score. It doesn't make much sense to do this, as you won't get a 
probability (the score is not normalised). It is unusual, though, that you 
appear to have a positive translation score (in log space).
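
As a rough check on the number you reported, here is a small sketch that assumes un-transforming a log-space score simply means taking exp() of it (`untransform` and `to_log_space` are hypothetical helper names, not Moses functions). Under that assumption, a reported value of 903011968 corresponds to a log-space score of about +20.6, whereas a typical negative model score would un-transform to a value between 0 and 1.

```cpp
#include <cmath>

// Assumption: un-transforming a log-space score amounts to exp(score).
double untransform(double log_score) {
    return std::exp(log_score);
}

// Inverse of the above: recover the log-space score from a reported
// un-transformed value.
double to_log_space(double reported) {
    return std::log(reported);
}
```

Logging hypo->GetScore() directly, without un-transforming it, would show whether the raw score really is positive.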

If you increase the verbosity of moses (to 2 or 3) you'll get a better idea 
what it is doing, and you can see whether it really is producing "of" as the 
translation, and why.

cheers - Barry

On Thursday 26 April 2012 16:41:06 Sylvain Raybaud wrote:
> wild guessing here: in TranslationTask::Run, I see there are many
> alternatives for processing the sentence, like doLatticeMBR etc, not
> just running Manager::ProcessSentence()
> Maybe one of these alternatives must be run for processing confusion
> networks?
> 
> cheers
> 
> Sylvain
> 
> On 26/04/12 15:53, Sylvain Raybaud wrote:
> > Hi Barry
> >
> >   Thanks for the tip, that sounds likely indeed. I'll try it again but
> > last time I ran the software through valgrind, I got so many errors in
> > external libs that I just gave up.
> >
> > In the meantime, here is the complete function that handles the
> > decoding, in case someone sees something obviously wrong in here...
> >
> > static void moses_translate_phonemes(manager_data_t * pool,
> > translation_pair_t * pair) {
> >     debug("starting");
> >
> >     const TranslationSystem& system =
> > StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
> > /* there is only one translation system for now */
> >     const StaticData &staticData = StaticData::Instance();
> >     const vector<FactorType> &inputFactorOrder =
> > staticData.GetInputFactorOrder();
> >
> >     MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm,
> >         pair->source->phonemes, pool->mp_config->cn_width,
> >         pool->mp_config->cn_thresh, inputFactorOrder);
> >
> >     Manager * manager = new Manager(*cn,staticData.GetSearchAlgorithm(),
> > &system);
> >     manager->ProcessSentence();
> >     const Hypothesis* hypo = manager->GetBestHypothesis();
> >
> >     string hyp = moses_get_hyp(hypo);
> >     char * hyp_ret = (char*)malloc((strlen(hyp.c_str())+1)*sizeof(char));
> >     strcpy(hyp_ret,hyp.c_str());
> >
> >     pair->translation_score = UntransformScore(hypo->GetScore());
> >     translation_pair_set_target(pair, hyp_ret,NULL);
> >
> >     delete manager;
> >     delete cn;
> >
> > }
> >
> > cheers,
> >
> > Sylvain
> >
> > On 26/04/12 13:49, Barry Haddow wrote:
> >> Hi Sylvain
> >>
> >> I'm not familiar with this part of the code, but the strange score
> >> suggests that there's some uninitialised memory. You could try running
> >> through valgrind and it might give some clues,
> >>
> >> cheers - Barry
> >>
> >> On Thursday 26 Apr 2012 12:24:11 Sylvain Raybaud wrote:
> >>> Hi all
> >>>
> >>>   I'm using Moses API for decoding a confusion network. The CN is
> >>> created from the output of an ASR engine and a confusion matrix. More
> >>> precisely (even though it's probably irrelevant to my problem), the ASR
> >>> engine provides a string of phonemes (1-best) and the confusion matrix
> >>> provides alternatives for each phoneme (the idea was described in
> >>> Jiang et al., _Phonetic representation based speech translation_, MT
> >>> Summit XIII, 2011).
> >>>
> >>> When the CN is dumped into a file and I use
> >>> moses -f moses.phonemes.cn.ini < CN
> >>> to decode it, everything is fine.
> >>>
> >>> But when I use Moses API (loading the same configuration file), I get
> >>> incomplete translations, like:
> >>>
> >>> ASR output (French): "nous font sont toujours chimistes plume
> >>> rassembleront ch je trouve que le office de ce tout de suite"
> >>> Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
> >>> y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
> >>> swa s swa t u d s h i t"
> >>> Translation: "of"
> >>> score: 903011968.000000
> >>>
> >>> Note that the transcription is poor (I haven't really tuned the ASR
> >>> engine), but still, the translation ought to be more than just "of".
> >>> Sometimes it's several words, I guess it's a phrase in the phrase
> >>> table. The word generally seems to be the translation of a word in the
> >>> source sentence.
> >>> When I use moses on the command line to translate either the 1-best or
> >>> the CN, I get a reasonable translation. When I use the API to translate
> >>> the 1-best phonetic representation, I also get a reasonable
> >>> translation. I think the CN object is created correctly because moses
> >>> loads it and prints it prior to decoding (this is normal verbose
> >>> behavior). I also tried to create a PCN object, and got exactly the
> >>> same results. So I guess the problem is either how I tell moses to
> >>> decode it or how I extract the result from the Hypothesis object. But
> >>> I'm clueless about what the problem is here, since the code is
> >>> working when I just translate a string. The translation score seems
> >>> ridiculously high too. I'll give below the corresponding code.
> >>>
> >>> Decoding and hypothesis extraction:
> >>> ***********************************
> >>> [...]
> >>> Manager * manager = new Manager(*cn,staticData.GetSearchAlgorithm(),
> >>> &system);
> >>> manager->ProcessSentence();
> >>> const Hypothesis* hypo = manager->GetBestHypothesis();
> >>> string hyp = moses_get_hyp(hypo);
> >>> [...]
> >>> pair->translation_score = UntransformScore(hypo->GetScore());
> >>> [...]
> >>>
> >>> string moses_get_hyp(const Hypothesis* hypo) {
> >>>     return hypo->GetTargetPhraseStringRep();
> >>> }
> >>>
> >>>
> >>> Creation of the CN:
> >>> *******************
> >>>
> >>> /** new class derived from ConfusionNet, with a new method for directly
> >>> creating CN */
> >>> class MyConfusionNet : public ConfusionNet {
> >>>   public:
> >>>     void addCol(Column);
> >>> };
> >>>
> >>> void MyConfusionNet::addCol(Column col) {
> >>>     data.push_back(col);
> >>> }
> >>>
> >>> /** create a column of the CN */
> >>> static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t *
> >>> cm, const char * ph, int width, double thresh, const vector<FactorType>
> >>> &factor_order) {
> >>>
> >>>     MyConfusionNet::Column col;
> >>>
> >>>     phoneme_conf_t * ph_conf =
> >>> (phoneme_conf_t*)g_hash_table_lookup(cm->matrix,ph);
> >>>     if(ph_conf==NULL) {
> >>>         return col;
> >>>     }
> >>>
> >>>     int i;
> >>>     for(i = 0; i<cm->n_phonemes; i++) {
> >>>         vector<float> scores;
> >>>         float score = float(ph_conf[i].p);
> >>>         if((width<=0 || i<width) && (thresh<=0 || score>=thresh)) {
> >>>             string wd(cm->phonemes[ph_conf[i].phoneme]);
> >>>             Word word;
> >>>             word.CreateFromString(Input,factor_order,wd,false);
> >>>             scores.push_back(score);
> >>>             pair<Word,vector<float> > linkdata(word,scores);
> >>>             col.push_back(linkdata);
> >>>         }
> >>>     }
> >>>
> >>>     return col;
> >>> }
> >>>
> >>> /** Creates a confusion network from a NULL terminated phonemes list
> >>> and a phonemes confusion matrix */
> >>> static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,const
> >>> char ** phonemes, int width, double thresh, const vector<FactorType>
> >>> &factor_order) {
> >>>     debug("start");
> >>>
> >>>     MyConfusionNet * cn = new MyConfusionNet();
> >>>
> >>>     int i = 0;
> >>>     while(phonemes[i]!=NULL) {
> >>>         debug("%s",phonemes[i]);
> >>>         MyConfusionNet::Column col =
> >>> create_phoneme_col(cm,phonemes[i],width,thresh,factor_order);
> >>>         cn->addCol(col);
> >>>         i += 1;
> >>>     }
> >>>
> >>>     return cn;
> >>> }
> >>>
> >>> So, if anyone has an idea about what's wrong here.... thanks!
> >>>
> >>> cheers,
> 
 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
