Hi Barry

By adding
cerr << "[S2TT] GOT TRANSLATION: " << *hypo << endl;

I was able to determine that the translations that are actually generated
look reasonable. The problem therefore lies in how I extract them from the
"hypo" object. I think I'll be able to find the problem. I'll let the
list know.
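
For reference, a minimal sketch of the back-tracking idea Barry mentions in
his reply. It uses a stand-in HypoNode type, not the real Moses Hypothesis
class; the names HypoNode and backtrack are hypothetical. The assumption is
that the best hypothesis only carries the target phrase of its last expansion
and a link to its predecessor (in Moses that link should be
Hypothesis::GetPrevHypo(), to be checked against the headers), so returning
only the last phrase would give a fragment such as "of".

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-in for a decoder hypothesis (NOT the Moses class): each node holds
// only the target phrase added at that step and a link to its predecessor.
struct HypoNode {
    std::string phrase;    // target words added by this expansion
    const HypoNode* prev;  // previous hypothesis, or nullptr at the start
};

// Walk back from the last hypothesis to the initial (empty) one, collecting
// the phrases, then reverse them to obtain the translation in reading order.
std::string backtrack(const HypoNode* last) {
    std::vector<std::string> phrases;
    for (const HypoNode* h = last; h != nullptr; h = h->prev) {
        if (!h->phrase.empty()) {
            phrases.push_back(h->phrase);
        }
    }
    std::reverse(phrases.begin(), phrases.end());

    std::string out;
    for (const std::string& p : phrases) {
        if (!out.empty()) {
            out += ' ';
        }
        out += p;
    }
    return out;
}
```

With a chain "" <- "we are" <- "chemists", backtrack() on the last node
returns "we are chemists", whereas reading only the last node's phrase would
return "chemists".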

thanks for the help!

cheers,

Sylvain

On 26/04/12 17:54, Barry Haddow wrote:
> Hi Sylvain
> 
> I think ProcessSentence() is the right method to call. If you look at moses
> server then you'll see a less cluttered example of how to use the Moses API.
> It may be that your moses_get_hyp() is not back-tracking through the
> hypothesis correctly.
> 
> Note that you are calling UntransformScore(), which probably explains your
> odd translation score. It doesn't make much sense to do this, as you won't
> get a probability (it's not normalised). It is unusual, though, that you
> appear to have a positive translation score (in log space).
> 
> If you increase the verbosity of moses (to 2 or 3) you'll get a better idea 
> what it is doing, and you can see whether it really is producing "of" as the 
> translation, and why.
> 
> cheers - Barry
> 
> On Thursday 26 April 2012 16:41:06 Sylvain Raybaud wrote:
>> wild guessing here: in TranslationTask::Run, I see there are many
>> alternatives for processing the sentence, like doLatticeMBR etc, not
>> just running Manager::ProcessSentence().
>> Maybe one of these alternatives must be run for processing confusion
>> networks?
>>
>> cheers
>>
>> Sylvain
>>
>> On 26/04/12 15:53, Sylvain Raybaud wrote:
>>> Hi Barry
>>>
>>>   Thanks for the tip, that sounds likely indeed. I'll try it again but
>>> last time I ran the software through valgrind, I got so many errors in
>>> external libs that I just gave up.
>>>
>>> In the meantime, here is the complete function that handles the
>>> decoding, in case someone sees something obviously wrong in here...
>>>
>>> static void moses_translate_phonemes(manager_data_t * pool,
>>>                                      translation_pair_t * pair) {
>>>     debug("starting");
>>>
>>>     const TranslationSystem& system =
>>>         StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
>>>     /* there is only one translation system for now */
>>>     const StaticData &staticData = StaticData::Instance();
>>>     const vector<FactorType> &inputFactorOrder =
>>>         staticData.GetInputFactorOrder();
>>>
>>>     MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm,
>>>         pair->source->phonemes, pool->mp_config->cn_width,
>>>         pool->mp_config->cn_thresh, inputFactorOrder);
>>>
>>>     Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(),
>>>         &system);
>>>     manager->ProcessSentence();
>>>     const Hypothesis* hypo = manager->GetBestHypothesis();
>>>
>>>     string hyp = moses_get_hyp(hypo);
>>>     char * hyp_ret = (char*)malloc((strlen(hyp.c_str())+1)*sizeof(char));
>>>     strcpy(hyp_ret,hyp.c_str());
>>>
>>>     pair->translation_score = UntransformScore(hypo->GetScore());
>>>     translation_pair_set_target(pair, hyp_ret,NULL);
>>>
>>>     delete manager;
>>>     delete cn;
>>>
>>> }
>>>
>>> cheers,
>>>
>>> Sylvain
>>>
>>> On 26/04/12 13:49, Barry Haddow wrote:
>>>> Hi Sylvain
>>>>
>>>> I'm not familiar with this part of the code, but the strange score
>>>> suggests that there's some uninitialised memory. You could try running
>>>> through valgrind and it might give some clues,
>>>>
>>>> cheers - Barry
>>>>
>>>> On Thursday 26 Apr 2012 12:24:11 Sylvain Raybaud wrote:
>>>>> Hi all
>>>>>
>>>>>   I'm using Moses API for decoding a confusion network. The CN is
>>>>> created from the output of an ASR engine and a confusion matrix. More
>>>>> precisely (even though it's probably irrelevant to my problem), the ASR
>>>>> engine provides a string of phonemes (1-best) and the confusion matrix
>>>>> provides alternatives for each phoneme (the idea was described in
>>>>> Jiang et al., _Phonetic representation based speech translation_, MT
>>>>> Summit XIII, 2011).
>>>>>
>>>>> When the CN is dumped into a file and I use
>>>>> moses -f moses.phonemes.cn.ini < CN
>>>>> to decode it, everything is fine.
>>>>>
>>>>> But when I use Moses API (loading the same configuration file), I get
>>>>> incomplete translations, like:
>>>>>
>>>>> ASR output (French): "nous font sont toujours chimistes plume
>>>>> rassembleront ch je trouve que le office de ce tout de suite"
>>>>> Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
>>>>> y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
>>>>> swa s swa t u d s h i t"
>>>>> Translation: "of"
>>>>> score: 903011968.000000
>>>>>
>>>>> Note that the transcription is poor (I haven't really tuned the ASR
>>>>> engine), but still, the translation ought to be more than just "of".
>>>>> Sometimes it's several words, I guess it's a phrase in the phrase
>>>>> table. The word generally seems to be the translation of a word in the
>>>>> source sentence.
>>>>> When I use moses on the command line to translate either the 1-best or
>>>>> the CN, I get a reasonable translation. When I use the API to translate
>>>>> the 1-best phonetic representation, I also get a reasonable
>>>>> translation. I think the CN object is created correctly because moses
>>>>> loads it and prints it prior to decoding (this is normal verbose
>>>>> behavior). I also tried to create a PCN object, and got exactly the
>>>>> same results. So I guess the problem is either how I tell moses to
>>>>> decode it or how I extract the result from the Hypothesis object. But
>>>>> I'm clueless about what the problem is here, since the code is
>>>>> working when I just translate a string. The translation score seems
>>>>> ridiculously high too. I'll give below the corresponding code.
>>>>>
>>>>> Decoding and hypothesis extraction:
>>>>> ***********************************
>>>>> [...]
>>>>> Manager * manager = new Manager(*cn,staticData.GetSearchAlgorithm(),
>>>>>     &system);
>>>>> manager->ProcessSentence();
>>>>> const Hypothesis* hypo = manager->GetBestHypothesis();
>>>>> string hyp = moses_get_hyp(hypo);
>>>>> [...]
>>>>> pair->translation_score = UntransformScore(hypo->GetScore());
>>>>> [...]
>>>>>
>>>>> string moses_get_hyp(const Hypothesis* hypo) {
>>>>>     return hypo->GetTargetPhraseStringRep();
>>>>> }
>>>>>
>>>>>
>>>>> Creation of the CN:
>>>>> *******************
>>>>>
>>>>> /** new class derived from ConfusionNet, with a new method for directly
>>>>> creating CN */
>>>>> class MyConfusionNet : public ConfusionNet {
>>>>>   public:
>>>>>     void addCol(Column);
>>>>> };
>>>>>
>>>>> void MyConfusionNet::addCol(Column col) {
>>>>>     data.push_back(col);
>>>>> }
>>>>>
>>>>> /** create a column of the CN */
>>>>> static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm,
>>>>>     const char * ph, int width, double thresh,
>>>>>     const vector<FactorType> &factor_order) {
>>>>>
>>>>>     MyConfusionNet::Column col;
>>>>>
>>>>>     phoneme_conf_t * ph_conf =
>>>>>         (phoneme_conf_t*)g_hash_table_lookup(cm->matrix,ph);
>>>>>     if(ph_conf==NULL) {
>>>>>         return col;
>>>>>     }
>>>>>
>>>>>     int i;
>>>>>     for(i = 0; i<cm->n_phonemes; i++) {
>>>>>         vector<float> scores;
>>>>>         float score = float(ph_conf[i].p);
>>>>>         if((width<=0 || i<width) && (thresh<=0 || score>=thresh)) {
>>>>>             string wd(cm->phonemes[ph_conf[i].phoneme]);
>>>>>             Word word;
>>>>>             word.CreateFromString(Input,factor_order,wd,false);
>>>>>             scores.push_back(score);
>>>>>             pair<Word,vector<float> > linkdata(word,scores);
>>>>>             col.push_back(linkdata);
>>>>>         }
>>>>>     }
>>>>>
>>>>>     return col;
>>>>> }
>>>>>
>>>>> /** Creates a confusion network from a NULL terminated phonemes list
>>>>> and a phonemes confusion matrix */
>>>>> static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,
>>>>>     const char ** phonemes, int width, double thresh,
>>>>>     const vector<FactorType> &factor_order) {
>>>>>     debug("start");
>>>>>
>>>>>     MyConfusionNet * cn = new MyConfusionNet();
>>>>>
>>>>>     int i = 0;
>>>>>     while(phonemes[i]!=NULL) {
>>>>>         debug("%s",phonemes[i]);
>>>>>         MyConfusionNet::Column col =
>>>>>             create_phoneme_col(cm,phonemes[i],width,thresh,factor_order);
>>>>>         cn->addCol(col);
>>>>>         i += 1;
>>>>>     }
>>>>>
>>>>>     return cn;
>>>>> }
>>>>>
>>>>> So, if anyone has an idea about what's wrong here.... thanks!
>>>>>
>>>>> cheers,
>>
>  
> --
> Barry Haddow
> University of Edinburgh
> +44 (0) 131 651 3173
> 


-- 
Sylvain Raybaud