Hi all,
I'm using the Moses API to decode a confusion network (CN). The CN is
created from the output of an ASR engine and a confusion matrix. More
precisely (even though it's probably irrelevant to my problem), the ASR
engine provides a string of phonemes (1-best) and the confusion matrix
provides alternatives for each phoneme (the idea was described in Jiang
et al., _Phonetic representation based speech translation_, MT Summit
XIII, 2011).
When the CN is dumped into a file and I use
    moses -f moses.phonemes.cn.ini < CN
to decode it, everything is fine.
But when I use the Moses API (loading the same configuration file), I get
incomplete translations, like:
ASR output (French): "nous font sont toujours chimistes plume
rassembleront ch je trouve que le office de ce tout de suite"
Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
swa s swa t u d s h i t"
Translation: "of"
score: 903011968.000000
Note that the transcription is poor (I haven't really tuned the ASR
engine), but still, the translation ought to be more than just "of".
Sometimes it's several words; I guess it's a single phrase from the phrase table.
The word generally seems to be the translation of a word in the source
sentence.
When I use moses on the command line to translate either the 1-best or
the CN, I get a reasonable translation. When I use the API to translate
the 1-best phonetic representation, I also get a reasonable translation.
I think the CN object is created correctly because moses loads it and
prints it prior to decoding (this is normal verbose behavior). I also
tried to create a PCN object, and got exactly the same results. So I
guess the problem is either how I tell moses to decode it or how I
extract the result from the Hypothesis object. But I'm clueless about
what the problem is here, since the code works when I just
translate a string. The translation score seems ridiculously high too.
The corresponding code is given below.
Decoding and hypothesis extraction:
***********************************
[...]
Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(),
                                &system);
manager->ProcessSentence();
const Hypothesis* hypo = manager->GetBestHypothesis();
string hyp = moses_get_hyp(hypo);
[...]
pair->translation_score = UntransformScore(hypo->GetScore());
[...]
string moses_get_hyp(const Hypothesis* hypo) {
    return hypo->GetTargetPhraseStringRep();
}
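A possible check, offered as an assumption rather than a confirmed diagnosis: if GetTargetPhraseStringRep() returns only the phrase added by the final hypothesis, that would match the symptom above (a single word or a single phrase), and the full output would have to be collected by walking back through the chain of previous hypotheses. The sketch below illustrates that backtracking on a hypothetical MockHypothesis struct. It does not use the Moses classes, and the phrases in the usage note are made up.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a search hypothesis: each node holds only the
// phrase added at that step, plus a link to the previous node.
struct MockHypothesis {
    const MockHypothesis* prev;  // nullptr for the initial (empty) hypothesis
    std::string phrase;          // target phrase added at this step
};

// Walk from the best hypothesis back to the start, collecting the phrase
// of every step, then join them in left-to-right order.
std::string full_translation(const MockHypothesis* best) {
    std::vector<std::string> phrases;
    for (const MockHypothesis* h = best; h != nullptr; h = h->prev) {
        if (!h->phrase.empty()) {
            phrases.push_back(h->phrase);
        }
    }
    std::string out;
    for (std::vector<std::string>::const_reverse_iterator it = phrases.rbegin();
         it != phrases.rend(); ++it) {
        if (!out.empty()) {
            out += ' ';
        }
        out += *it;
    }
    return out;
}
```

With a chain start -> "we" -> "are still" -> "chemists", reading only the last node gives "chemists", while full_translation() gives the whole string.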
Creation of the CN:
*******************
/** new class derived from ConfusionNet, with a new method for directly
    creating CN */
class MyConfusionNet : public ConfusionNet {
public:
    void addCol(Column);
};
void MyConfusionNet::addCol(Column col) {
    data.push_back(col);
}
/** create a column of the CN */
static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm,
        const char * ph, int width, double thresh,
        const vector<FactorType> &factor_order) {
    MyConfusionNet::Column col;
    phoneme_conf_t * ph_conf =
        (phoneme_conf_t*)g_hash_table_lookup(cm->matrix,ph);
    if(ph_conf==NULL) {
        return col;
    }
    int i;
    for(i = 0; i<cm->n_phonemes; i++) {
        vector<float> scores;
        float score = float(ph_conf[i].p);
        if((width<=0 || i<width) && (thresh<=0 || score>=thresh)) {
            string wd(cm->phonemes[ph_conf[i].phoneme]);
            Word word;
            word.CreateFromString(Input,factor_order,wd,false);
            scores.push_back(score);
            pair<Word,vector<float> > linkdata(word,scores);
            col.push_back(linkdata);
        }
    }
    return col;
}
/** Creates a confusion network from a NULL terminated phonemes list and
    a phonemes confusion matrix */
static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,
        const char ** phonemes, int width, double thresh,
        const vector<FactorType> &factor_order) {
    debug("start");
    MyConfusionNet * cn = new MyConfusionNet();
    int i = 0;
    while(phonemes[i]!=NULL) {
        debug("%s",phonemes[i]);
        MyConfusionNet::Column col =
            create_phoneme_col(cm,phonemes[i],width,thresh,factor_order);
        cn->addCol(col);
        i += 1;
    }
    return cn;
}
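Two properties of the columns built above could be checked before decoding. Both are assumptions, not verified here. First, create_phoneme_col() returns an empty column when the phoneme is missing from the confusion matrix or when every alternative falls below the threshold, and an empty column may leave no path through the network. Second, the code stores raw probabilities as link scores. If the decoder expects log-probabilities there (which the command-line CN reader may be computing from the file), the positive values would inflate the total score. The reported 903011968 is roughly exp(20.6), which would fit a positive total if UntransformScore exponentiates its argument. The sketch below uses hypothetical stand-in types (Arc, Column) and an illustrative floor value; none of these names come from Moses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a CN column: (word, probability) pairs.
typedef std::pair<std::string, float> Arc;
typedef std::vector<Arc> Column;

// Assumed floor used in place of log(0); the exact value is illustrative.
const float kLowestLogScore = -100.0f;

// Clamp p to [0, 1] and return its natural log, floored at kLowestLogScore.
float to_log_score(float p) {
    p = std::min(1.0f, std::max(0.0f, p));
    if (p == 0.0f) {
        return kLowestLogScore;
    }
    return std::max(std::log(p), kLowestLogScore);
}

// Return the index of the first column without any alternative, or -1 if
// every column has at least one.
int first_empty_column(const std::vector<Column>& cn) {
    for (std::size_t i = 0; i < cn.size(); ++i) {
        if (cn[i].empty()) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In the code above, this would amount to pushing to_log_score(score) instead of score, and logging any phoneme for which the returned column is empty.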
So, if anyone has an idea about what's wrong here.... thanks!
cheers,
--
Sylvain Raybaud