Hi everyone, I built a toy hiero system for German-English (trained on 50,000 sentences) and used sigtest filtering (code in contrib/sigtest-filter) to prune the rule-table. I get a BLEU of 16.21 on the WMT 2009 test set. For comparison, I built the same system but pruned the rule-table with a script that keeps the 30 best rules, sorted by p(e|f), lex(e|f), p(f|e), lex(f|e). With this pruning I get a BLEU of 16.63 on the same test set. Has anyone else observed a decrease in performance for hiero systems when using sigtest filtering? Or could this be due to the small size of the data set I used?
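For reference, the 30-best pruning described above could be sketched roughly as follows. This is only an illustration, not the actual script: the function name `prune_n_best` is hypothetical, and it assumes a rule-table with `|||`-separated fields whose third field holds the four scores in the order p(f|e), lex(f|e), p(e|f), lex(e|f). If your table stores the scores in a different order, adjust `SORT_KEY_COLUMNS` accordingly.

```python
from collections import defaultdict

# ASSUMPTION: the score field lists p(f|e), lex(f|e), p(e|f), lex(e|f) in that order.
# The tuple below picks them out in the sort order p(e|f), lex(e|f), p(f|e), lex(f|e).
SORT_KEY_COLUMNS = (2, 3, 0, 1)


def prune_n_best(lines, n=30, key_columns=SORT_KEY_COLUMNS):
    """Keep at most n rules per source side, ranked by the chosen scores.

    Hypothetical helper for illustration; `lines` is an iterable of
    rule-table lines of the form "source ||| target ||| scores [||| ...]".
    """
    by_source = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|||")]
        source = fields[0]
        scores = [float(s) for s in fields[2].split()]
        # Build the sort key from the selected score columns.
        key = tuple(scores[i] for i in key_columns)
        by_source[source].append((key, line))

    pruned = []
    for rules in by_source.values():
        # Higher scores first; ties keep their original relative order.
        rules.sort(key=lambda rule: rule[0], reverse=True)
        pruned.extend(line for _, line in rules[:n])
    return pruned


if __name__ == "__main__":
    # Made-up example rules, only to show the call.
    example = [
        "das [X] ||| the [X] ||| 0.5 0.4 0.6 0.3",
        "das [X] ||| that [X] ||| 0.2 0.1 0.3 0.2",
        "das [X] ||| this [X] ||| 0.9 0.9 0.1 0.1",
    ]
    for rule in prune_n_best(example, n=2):
        print(rule)
```

Note that this kind of pruning always keeps up to n rules for every source side, whereas significance filtering can discard all rules for a source side, which may matter more on a small training set.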
Thanks a lot! Cheers, Fabienne
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
