Hi Jorg,

I made changes to mert-moses.pl to do exactly what you are looking for. Please find the script attached.
To enable the character-level to word-level transformation, pass the option '--transform-decoded-file' to mert-moses.pl. The script assumes that a caret token '^' was added between words when the corpora were preprocessed, so all subwords between two carets are merged into a single word. The changes are on lines 826-834.

Regards,
Anoop.

On Sat, Apr 22, 2017 at 5:10 PM, Jorg Tiedemann <[email protected]> wrote:

> Hi,
>
> Is there an easy way to integrate a small script to process n-best lists
> in mert-moses.perl before running mert at each iteration? An example would
> be to merge character-level translations to run mert on word-level
> segmentations. It’s probably rather straightforward to add an option to
> specify a script for filtering but it may already exist and I just don’t
> see it?
>
> Thanks!
> Jörg
>
> **********************************************************************
> Jörg Tiedemann
> Department of Modern Languages http://blogs.helsinki.fi/tiedeman/
> University of Helsinki
> http://blogs.helsinki.fi/language-technology/
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
I claim to be a simple individual liable to err like any other fellow mortal. I own, however, that I have humility enough to confess my errors and to retrace my steps.
http://flightsofthought.blogspot.com
mert-moses.pl
Description: Perl program
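
For readers without the attachment, here is a rough sketch of the merging step described above. It is written in Python for illustration only and is not the code from the attached Perl script. The function names are hypothetical, and it assumes hypotheses are space-separated characters with '^' as the word boundary, and that n-best lines use the ' ||| ' field separator with the hypothesis in the second field.

```python
def merge_subwords(hypothesis, boundary="^"):
    """Join the tokens between boundary markers into single words."""
    words, pieces = [], []
    for token in hypothesis.split():
        if token == boundary:
            # A boundary closes the current word, if any pieces were collected.
            if pieces:
                words.append("".join(pieces))
                pieces = []
        else:
            pieces.append(token)
    # Flush the last word, which has no trailing boundary.
    if pieces:
        words.append("".join(pieces))
    return " ".join(words)


def transform_nbest_line(line, boundary="^"):
    """Apply merge_subwords to the hypothesis field of one n-best line.

    Assumes the layout 'id ||| hypothesis ||| features ||| score'.
    """
    fields = line.rstrip("\n").split(" ||| ")
    if len(fields) > 1:
        fields[1] = merge_subwords(fields[1], boundary)
    return " ||| ".join(fields)


if __name__ == "__main__":
    print(transform_nbest_line("0 ||| t h e ^ c a t ||| LM0= -1.5 ||| -2.0"))
```

Only the hypothesis field is rewritten; the id, feature scores and total score pass through unchanged, so the output stays in the same n-best layout.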
