Miles Osborne wrote:
> this sounds like a bug to me: recasing should just be a mapping
> from lower-case to mixed-case, with no other changes.

Some might argue it's not a bug: allowing one more chance at very
local reordering, based on cased tokens, might yield a small
improvement on some data. It certainly violates the Principle of
Least Surprise, however.

> off-hand i can't remember how true casing is done in Moses; if it
> is baby translation, then forcing monotone reordering should do the
> trick.

Do you mean "-distortion-limit 0" at decode time? The actual
recasing appears to be done by a wrapper script, recase.perl.
It calls moses with "-dl 1" (the short form of -distortion-limit);
changing this to 0 does indeed make the recaser step change only
case, on my devtest at least. Awesome!

If this is a bug, that seems to be the fix, so here's a patch (as if
that's easier than simply changing that one character :).

--- /trunk/scripts/recaser/recase.perl 2007-05-24 11:41:28.000000000 -0400
+++ recase.perl 2008-06-25 16:43:43.000000000 -0400
@@ -45,7 +45,7 @@
 my $sentence = 0;
 my $infile = $INFILE;
 $infile =~ s/[\.\/]/_/g;
-open(MODEL,"$MOSES -v 0 -f $RECASE_MODEL -i $INFILE -dl 1|");
+open(MODEL,"$MOSES -v 0 -f $RECASE_MODEL -i $INFILE -dl 0|");
 binmode(MODEL, ":utf8");
 while(<MODEL>) {
 chomp;
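
For anyone who wants to repeat the check on their own data, here is a
minimal sketch of a sanity test. It is plain Python and not part of
Moses; the function name and the sample sentences are made up for
illustration. It flags every line where the recased output differs
from the lowercased input in anything other than case:

```python
from itertools import zip_longest


def case_only_diffs(lower_lines, recased_lines):
    """Return (line_number, source, output) for each line where the recased
    output differs from the lowercased input in more than letter case."""
    problems = []
    # zip_longest pads the shorter side with "", so a missing or extra
    # line is reported as a mismatch instead of being silently dropped.
    pairs = zip_longest(lower_lines, recased_lines, fillvalue="")
    for n, (src, out) in enumerate(pairs, start=1):
        # Compare token by token after lowercasing both sides.
        if src.lower().split() != out.lower().split():
            problems.append((n, src.strip(), out.strip()))
    return problems


# Hypothetical sample data: line 2 has two tokens swapped by the recaser.
lower = [
    "the white house is in washington .",
    "john burger works at mitre .",
]
recased = [
    "The White House is in Washington .",
    "John works Burger at MITRE .",
]

for n, src, out in case_only_diffs(lower, recased):
    print(f"line {n}: input {src!r} became {out!r}")
```

With "-dl 0" one would expect this to report nothing; with "-dl 1"
it should list the sentences whose tokens were reordered.
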
Anyway, thanks for your quick reply!
- John Burger
MITRE