> For this replacement, I need to keep the value of the number
> throughout the translation, so the best option seems to be to add it as
> a factor? Then all other words of the corpus need to have an empty factor.
> It's not such an awful problem, but it seems strange.
This is something I have also been working on. I would like to train on
pairs such as:
I am NUM years old and have NUM cats -> Tengo NUM años y tengo NUM gatos
NUM bottles of beer on the wall -> NUM botellas de cerveza en la pared
etc.
Then I want to translate:
I have lived 42 years and have 2 dogs
preprocess it to:
I have lived NUM{1} years and have NUM{2} dogs
get back from decoding:
He vivido NUM{1} años y tengo NUM{2} perros
and postprocess this to:
He vivido 42 años y tengo 2 perros
The idea is that the bare NUM token (ignoring the index '{#}') is used for
computing translation and reordering costs, while the output still carries the
index (1, 2), so I can replace each NUM token with its actual value in
postprocessing. I need the index to handle multiple numbers in a phrase, since
the decoder may change their order.
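
To make the pre- and postprocessing steps concrete, here is a minimal Python sketch. It covers only the two steps outside the decoder and assumes the decoder returns the NUM{i} tokens with their indices intact, which is exactly the part I am unsure about. The names preprocess and postprocess and the number pattern are my own, not anything from Moses.

```python
import re

# Assumption: a "number" is a run of digits, optionally with . or , groups.
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")
# Indexed placeholder as it would appear in the decoder output.
PLACEHOLDER_RE = re.compile(r"NUM\{(\d+)\}")


def preprocess(sentence):
    """Replace each number with NUM{i} and return the text plus the values."""
    values = []

    def replace(match):
        values.append(match.group(0))
        return "NUM{%d}" % len(values)

    return NUM_RE.sub(replace, sentence), values


def postprocess(translation, values):
    """Put the original values back, using the index inside each NUM{i}."""

    def restore(match):
        index = int(match.group(1))
        if 1 <= index <= len(values):
            return values[index - 1]
        # Unknown index: leave the placeholder untouched.
        return match.group(0)

    return PLACEHOLDER_RE.sub(restore, translation)


source, values = preprocess("I have lived 42 years and have 2 dogs")
print(source)
# Stand-in for the decoder output (hypothetical, not produced by Moses here).
decoded = "He vivido NUM{1} años y tengo NUM{2} perros"
print(postprocess(decoded, values))
```

Because the values are looked up by index rather than by position, the restore step still works if the decoder swaps the two placeholders.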