> For this replacement, I need to keep the value of the number
> through the translation, so the best option seems to be to add it as a
> factor? Then all other words of the corpus would need to have an empty
> factor. It's not such an awful problem, but it seems strange.

I have been working on the same thing. I would like to train on sentence pairs like:

I am NUM years old and have NUM cats   ->  Tengo NUM años y tengo NUM gatos
NUM bottles of beer on the wall  ->   NUM botellas de cerveza en la pared
etc.

Then I want to translate:

I have lived 42 years and have 2 dogs

preprocess it to:

I have lived NUM{1} years and have NUM{2} dogs

get this back from the decoder:

He vivido NUM{1} años y tengo NUM{2} perros

and postprocess it to:

He vivido 42 años y tengo 2 perros

The idea is that the NUM token (ignoring the index '{#}') is used for computing
translation/reordering costs, while the output still carries the index (1, 2) so
that I can replace each NUM token with its actual value in postprocessing.  I need
the index to handle multiple numbers in one sentence, since their order may change
in translation.
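
For what it's worth, the pre/postprocessing half is easy to sketch outside the
decoder. The Python below is only an illustration, not anything that exists in
Moses: the function names and the number regex are my own assumptions, and it
assumes the decoder passes the NUM{i} tokens through with their indices intact,
which is exactly the part I do not know how to get.

```python
import re

# Assumption: a "number" is a run of digits, optionally with . or , separators.
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")
PLACEHOLDER_RE = re.compile(r"NUM\{(\d+)\}")


def preprocess(sentence):
    """Replace each number with NUM{i} (1-based) and return the masked
    sentence together with the list of original values."""
    values = []

    def mask(match):
        values.append(match.group(0))
        return "NUM{%d}" % len(values)

    return NUM_RE.sub(mask, sentence), values


def postprocess(translation, values):
    """Put the original values back, using the index inside each NUM{i}."""

    def restore(match):
        index = int(match.group(1))
        if 1 <= index <= len(values):
            return values[index - 1]
        # Unknown index: leave the placeholder as it is rather than guess.
        return match.group(0)

    return PLACEHOLDER_RE.sub(restore, translation)


masked, values = preprocess("I have lived 42 years and have 2 dogs")
# masked is what would be sent to the decoder; the string below stands in
# for the decoder output.
restored = postprocess("He vivido NUM{1} años y tengo NUM{2} perros", values)
```

Because the lookup goes by index rather than by position, this still works if
the decoder reorders the placeholders.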


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
