Dear all,

We have just finished some experiments using placeables and observed 
two issues that may be worth sharing. Others may have run into them, or 
already be aware of them, but just in case:

(1) Special characters must be escaped in the "entity" attribute value. 
Otherwise they cause XML parsing errors at tuning (though not at 
training!), and wrong values are retrieved from the tags. For example, 
we had text containing additional quotation marks, and the translation 
stopped at the first quotation mark instead of yielding the complete 
"entity" value we had encoded.

(2) <ne> tags are added to the translated sentences as if they had been 
treated as tokens during training, i.e. they are not ignored, even 
though they merely contain the placeables.
As an example, the English sentence "Allow simple password" is 
translated as "Permitir simple contraseña <ne translation="@tag@" 
entity="&lt;/1&gt;">@tag@</ne> ."
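
In case it is useful to others, here is a minimal sketch of the escaping 
needed for point (1), assuming the <ne> markup is generated from Python. 
The helper name wrap_placeable is hypothetical (it is not part of Moses), 
and whether the decoder un-escapes every entity in the same way as a 
standard XML parser is an assumption that should be checked on real data.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Placeholder symbol, as in the settings described in this message.
PLACEHOLDER = "@tag@"


def wrap_placeable(original: str) -> str:
    """Wrap a placeable in <ne> markup with an XML-escaped "entity" value.

    Hypothetical helper for illustration only; not part of Moses.
    """
    # escape() handles &, < and >; the extra mapping also escapes double
    # quotes, which would otherwise terminate the attribute value early.
    entity = escape(original, {'"': "&quot;"})
    return f'<ne translation="{PLACEHOLDER}" entity="{entity}">{PLACEHOLDER}</ne>'


# The tag from the example in point (2).
simple = wrap_placeable("</1>")

# A value with quotation marks, similar to the case that failed for us.
quoted = wrap_placeable('say "hi" & <b>')

# Sanity check with a standard XML parser: the full value survives.
recovered = ET.fromstring(quoted).get("entity")
```

With a standard XML parser, "recovered" holds the complete original 
string, including the quotation marks.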

While the first issue is our own fault, we do not know what causes the 
second one. We followed the instructions on the Moses advanced features 
page and specified "extract-settings = "--Placeholder @tag@"" in 
training and "-placeholder-factor 1 -xml-input exclusive" in the decoder 
and evaluation settings. Has anyone experienced the same thing, or does 
anyone know how to solve it?

Thank you very much. Best regards,

Carla

-- 
Carla Parra Escartín
Marie Curie Experienced Researcher - EXPERT ITN
http://expert-itn.eu/
Hermes Traducciones
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support