Hi Wei
You can "protect" certain patterns from the tokeniser using the
-protected <FILENAME> switch. The file should contain regular
expressions, one per line, that match the terms you want to protect.
This can be used to stop the tokeniser from breaking up XML tags. I
have used it for URLs.
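
To illustrate the idea, here is a rough Python sketch. It is not the tokeniser's actual code: the two patterns, the stand-in tokeniser and the function names are all made up for the example. Anything that matches a protected pattern is kept as a single token, and only the text in between is tokenised.

```python
import re

# Hypothetical patterns, written one regex per entry the way they
# might appear one per line in a protected-patterns file.
PROTECTED_PATTERNS = [
    r"</?[A-Za-z][^<>]*>",  # XML-style opening/closing tags
    r"https?://\S+",        # URLs
]


def naive_tokenize(text):
    """Stand-in tokeniser: splits off every punctuation character."""
    return re.findall(r"\w+|[^\w\s]", text)


def tokenize_protected(text, patterns=PROTECTED_PATTERNS):
    """Tokenise text, keeping each protected match as one token."""
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    tokens = []
    pos = 0
    for match in combined.finditer(text):
        # Tokenise the unprotected text before this match as usual.
        tokens.extend(naive_tokenize(text[pos:match.start()]))
        # Emit the protected span untouched.
        tokens.append(match.group(0))
        pos = match.end()
    # Tokenise whatever follows the last protected span.
    tokens.extend(naive_tokenize(text[pos:]))
    return tokens


if __name__ == "__main__":
    line = "see <b>this</b> at http://example.com/a?b=1 now."
    print(" ".join(tokenize_protected(line)))
```

Without protection the stand-in tokeniser would split "<b>" into "<", "b" and ">"; with it, the tags and the URL each come through as one token.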
cheers - Barry
On 24/05/14 18:16, Wei Qiu wrote:
Hi,
Is it also reasonable to use XML markup for tuning?
How can I use XML markup in EMS? I am asking because it seems that the
tokenize step would break the XML tags into tokens.
Thanks in advance.
Best,
Wei
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support