Hi Wei

You can "protect" certain patterns from the tokeniser using the -protected <FILENAME> switch. The file should contain regular expressions that match the terms you want to protect. This can be used to stop the tokeniser from breaking up XML tags; I have used it for URLs.
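
As a rough illustration of the idea, here is a small Python sketch of "protect, tokenise, restore". It is not the Moses tokeniser's code: the two regular expressions, the PROTECTED<n> placeholder and the naive_tokenize stand-in are all made up for the example. The pattern list plays the role of the file passed to -protected, assuming one regular expression per line.

```python
import re

# Hypothetical patterns, standing in for the lines of a protected-patterns file.
PROTECTED_PATTERNS = [
    r"</?[A-Za-z][^<>]*>",   # XML-style tags such as <b> or </b>
    r"https?://\S+",         # URLs
]


def naive_tokenize(text):
    """Stand-in tokeniser: split words from punctuation (not the Moses rules)."""
    return re.findall(r"\w+|[^\w\s]", text)


def tokenize_with_protection(text, patterns=PROTECTED_PATTERNS):
    """Hide protected spans behind placeholders, tokenise, then restore them."""
    protected = []

    def stash(match):
        # Remember the matched span and leave a placeholder token in its place.
        protected.append(match.group(0))
        return f" PROTECTED{len(protected) - 1} "

    for pattern in patterns:
        text = re.sub(pattern, stash, text)

    restored = []
    for token in naive_tokenize(text):
        placeholder = re.fullmatch(r"PROTECTED(\d+)", token)
        if placeholder:
            # Put the original, unsplit span back.
            restored.append(protected[int(placeholder.group(1))])
        else:
            restored.append(token)
    return restored


if __name__ == "__main__":
    sentence = "See <b>this</b> at http://example.com/a?b=1 now."
    print(naive_tokenize(sentence))            # tags and URL get broken up
    print(tokenize_with_protection(sentence))  # tags and URL survive whole
```

The sketch assumes the input never contains a literal PROTECTED<n> token. With a real patterns file it is worth checking that each expression matches only what you intend, since anything it matches is passed through untokenised.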

cheers - Barry

On 24/05/14 18:16, Wei Qiu wrote:
Hi,

Is it also reasonable to use XML markup for tuning?

How can I use XML markup in EMS? I am asking because it seems that the tokenisation step would break the XML tags into tokens.

Thanks in advance.

Best,
Wei


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
