Taylor,
You can have a look at the M4Loc project: http://code.google.com/p/m4loc/
We are working on pre-/post-processing scripts that preserve inline formatting
like the tags you describe. Moses itself has an option to mark up
non-translatable text, such as tags, in XML
(http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc4), but this
does not address how to treat these tags during tokenization and recasing.
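
To give a rough idea of what such pre-/post-processing could look like, here
is a minimal sketch. It is not the M4Loc code. The function names mask_tags
and unmask_tags and the xtagNx placeholder format are made up for
illustration, and the sketch assumes that the tokenizer leaves a purely
alphanumeric placeholder intact and that the decoder copies the unknown
placeholder token to the output unchanged.

```python
import re

# Matches simple opening/closing inline tags such as <bold> or </bold>.
# Tags with attributes are deliberately not handled in this sketch.
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9]*>")

# Hypothetical placeholder format: xtag0x, xtag1x, ...
PLACEHOLDER_RE = re.compile(r"xtag(\d+)x", flags=re.IGNORECASE)


def mask_tags(text):
    """Replace each inline tag with an alphanumeric placeholder token.

    Returns the masked text and the list of original tags, in order.
    """
    tags = []

    def _sub(match):
        tags.append(match.group(0))
        return " xtag%dx " % (len(tags) - 1)

    masked = TAG_RE.sub(_sub, text)
    # Collapse the extra whitespace introduced around placeholders.
    return " ".join(masked.split()), tags


def unmask_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    def _sub(match):
        return tags[int(match.group(1))]

    # Case-insensitive, in case recasing changed the placeholder.
    return PLACEHOLDER_RE.sub(_sub, text)


if __name__ == "__main__":
    masked, tags = mask_tags("Press the <bold>red</bold> button")
    print(masked)
    # Stand-in for the decoder output, with the placeholders reordered
    # along with the words they surround.
    translated = "Pulse el botón xtag0x rojo xtag1x"
    print(unmask_tags(translated, tags))
```

Note that the tags come back separated from the neighbouring words by
spaces, and that a placeholder could collide with a real word in the text,
so a real script would need to handle spacing and pick a safer token.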

Achim 


-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Taylor Rose
Sent: Thursday, September 08, 2011 11:30 AM
To: [email protected]
Subject: [Moses-support] Ignoring Symbols?

Hello,

I've recently started working with Moses as part of my new internship.
The company I work for uses in-house formatting tags in its documents (e.g.
paragraph, bold, indent). Is there a way I can make Moses ignore
these and keep them in the correct position after translation? My first
thought was to somehow tell Moses that <bold> in English should
translate to <bold> in Spanish, but I haven't found a way to do this, if
it is even possible.

I'm still learning Moses so please hold off on the RTFMs. The website is
huge and I've only scratched the surface of the documentation. I would
appreciate any links you could provide to relevant documents.

Thanks,
-- 
Taylor Rose
Machine Translation Intern
Language Intelligence



_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
