You have to escape certain characters. This is done by the script scripts/tokenizer/escape-special-chars.perl
You should also remove non-printing characters with the script scripts/tokenizer/remove-non-printing-char.perl

And of course, tokenize your data using the same method you used to tokenize the training data.

On 6 June 2014 10:48, Andreas Dolinsek <[email protected]> wrote:
> Dear Hieu Hoang,
>
> my name is Andreas and I work for a company called Sonico Mobile, which
> ships translation apps (iTranslate) to millions of people with over 4
> million translation requests per day.
> Check out our website if you are interested: http://www.itranslateapp.com
>
> I installed Moses on a highly scalable platform and wrote an API for it
> (for testing), because we would like to try it out for some use cases. I
> trained it with the europarl parallel corpus.
>
> I have a really basic problem now, I don’t know how to translate special
> chars. I call via my python tornado server:
>
> echo <Text> | moses -f <config-file> -v 0 -threads all
>
> Non-ASCII chars result in an error. How is it possible to feed it with
> special chars?
>
> Thank you very much!
>
> PS: If everything works out, of course we will provide an API access :-)
> --
> Andreas Dolinsek
> http://www.sonicomobile.com
>

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
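The preprocessing suggested in the reply can be chained in front of the decoder from Python. The following is only a minimal sketch, not a confirmed fix for the reported error. It assumes Perl is on the PATH and that the Moses checkout lives under a directory you pass in. The function names (preprocessing_commands, moses_command, run_pipeline) are made up for illustration. The tokenizer step is left as a placeholder because it has to match whatever was used on the training data. The text is sent as UTF-8 bytes on stdin rather than through "echo", which avoids shell quoting of special characters.

```python
import subprocess
from pathlib import Path


def preprocessing_commands(moses_root):
    """Return argv lists for the two cleanup scripts named in the reply.

    moses_root is an assumed path to a Moses checkout. Add your own
    training-time tokenizer command to this list yourself.
    """
    tok_dir = Path(moses_root) / "scripts" / "tokenizer"
    return [
        ["perl", str(tok_dir / "remove-non-printing-char.perl")],
        # <-- insert the tokenizer command used for the training data here
        ["perl", str(tok_dir / "escape-special-chars.perl")],
    ]


def moses_command(config_path):
    """Decoder call with the same options as in the original message."""
    return ["moses", "-f", str(config_path), "-v", "0", "-threads", "all"]


def run_pipeline(text, commands):
    """Pipe text through each command in turn, as UTF-8 bytes on stdin."""
    data = (text.rstrip("\n") + "\n").encode("utf-8")
    for argv in commands:
        # check=True raises CalledProcessError if a step exits non-zero
        result = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    return data.decode("utf-8")
```

A call might then look like run_pipeline(text, preprocessing_commands("/path/to/mosesdecoder") + [moses_command("moses.ini")]), where both paths are placeholders. Check whether your tokenizer already escapes these characters. If it does, running escape-special-chars.perl as well would escape them twice.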
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
