There shouldn't be any problems with diacritics. The decoder itself is encoding-neutral; however, all the scripts assume UTF-8, so it's best to encode both your training data and your input as UTF-8.
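
For the Python wrapper, that could look roughly like the sketch below. It assumes Python 3 and that moses is on the PATH; the function names are made up for illustration, and the only moses options used are the ones from the command quoted further down. The point is simply to hand the decoder explicit UTF-8 bytes instead of relying on the terminal or locale encoding.

```python
import subprocess


def encode_for_moses(text):
    """Return one input line as UTF-8 bytes, terminated by a newline."""
    return (text.strip() + "\n").encode("utf-8")


def moses_command(config_file):
    """Hypothetical helper: build the command line with the flags from the quoted call."""
    return ["moses", "-f", config_file, "-v", "0", "-threads", "all"]


def run_decoder(command, text):
    """Pipe UTF-8 bytes to the command's stdin and decode its stdout as UTF-8."""
    result = subprocess.run(
        command,
        input=encode_for_moses(text),  # bytes, so no locale-dependent encoding is involved
        stdout=subprocess.PIPE,
        check=True,                    # raise if the command exits with an error
    )
    return result.stdout.decode("utf-8").strip()
```

A call would then be something like `run_decoder(moses_command("moses.ini"), "Das Wetter ist schön")`, where `moses.ini` stands in for your own config file. The input still has to be tokenized and escaped the same way as the training data before it reaches the decoder.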

PS: Please send questions to the Moses mailing list rather than to people directly, so that everyone gets a chance to answer.

On 6 June 2014 11:41, Andreas Dolinsek <[email protected]> wrote:

> Thank you for your fast response. Escaping this kind of character seems
> logical. What I meant especially is escaping German diacritics.
>
> E.g. I want to translate the German sentence: „Das Wetter ist schön“
>
> echo "Das Wetter ist schön" | moses -f <config-file> -v 0 -threads all
>
> The terminal does not detect „ö“. Which encoding does moses need for
> non-ASCII chars?
>
> Thanks!
> --
> Andreas Dolinsek
> http://www.sonicomobile.com
>
> On 6 June 2014 at 12:07:24, Hieu Hoang ([email protected]) wrote:
>
> You have to escape certain characters. This is provided by the script
> scripts/tokenizer/escape-special-chars.perl
>
> You should also remove non-printing characters with the script
> scripts/tokenizer/remove-non-printing-char.perl
>
> Of course, tokenize your data using the same method you used to tokenize
> the training data.
>
> On 6 June 2014 10:48, Andreas Dolinsek <[email protected]> wrote:
>
>> Dear Hieu Hoang,
>>
>> my name is Andreas and I work for a company called Sonico Mobile, which
>> ships translation apps (iTranslate) to millions of people with over 4
>> million translation requests per day.
>> Check out our website if you are interested: http://www.itranslateapp.com
>>
>> I installed Moses on a highly scalable platform and wrote an API for it
>> (for testing), because we would like to try it out for some use cases. I
>> trained it with the Europarl parallel corpus.
>>
>> I have a really basic problem now: I don't know how to translate special
>> chars. I call via my Python Tornado server:
>>
>> echo <Text> | moses -f <config-file> -v 0 -threads all
>>
>> Non-ASCII chars result in an error. How is it possible to feed it with
>> special chars?
>>
>> Thank you very much!
>>
>> PS: If everything works out, of course we will provide API access :-)
>> --
>> Andreas Dolinsek
>> http://www.sonicomobile.com
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
