Hi, I do not think the detokenizer would convert ' to ". You can check this by looking at the raw output of the decoder and comparing it with what the detokenizer produces from it.
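One way to do that comparison, as a rough sketch only: take the same sentence before and after detokenization and see at which stage the " first shows up. The function names below (`quote_report`, `blame`) are made up for illustration and are not part of Moses; the inputs are assumed to be plain strings you have already read from the decoder output and from the detokenized output.

```python
import re

# An apostrophe with whitespace directly before or after it,
# e.g. "d ' âge" or "d' âge".
SPACED_APOSTROPHE = re.compile(r"\s'|'\s")


def quote_report(raw: str, detok: str) -> dict:
    """Describe the quote characters in one raw/detokenized line pair."""
    return {
        "double_quote_in_raw": '"' in raw,
        "double_quote_in_detok": '"' in detok,
        "spaced_apostrophe_in_raw": bool(SPACED_APOSTROPHE.search(raw)),
    }


def blame(raw: str, detok: str) -> str:
    """Guess which stage introduced a double quote (hypothetical helper)."""
    report = quote_report(raw, detok)
    if report["double_quote_in_raw"]:
        # The " is already in the decoder output, so it comes from the
        # model or the training corpus, not from the detokenizer.
        return "decoder"
    if report["double_quote_in_detok"]:
        # The " only appears after detokenization.
        return "detokenizer"
    return "none"


if __name__ == "__main__":
    raw_line = 'ce groupe d " âge'    # example raw decoder output
    detok_line = 'ce groupe d "âge'   # example detokenized output
    print(blame(raw_line, detok_line), quote_report(raw_line, detok_line))
```

If `blame` reports "decoder" for the problem sentences, the " is coming out of the translation model, and the same `SPACED_APOSTROPHE` pattern can be run over the training corpus to see how often the apostrophe is surrounded by spaces there.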
-phi

On Wed, Mar 9, 2016 at 11:44 AM, Vincent Nguyen <vngu...@neuf.fr> wrote:
> Hi,
>
> I got the following situation:
>
> This group age
> is translated sometimes in:
> ce groupe d'âge (correct)
> ce groupe d" âge (incorrect)
> ce groupe d "âge (incorrect)
>
> I am wondering if this is more a detokenizer issue or a corpus issue, or
> both.
>
> Technically in French, there shouldn't be any space before or after the
> apostrophe.
> In the Europarl Corpus, as well as in the News2014 one, there are some
> instances with a space before or after.
>
> Then I have the feeling that the decoder gets a ' with surrounding
> spaces leading to the detokenizer to transform into "
>
> Anyone with a similar issue ?
>
> thanks.
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>