Hi, the tokenizer and detokenizer do not fully round-trip: detokenizing a tokenized sentence does not always give back the original string. It is possible to write a tokenizer that does (not easy), but the one that ships with Moses does not do the job.
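
For illustration only, here is a minimal Python sketch of what a reversible scheme could look like. It is not part of Moses: the JOINER marker and the function names are made up for this example. The idea is that the tokenizer marks every token that was glued to the previous one, so the detokenizer knows where not to put a space. It assumes the text uses single spaces, has no leading or trailing whitespace, and never contains the marker itself. It also only helps if the markers survive translation, which is the hard part in practice.

```python
import re

# Hypothetical marker meaning "no space before this token in the original".
# This is an assumption for the sketch, not something Moses defines.
JOINER = "\uffed"

# A token is either a run of word characters or one punctuation character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, prefixing JOINER to tokens glued to the previous one."""
    tokens = []
    prev_end = None
    for match in _TOKEN_RE.finditer(text):
        token = match.group()
        # No gap between this token and the previous one: record that fact.
        if prev_end is not None and match.start() == prev_end:
            token = JOINER + token
        tokens.append(token)
        prev_end = match.end()
    return tokens


def detokenize(tokens):
    """Rebuild the string: a space goes before every token without a JOINER prefix."""
    pieces = []
    for index, token in enumerate(tokens):
        if token.startswith(JOINER):
            pieces.append(token[len(JOINER):])
        else:
            if index > 0:
                pieces.append(" ")
            pieces.append(token)
    return "".join(pieces)


source = ":doc:`/tutorial/aggregation-zip-code-data-set`"
round_trip = detokenize(tokenize(source))
```

With the example from the original question, round_trip comes back identical to source, because every slash, colon, and hyphen carries the marker and is re-attached without a space.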
-phi

On Mon, Jul 14, 2014 at 11:53 AM, Judah Schvimer <[email protected]> wrote:
> Hi,
>
> When I'm using the decoder I have to tokenize my target sentences before I
> translate them. However, when I detokenize them it leaves awkward spaces
> around what was tokenized. is there any way to fix this? It seems to be
> mainly around slashes and colons
>
> Source: :doc:`/tutorial/aggregation-zip-code-data-set`
> Target: : Doc: '/ tutorial / aggregation-zip-code-data-set'
>
> Thanks,
> Judah
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
