Yes, Cyrine. You need to prepare the language model corpus in exactly the same way as the target half of your parallel corpus, escaped characters included.
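
To illustrate why both corpora must match, here is a minimal Python sketch of the kind of escaping the Moses tokenizer applies. It is not the Moses code itself: the function names are hypothetical, and the replacement table is my reading of what tokenizer.perl does by default, so please check it against your own copy of the script. If the language model text skipped this step, it would contain a raw " where the translation model's target side contains &quot;, and the two would no longer share the same tokens.

```python
# Sketch only: mirrors the special-character escaping that the Moses
# tokenizer is assumed to apply. Names below are illustrative, not Moses APIs.

# "&" must be escaped first, so that the ampersands introduced by the
# later replacements are not escaped a second time.
ESCAPES = [
    ("&", "&amp;"),
    ("|", "&#124;"),   # factor separator in Moses
    ("<", "&lt;"),
    (">", "&gt;"),
    ("'", "&apos;"),
    ('"', "&quot;"),
    ("[", "&#91;"),    # brackets mark non-terminals in syntax models
    ("]", "&#93;"),
]


def escape_special_chars(line: str) -> str:
    """Apply the same escaping to every corpus: parallel data and LM data."""
    for raw, escaped in ESCAPES:
        line = line.replace(raw, escaped)
    return line


def deescape_special_chars(line: str) -> str:
    """Undo the escaping on the decoder output, after translation."""
    # Reverse order, so that "&amp;" is restored last.
    for raw, escaped in reversed(ESCAPES):
        line = line.replace(escaped, raw)
    return line


if __name__ == "__main__":
    sentence = 'He said "it\'s fine"'
    escaped = escape_special_chars(sentence)
    print(escaped)
    print(deescape_special_chars(escaped))
```

In practice you would run the real tokenizer over both the target side and the language model text, and leave the conversion back to " and ' to the detokenization step applied to the translated output.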

On 06/20/2014 04:45 PM, Cyrine NASRI wrote:
Thank you, Tom, for your reply.
So I keep these "&quot;" and the like in the language model too?

Best


2014-06-18 16:04 GMT+02:00 Cyrine NASRI:

    Hello,
    I have a concern about the tokenizer script.

    When I do the tokenization, I get some "&quot;" and "&apos;".
    If I leave them in for the training process, I think they damage the
    translation quality?
    So should I really leave them in, or transform them back to " and '
    after training?

    Thank you in advance for your reply

    Best Cyrine




_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
