There was a recent check-in for Korean:
https://github.com/moses-smt/mosesdecoder/commit/194964c017d8acb56918bab94f4d7cdd60b9c9b7
There may also be some Korean-specific or Asian-language-specific tools out there.
On 17/01/18 01:01, daideqi wrote:
Dear Moses-Support,
Some colleagues and I, who are all new to SMT and Moses, have some
Korean parallel corpora that we want to use to train Moses.
My question is how do we go about preparing/tokenizing the data, and
can you recommend any specific tools? I searched the moses-support
archive and the Interwebs and couldn't find any specific
recommendations or step-by-step instructions for newbs like us.
We'd be very grateful if you could point us in the right direction.
Thanks in advance!
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
Hieu Hoang
http://moses-smt.org/