Please subscribe to the Moses mailing list before posting to it. You can subscribe here:
http://mailman.mit.edu/mailman/listinfo/moses-support

To answer your question: each language supported by the tokenizer has its own file in scripts/share/nonbreaking_prefixes. There is currently no file for Hindi. If you create one, please consider sharing it with everyone.
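As a rough starting point, here is a minimal Python sketch of how such a file could be created. It assumes the file follows the same naming pattern as the existing ones in that directory (nonbreaking_prefix.<lang>, so nonbreaking_prefix.hi for Hindi) with one prefix per line and '#' for comment lines. The helper names write_prefix_file and read_prefixes are made up for this example, and the two Hindi abbreviations are placeholders only, not a vetted list.

```python
from pathlib import Path


def write_prefix_file(prefix_dir, lang, prefixes):
    """Write one prefix per line to nonbreaking_prefix.<lang>; return the path."""
    path = Path(prefix_dir) / f"nonbreaking_prefix.{lang}"
    lines = ["# Illustrative non-breaking prefixes; not a vetted list."]
    lines.extend(prefixes)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def read_prefixes(path):
    """Return the prefixes in a file, skipping blank lines and '#' comments."""
    text = Path(path).read_text(encoding="utf-8")
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]


if __name__ == "__main__":
    # Assumed location inside a mosesdecoder checkout; adjust to your setup.
    target = Path("scripts/share/nonbreaking_prefixes")
    if target.is_dir():
        # Placeholder abbreviations (roughly "Dr" and "Prof"); replace with a real list.
        out = write_prefix_file(target, "hi", ["डॉ", "प्रो"])
        print(out, read_prefixes(out))
```

Once the file is in place, running the tokenizer with the language set to hi should pick it up. Do check the result on a sample of your own text, since a good abbreviation list depends on your data.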
Hieu Hoang
http://www.hoang.co.uk/hieu

---------- Forwarded message ----------
From: <[email protected]>
Date: 13 March 2016 at 18:13
Subject: Moses-support post from [email protected] requires approval
To: [email protected]

As list administrator, your authorization is requested for the following mailing list posting:

    List: [email protected]
    From: [email protected]
    Subject: Regarding moses
    Reason: Post by non-member to a members-only list

At your convenience, visit:
http://mailman.mit.edu/mailman/admindb/moses-support
to approve or deny the request.

---------- Forwarded message ----------
From: Parul gupta <[email protected]>
To: [email protected]
Date: Sun, 13 Mar 2016 23:43:45 +0530
Subject: Regarding moses

Hello sir,
I'm working on moses. I'm getting problem in hindi tokenization. For mosesdecoder it's showing no abbreviations for 'hi'. How can i tokenize hindi ?
Thanks !
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
