I need some help creating a Java rule for the Khmer language in
LanguageTool. Would someone be willing to write what I believe is a
simple rule?

You shouldn't need to know Khmer to create the rule. But if you need
clarification on any point please don't hesitate to ask.

Khmer uses zero-width spaces (U+200B) between words, but LanguageTool
treats all types of spaces as the same, so we currently have no way to
tell them apart, which is what this rule needs to do.

The rule needs to check that certain words are preceded by a full space
(U+0020) and not just a zero-width space (U+200B).

So the rule in plain English would be:

Make sure there is a full space (U+0020) before the words ដើម្បី, និង
and ពីព្រោះ
(so far only three words, but more will be added later – they are
conjunctions, so there won't be more than 10 or so words in the end).

Correct examples (please note there are zero-width spaces between the words
as they should be in Khmer):
គាត់​បាន​ទៅ ដើម្បី​ទិញ​ម្ហូប។

ខ្ញុំ និង​គាត់។

គាត់​ចង់​បាន ពីព្រោះ​គាត់​អត់​មាន។

Incorrect examples:

គាត់​បាន​ទៅ​ដើម្បី​ទិញ​ម្ហូប។

ខ្ញុំ​និង​គាត់។

គាត់​ចង់​បាន​ពីព្រោះ​គាត់​អត់​មាន។
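
To make the intended check concrete, here is a rough sketch in standalone
Java. It does not use LanguageTool's rule API at all; the class name
KhmerSpaceCheck and the method findMissingSpaces are made up for
illustration, and a real rule would need to be adapted to however
LanguageTool exposes the raw text. It assumes that a hit is any listed
word directly preceded by U+200B.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of LanguageTool.
class KhmerSpaceCheck {
    private static final char ZWSP = '\u200B';

    // Conjunctions that must be preceded by a full space (U+0020).
    private static final String[] WORDS = { "ដើម្បី", "និង", "ពីព្រោះ" };

    // Returns the start offsets of listed words that directly follow
    // a zero-width space, sorted in ascending order.
    static List<Integer> findMissingSpaces(String text) {
        List<Integer> hits = new ArrayList<>();
        for (String word : WORDS) {
            int from = 0;
            int idx;
            while ((idx = text.indexOf(word, from)) >= 0) {
                // Assumption: only a preceding U+200B counts as an error;
                // sentence-initial words and other contexts are ignored.
                if (idx > 0 && text.charAt(idx - 1) == ZWSP) {
                    hits.add(idx);
                }
                from = idx + word.length();
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        // Zero-width spaces are written as \u200B so they stay visible.
        String incorrect = "ខ្ញុំ\u200Bនិង\u200Bគាត់។";
        String correct = "ខ្ញុំ និង\u200Bគាត់។";
        System.out.println(findMissingSpaces(incorrect));
        System.out.println(findMissingSpaces(correct));
    }
}
```

The suggested fix for each reported offset would be to replace the
zero-width space just before it with a full space.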


Thanks,
Nathan
_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel