Hello again,

I am writing a rule for Khmer that checks for a space (U+0020) before
certain tokens. If the space is missing (or if only a zero-width space,
U+200B, is present), the rule should suggest adding a space before the word.

But it looks like rules in grammar.xml might not be able to tell a
zero-width space from a regular space. Does that have to be done in a
Java rule?

I don't really know Java, so I would rather keep things in grammar.xml
for Khmer.

I tried this, but it didn't work:

<rule id="CONJUNCTION_SPACE" name="Add space before certain conjunctions">
            <pattern>
                <marker>
                    <token spacebefore="no" regexp="yes">(ដើម្បី|ពីព្រោះ|ហើយនិង)</token>
                </marker>
            </pattern>
            <message>Add a full space before this word.
                <suggestion><match no="1" regexp_match="(ដើម្បី|ពីព្រោះ|ហើយនិង)" regexp_replace=" $1"></match></suggestion>
            </message>
            <short>Add a full space before this word.</short>
            <example type="correct">
                គាត់​បាន​ទៅ​<marker> ដើម្បី</marker>​មើល។
            </example>
            <example type="incorrect" correction=" ដើម្បី">
                គាត់​បាន​ទៅ​<marker>ដើម្បី</marker>​មើល។
            </example>
        </rule>

Any help would be much appreciated - thanks!
Nathan