Might a change to the spacebefore detection be an option?
For example, specifying the type of space instead of just "no" or "yes"?

Ruud
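
For illustration only, here is the check that "spacebefore" cannot express in this thread: look at the single character in front of a word and accept only a normal space (U+0020), so a zero-width space (U+200B) does not count. This is a plain Python sketch, not LanguageTool code; the function name missing_real_space and its arguments are made up for the example.

```python
SPACE = "\u0020"             # normal space
ZERO_WIDTH_SPACE = "\u200b"  # zero-width space, used between Khmer words


def missing_real_space(text, words):
    """Return the start offsets of listed words not directly preceded by U+0020.

    Hypothetical helper for illustration; a word at the very start of the
    text is not flagged.
    """
    offsets = []
    for word in words:
        start = text.find(word)
        while start != -1:
            # Only U+0020 counts as a space here; U+200B does not.
            if start > 0 and text[start - 1] != SPACE:
                offsets.append(start)
            start = text.find(word, start + 1)
    return sorted(offsets)


word = "ដើម្បី"
print(missing_real_space("x" + ZERO_WIDTH_SPACE + word, [word]))  # [2]
print(missing_real_space("x" + SPACE + word, [word]))             # []
```

A real rule would still have to be written against LanguageTool's own classes; the sketch only shows the comparison that tells the two kinds of space apart.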

On 23-05-13 12:58, Marcin Miłkowski wrote:
> On 2013-05-23 11:32, Nathan Wells wrote:
>> I just confirmed with some further testing that spacebefore treats a
>> zero-width space as a space, so my rule will not match.
>>
>> Is a Java rule then the only way to proceed?  Or is there some way I can
>> add an exception to "spacebefore" so that it covers all spaces except a
>> zero-width space (U+200B)?
> No. Unfortunately, no.
>
>> The reason is in Khmer there are certain conjunctions that should always
>> have a "Real" space before them, not just a zero-width space, so I am
>> trying to create a rule to detect this.
> I'm afraid that in this particular case, a Java rule would be needed.
>
> Best,
> Marcin
>
>> Thanks,
>> Nathan
>>
>>
>> On Wed, May 22, 2013 at 10:35 PM, Nathan Wells <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>      Yes, I used the "spacebefore" detection (and I could have totally
>>      used it in the wrong way - so that could also be the problem!). But
>>      on my test it seemed that a zero-width space was included as a
>>      "space" so therefore "spacebefore" was true (Khmer words have a
>>      zero-width space between them - I am trying to detect a normal space
>>      before a word).  Am I correct that "spacebefore" will think a
>>      zero-width space is a "space" the same as a normal space?
>>
>>      Is there any way to detect specifically a normal space, and ignore a
>>      zero-width space?
>>
>>      Thanks,
>>      Nathan
>>
>>
>>
>>
>>      On Wed, May 22, 2013 at 10:27 PM, Marcin Miłkowski
>>      <[email protected] <mailto:[email protected]>> wrote:
>>
>>          On 2013-05-22 16:00, Nathan Wells wrote:
>>           > Hello Again,
>>           >
>>           > I am writing a rule for Khmer that tries to detect a space
>>           > (U+0020) before a certain token and, if it is missing (or if
>>           > only a zero-width space, U+200B, is present), to add a space
>>           > before the word.
>>           >
>>           > But it looks like rules in grammar.xml might not be able to
>>           > tell a zero-width space from a normal space. Does that have
>>           > to be done in a Java rule?
>>           >
>>
>>          No. See here:
>>
>>          http://wiki.languagetool.org/tips-and-tricks#toc13
>>
>>          Best regards,
>>          Marcin
>>
>>           > I don't really know Java, so I would rather keep things in
>>           > grammar.xml for Khmer.
>>           >
>>           > I tried this, but it didn't work:
>>           >
>>           >          <rule id="CONJUNCTION_SPACE" name="Add space before certain conjunctions">
>>           >              <pattern>
>>           >                  <marker>
>>           >                      <token spacebefore="no" regexp="yes">(ដើម្បី|ពីព្រោះ|ហើយនិង)</token>
>>           >                  </marker>
>>           >              </pattern>
>>           >              <message>Add a full space before this word.
>>           >                  <suggestion><match no="1" regexp_match="(ដើម្បី|ពីព្រោះ|ហើយនិង)" regexp_replace=" $1"></match></suggestion>
>>           >              </message>
>>           >              <short>Add a full space before this word.</short>
>>           >              <example type="correct">
>>           >                  គាត់​បាន​ទៅ​<marker> ដើម្បី</marker>​មើល។
>>           >              </example>
>>           >              <example type="incorrect" correction=" ដើម្បី">
>>           >                  គាត់​បាន​ទៅ​<marker>ដើម្បី</marker>​មើល។
>>           >              </example>
>>           >          </rule>
>>           >
>>           > Any help would be much appreciated - thanks!
>>           > Nathan
>>           >
>>           >
>>           >
>>          
>> ------------------------------------------------------------------------------
>>           > Try New Relic Now & We'll Send You this Cool Shirt
>>           > New Relic is the only SaaS-based application performance
>>          monitoring service
>>           > that delivers powerful full stack analytics. Optimize and
>>          monitor your
>>           > browser, app, & servers with just a few lines of code. Try
>>          New Relic
>>           > and get this awesome Nerd Life shirt!
>>          http://p.sf.net/sfu/newrelic_d2d_may
>>           >
>>           >
>>           >
>>           > _______________________________________________
>>           > Languagetool-devel mailing list
>>           > [email protected]
>>          <mailto:[email protected]>
>>           > https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>>           >

