On 2013-05-23 11:32, Nathan Wells wrote:
> So I just confirmed with some further testing that spacebefore considers
> a zero-width space as "spacebefore", so my rule will be false.
>
> Is the only way to proceed then a Java rule? Or is there some way I can
> add an exception to "spacebefore" to be all spaces except a zero-width
> space (U+200B)?

No. Unfortunately, no.

> The reason is in Khmer there are certain conjunctions that should always
> have a "Real" space before them, not just a zero-width space, so I am
> trying to create a rule to detect this.

I'm afraid that in this particular case, a Java rule would be needed.

Best,
Marcin

> Thanks,
> Nathan
>
> On Wed, May 22, 2013 at 10:35 PM, Nathan Wells <[email protected]> wrote:
>
> > Yes, I used the "spacebefore" detection (and I could have totally
> > used it in the wrong way - so that could also be the problem!). But
> > on my test it seemed that a zero-width space was included as a
> > "space" so therefore "spacebefore" was true (Khmer words have a
> > zero-width space between them - I am trying to detect a normal space
> > before a word). Am I correct that "spacebefore" will think a
> > zero-width space is a "space" the same as a normal space?
> >
> > Is there any way to detect specifically a normal space, and ignore a
> > zero-width space?
> >
> > Thanks,
> > Nathan
> >
> > On Wed, May 22, 2013 at 10:27 PM, Marcin Miłkowski <[email protected]> wrote:
> >
> > > On 2013-05-22 16:00, Nathan Wells wrote:
> > > > Hello Again,
> > > >
> > > > I am writing a rule trying to detect a space (U+0020) before a certain
> > > > token for Khmer. And if it is not present (or if only a zero-width space
> > > > exists U+200B) to add a space before the words.
> > > >
> > > > But it looks like rules in the grammar.xml might not be able to discern the
> > > > difference between a zero-width space and a space... does that have to be
> > > > done in a Java rule?
> > >
> > > No. See here:
> > >
> > > http://wiki.languagetool.org/tips-and-tricks#toc13
> > >
> > > Best regards,
> > > Marcin
> > >
> > > > I don't really know Java so I would rather keep things in the
> > > > grammar.xml for Khmer.
> > > >
> > > > I tried this, but it didn't work:
> > > >
> > > > <rule id="CONJUNCTION_SPACE" name="Add space before certain conjunctions">
> > > >   <pattern>
> > > >     <marker>
> > > >       <token spacebefore="no" regexp="yes">(ដើម្បី|ពីព្រោះ|ហើយនិង)</token>
> > > >     </marker>
> > > >   </pattern>
> > > >   <message>Add a full space before this word.
> > > >     <suggestion><match no="1" regexp_match="(ដើម្បី|ពីព្រោះ|ហើយនិង)" regexp_replace=" $1"></match></suggestion>
> > > >   </message>
> > > >   <short>Add a full space before this word.</short>
> > > >   <example type="correct">
> > > >     គាត់បានទៅ<marker> ដើម្បី</marker>មើល។
> > > >   </example>
> > > >   <example type="incorrect" correction=" ដើម្បី">
> > > >     គាត់បានទៅ<marker>ដើម្បី</marker>មើល។
> > > >   </example>
> > > > </rule>
> > > >
> > > > Any help would be much appreciated - thanks!
> > > > Nathan
> > > >
> > > > ------------------------------------------------------------------------------
> > > > Try New Relic Now & We'll Send You this Cool Shirt
> > > > New Relic is the only SaaS-based application performance monitoring service
> > > > that delivers powerful full stack analytics. Optimize and monitor your
> > > > browser, app, & servers with just a few lines of code. Try New Relic
> > > > and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_may
> > > > _______________________________________________
> > > > Languagetool-devel mailing list
> > > > [email protected]
> > > > https://lists.sourceforge.net/lists/listinfo/languagetool-devel
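
For anyone following the thread, the check that a Java rule would need to perform can be sketched in plain Java. This is only an illustration of the logic, not LanguageTool's rule API: the class name `RealSpaceCheck` and the method `findMissingRealSpace` are hypothetical, the conjunction list is copied from the quoted rule, and it is assumed that a conjunction at the very start of the text needs no space before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; it does not use any LanguageTool classes.
public class RealSpaceCheck {

    // Conjunctions copied from the grammar.xml rule quoted in this thread.
    private static final Pattern CONJUNCTION =
        Pattern.compile("ដើម្បី|ពីព្រោះ|ហើយនិង");

    /**
     * Returns the start offsets of conjunctions that are not directly
     * preceded by a plain space (U+0020).
     */
    public static List<Integer> findMissingRealSpace(String text) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = CONJUNCTION.matcher(text);
        while (m.find()) {
            int start = m.start();
            // Assumption: a conjunction at the very start of the text is fine.
            // Otherwise only U+0020 counts; a zero-width space (U+200B)
            // or any other character before the word is reported.
            boolean realSpaceBefore = start == 0 || text.charAt(start - 1) == ' ';
            if (!realSpaceBefore) {
                offsets.add(start);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // The incorrect example sentence from the quoted rule.
        String sentence = "គាត់បានទៅដើម្បីមើល។";
        System.out.println(findMissingRealSpace(sentence));
    }
}
```

Comparing the preceding character directly against U+0020 is what separates this from the "spacebefore" attribute, which, per the testing reported above, also treats U+200B as a space. A real rule would still have to map these offsets onto LanguageTool's own match objects.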
