Re: new syntax available

Olivier (Grammalecte) Thu, 08 Oct 2015 01:26:03 -0700

On 07/10/2015 19:39, Dominique Pellé wrote:

> Perhaps Oliver R.  in CC (author of Grammalecte) can comment on
> whether there is an implicit \b at beginning and end of regexps.
> Is the format of Grammalecte rules documented?


In Grammalecte, word boundaries are explicit.

The tags [Word], [word], [Char], [char] are commands that set the
behaviour of the rules that follow them.

[Word] and [word] mean that word boundaries will be added to every regex
in the following rules. [Char] and [char] mean that no word boundaries are
added to the following rules.

[Word] and [Char] mean that rules are case insensitive.
[word] and [char] mean that rules are case sensitive.

But that’s the old way.

In the new beta of Grammalecte (0.5.0b), word boundaries are still
explicit, but it’s easier to set parameters for casing and word boundaries.

At the beginning of each rule, there are tags for parameters and options.

__[i]__  Word boundaries on both sides. Case insensitive.
__<s>__  No word boundaries. Case sensitive.
__[u>__  Word boundary on left side only. Uppercase if you can.
__<i]/optname__  Word boundary on right side only. Case insensitive.
                 Rule active only if optname is True.

When rules are parsed, the parser automatically adds word boundaries
where required.

So

  [Char]
  __typo__  \betc([.][.][.]|…) -> etc. # Un seul point après « etc. »

is now written:

  __[i>/typo__ etc([.][.][.]|…) -> etc. # Un seul point après « etc. »


and

  [Word]
  __tu__  science fiction -> science-fiction   # Il manque…

is now written:

  __[i]/tu__  science fiction -> science-fiction   # Il manque…
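
To make the parsing step concrete, here is a rough Python sketch of how a rule header of the new form could be turned into a regex with the boundaries added. This is not Grammalecte's actual code: `HEADER` and `compile_rule` are names made up for illustration, the handling of the trailing comment is simplified, and the "u" casing mode is not modelled at all.

```python
import re

# Hypothetical header format: __<left><case><right>[/optname]__
# left  is "[" (word boundary) or "<" (none)
# right is "]" (word boundary) or ">" (none)
HEADER = re.compile(r"__([\[<])([isu])([\]>])(?:/(\w+))?__\s*")


def compile_rule(line):
    """Return (compiled regex, replacement, option name or None)."""
    m = HEADER.match(line)
    if m is None:
        raise ValueError("not a rule header: " + line)
    left, case, right, optname = m.groups()

    # Simplification: everything after the first "#" is a comment.
    body = line[m.end():].split("#", 1)[0]
    pattern, replacement = (s.strip() for s in body.split("->", 1))

    # Add word boundaries only on the sides that ask for them.
    if left == "[":
        pattern = r"\b" + pattern
    if right == "]":
        pattern = pattern + r"\b"

    # Only "i" is handled here; "s" and "u" both compile without flags.
    flags = re.IGNORECASE if case == "i" else 0
    return re.compile(pattern, flags), replacement, optname


rx, rep, opt = compile_rule(
    "__[i>/typo__ etc([.][.][.]|…) -> etc. # Un seul point après « etc. »"
)
print(rx.pattern, "|", rep, "|", opt)
```

With the first example above, the sketch rebuilds the same regex that had to be written by hand under [Char], with the left boundary added from the `[` in the header.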


HTH.

Regards,
Olivier


------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
