Jaume Ortolà i Font <jaumeort...@gmail.com> wrote:

> 2015-09-05 16:11 GMT+02:00 Daniel Naber <daniel.na...@languagetool.org>:
>
>> On 2015-09-04 23:21, Dominique Pellé wrote:
>>
>> > I wish I could write a rule pattern like this:
>> >
>> > <tokens>plein temps#chaque fois#rude épreuve#vol d’oiseau</tokens>
>>
>> What about a more radical approach (which would be trickier to
>> implement):
>>
>> <token>a</token>
>> <regex>plein temps|chaque fois|rude épreuve|vol d’oiseau</regex>
>
> Or even more general. Sometimes I wish I could write rules with regular
> expressions that completely ignore the tokenization, taking the whole
> sentence as a string.
>
> In the case of Dominique's rule it would be something like:
>
> search: a (plein temps|chaque fois|rude épreuve|vol d’oiseau)
> and suggest replacing with: à $1
Yes, I was also thinking about that, but I did not dare propose it :-)

My concern was that tokenization is needed for performance, but maybe
that's not true. It is similar to what Daniel wrote earlier as well:

<regex>a (plein temps|chaque fois|rude épreuve|vol d’oiseau)</regex>

It would make some such rules a lot simpler to write and more concise.
Matching without tokenization is what Lightproof and Grammalecte do.

Regards
Dominique
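
To make the sentence-level idea concrete, here is a rough sketch in Python of what such a rule amounts to. It is not LanguageTool code and does not reflect any existing <regex> implementation; the names PATTERN, suggest and apply_rule are made up for illustration, and the word-boundary anchors (\b) are an added assumption to avoid matching inside longer words.

```python
import re

# Hypothetical sentence-level rule: the alternation is the one from this
# thread; the \b anchors are an assumption added for the sketch.
PATTERN = re.compile(
    r"\ba (plein temps|chaque fois|rude épreuve|vol d’oiseau)\b"
)


def suggest(sentence):
    """Return (start, end, suggestion) for each match in the raw sentence."""
    return [
        (m.start(), m.end(), "à " + m.group(1))
        for m in PATTERN.finditer(sentence)
    ]


def apply_rule(sentence):
    """Apply the replacement 'à $1' (written \\1 in Python) to the sentence."""
    return PATTERN.sub(r"à \1", sentence)


if __name__ == "__main__":
    print(apply_rule("Il travaille a plein temps."))
    # prints: Il travaille à plein temps.
```

The match offsets come straight from the sentence string, so no token positions are involved. Whether this is fast enough across a large rule set is exactly the open performance question above.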
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel