Marcin Miłkowski wrote:
> > In Breton at least, I sometimes experience a combinatorial
> > explosion of rules in order to implement what I want. Of
> > course, having many rules probably slows things down.
>
> I don't think that having 16 extra rules changes much, as they usually
> don't match anyway. The slowdown cannot be due to this.
Yes, it won't provide a big speed up. It's more a matter
of making it easier to maintain rules.
> > Another construct which would help to avoid explosion of
> > number of rules is a way to be able to perform several
> > substitutions. Here is an example in Breton:
> >
> > <rule id="DAM" name="da + ma = da’m">
> > <pattern>
> > <token>da</token>
> > <token>ma</token>
> > </pattern>
> > <message>Gwelloc’h eo skrivañ
> > <suggestion>\1’m</suggestion>.</message>
> > <example type="incorrect">Lavaret em eus <marker>da ma</marker>
> > zad.</example>
> > <example type="correct">Lavaret em eus da’m zad.</example>
> > </rule>
> >
> > <rule id="DAZ" name="da + da = da’z">
> > <pattern>
> > <token>da</token>
> > <token>da</token>
> > </pattern>
> > <message>Gwelloc’h eo skrivañ
> > <suggestion>\1’z</suggestion>.</message>
> > <example type="incorrect">Lavaret em eus <marker>da da</marker>
> > dad.</example>
> > <example type="correct">Lavaret em eus da’z tad.</example>
> > </rule>
> >
> > Those 2 rules are almost the same. I wish I could write them in
> > one single rule with the pattern....
>
> Right. You think of conditional search replace (if ma, then ’m; if da,
> then ’z). If it were ma -> ’m and da -> ’d, then you could simply
> replace to ’$1, where you'd match ([dm]) in the regexp. Now, since you
> have ’z as the second replacement, you can try another trick. Simply
> make two <match> elements: first for "ma", second for "da", and make
> sure they are exclusive. One will produce an empty string, and another
> the string you want. I did not test it, but the idea is simple enough.
>
> The only caveat is that I don't remember what <match> does by default if
> it produces an empty string via substitution. For some time, it did
> produce the original string in parentheses, but we can change it easily
> if it still does (I remember I changed some of this because of
> spell-checking).
I now remember trying exactly that a long time ago, hoping it would work,
but it does not: when a <match> element does not match, it
unfortunately outputs the token unchanged rather than an
empty string, which would be more useful in this case. I'm not sure
that behavior can be changed without introducing backward-compatibility
issues.
I remember discussing this "issue" on the mailing list a long time ago.
I proposed a small patch to improve it, which added an optional
regexp_replace_nomatch="..." attribute to the <match> tag.
Ha! I found that discussion in the mailing list archive (2011-10-16):
http://sourceforge.net/mailarchive/forum.php?thread_name=CAON-T_gEtK3NYzi4LHDJoyANQ5R6GMmMeCRxVzTi%3DZ-ctaFk-g%40mail.gmail.com&forum_name=languagetool-devel
With that proposal, I could write a single rule:
<rule id="DAZ" name="da + da = da’z, da + ma = da’m">
<pattern>
<token>da</token>
<token regexp="yes">[dm]a</token>
</pattern>
<message>Gwelloc’h eo skrivañ <suggestion>\1’<match no="2"
regexp_match="da" regexp_replace="z" regexp_replace_nomatch=""/><match
no="2" regexp_match="ma" regexp_replace="m" regexp_replace_nomatch=""/>
</suggestion>.</message>
<example type="incorrect">Lavaret em eus <marker>da ma</marker>
zad.</example>
<example type="correct">Lavaret em eus da’m zad.</example>
<example type="incorrect">Lavaret em eus <marker>da da</marker>
dad.</example>
<example type="correct">Lavaret em eus da’z tad.</example>
</rule>
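To make the intended semantics concrete, here is a small Python sketch of
what the two <match> elements would produce. It is only a model of the
behavior described in this mail, not LanguageTool code: the function names
apply_match and suggestion are made up for illustration, and I assume that
both substitutions are applied to the second token ([dm]a).

```python
import re
from typing import Optional


def apply_match(token: str, regexp_match: str, regexp_replace: str,
                nomatch: Optional[str] = None) -> str:
    """Hypothetical model of a single <match> element (not LanguageTool code)."""
    if re.search(regexp_match, token):
        return re.sub(regexp_match, regexp_replace, token)
    # Behavior described above: a non-matching <match> outputs the token
    # unchanged. With the proposed regexp_replace_nomatch attribute, that
    # attribute's value would be output instead.
    return token if nomatch is None else nomatch


def suggestion(first: str, second: str, nomatch: Optional[str] = None) -> str:
    """Build the suggestion text: first token, apostrophe, then both matches."""
    z = apply_match(second, "da", "z", nomatch)
    m = apply_match(second, "ma", "m", nomatch)
    return f"{first}’{z}{m}"


print(suggestion("da", "da", nomatch=""))  # with the proposed attribute
print(suggestion("da", "ma", nomatch=""))  # with the proposed attribute
print(suggestion("da", "ma"))              # without it: stray "ma" leaks in
```

Without the attribute, the non-matching <match> leaks the original token
into the suggestion, which is why the two-<match> trick does not work today.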
You replied to me privately at the time (not on the mailing list), indicating
that my proposal might be a somewhat hacky solution and that a more
general mechanism would be preferable.
What this general mechanism would be is unclear to me.
Regards
-- Dominique