Daniel Naber <daniel.na...@languagetool.org> wrote:
> Hi,
>
> currently, the LT rules written in XML are language-specific. Is there
> any reason for this limitation? There are some rules that could be used
> for all languages, e.g. misspellings of names, like "Linux Torvalds".
>
> Here's an idea how we could implement that:
>
> -Create a new Maven project languagetool-language-modules/global that
> has a grammar.xml file where the language-independent rules are stored.
> Rules could look like this:
>
> <rule ...>
>   <pattern>
>     <token>Linux</token>
>     <token>Torvalds</token>
>   </pattern>
>   <message>i18n:misspelled_name</message>
>   <suggestion>Linus Torvalds</suggestion>
>   ...
> </rule>
>
> 'misspelled_name' is a key in the existing translation file, so that the
> message can be translated at Transifex. Maybe if there's no translation,
> the rule shouldn't become active?
>
> -Change the dependencies so that every language depends on this new
> module
>
> -Adapt the Java code to load the rules from the new file, additionally
> to the existing rules
>
> Any ideas or comments?
>
> Regards
> Daniel

Half the rule content would have to be customized for each language:
the <message>, the <url>, and the <example>s. Some languages may also
need to add specific exceptions to the pattern. So I'm not sure whether
it's worth adding a feature for this.

Having said that, it would be good to know which rules in one language
could be useful in others. Regarding your example, the French grammar
already has this rule, by the way:

    <rule>
      <pattern>
        <token>Linux</token>
        <token regexp="yes">Th?orvald?s?</token>
      </pattern>
      <message>Écrivez <suggestion>Linus Torvalds</suggestion> s’il s’agit du créateur de Linux.</message>
      <url>https://fr.wikipedia.org/wiki/Linus_Torvalds</url>
      <example type="incorrect"><marker>Linux Torvalds</marker></example>
      <example type="correct">Linus Torvalds</example>
    </rule>
    <rule>
      <pattern>
        <token>Linus</token>
        <token regexp="yes">Th?orvald?s?<exception>Torvalds</exception></token>
      </pattern>
      <message>Écrivez <suggestion>Linus Torvalds</suggestion> s’il s’agit du créateur de Linux.</message>
      <url>https://fr.wikipedia.org/wiki/Linus_Torvalds</url>
      <example type="incorrect"><marker>Linus Thorvalds</marker></example>
      <example type="correct">Linus Torvalds</example>
    </rule>

Many other frequently misspelled names are already covered by the
French grammar rules and could be used in other languages. Some
examples:

- Jimmy Hendrix -> Jimi Hendrix
- Forest Gump -> Forrest Gump
- Axle Rose -> Axl Rose
- Megadeath -> Megadeth
- etc.

Feel free to copy them from the French rule NOM_MAL_EPELE into other
languages.

Regards
Dominique

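As a rough sketch of what copying one of these names into another language could look like, here is the Jimi Hendrix case adapted for English, following the structure of the French rules quoted in this message. The message wording, the example sentences and the choice of the English Wikipedia URL are assumptions made for illustration and are not taken from an existing grammar file. Any enclosing rule group and rule id are left out.

```xml
<!-- Illustrative only: a possible English adaptation of one NOM_MAL_EPELE entry. -->
<!-- The message text, URL and example sentences are assumed, not copied from a real rule. -->
<rule>
  <pattern>
    <token>Jimmy</token>
    <token>Hendrix</token>
  </pattern>
  <message>Write <suggestion>Jimi Hendrix</suggestion> if you mean the guitarist.</message>
  <url>https://en.wikipedia.org/wiki/Jimi_Hendrix</url>
  <example type="incorrect"><marker>Jimmy Hendrix</marker> played at Woodstock.</example>
  <example type="correct">Jimi Hendrix played at Woodstock.</example>
</rule>
```

Only the <pattern> carries over unchanged here. The <message>, <url> and <example>s all had to be rewritten, which is the per-language cost described in this reply.
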
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel