Re: Khmer Rule Help
Thanks Daniel, I hadn't tried the new rule editor - looks nice!

-Nathan

On Sat, Mar 22, 2014 at 3:39 AM, Daniel Naber daniel.na...@languagetool.org wrote:
> On 2014-03-21 17:15, Nathan Wells wrote:
>> Ok, I think I figured it out. Does this look right?
>
> As someone mentioned, the $ isn't necessary. Other than that it looks
> okay, but the tests (testrules.sh or .bat) will tell you if there's a
> problem.
>
> BTW, have you tried the new rule editor at
> http://community.languagetool.org/ruleEditor2 with Khmer?
>
> Regards
> Daniel

--
Learn Graph Databases - Download FREE O'Reilly Book
Graph Databases is the definitive new guide to graph databases and their applications. Written by three acclaimed leaders in the field, this first edition is now available. Download your free book today!
http://p.sf.net/sfu/13534_NeoTech
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
Inter-language Rule
Is there any way to create a rule that goes across languages?

I am trying to create a rule for consistent Khmer punctuation. Users often use English punctuation where they should use Khmer or French punctuation, and I want to correct it. But because the punctuation marks are tagged as English (in OpenOffice, for instance), I can't figure out a way for LanguageTool to detect them.

Examples:

wrong, with English colon: ដូចនេះ:
correct, with Khmer symbol: ដូចនេះ៖

wrong, with English quotes: "តើអ្នកចង់ទៅ?"
correct, with French guillemets (though they are tagged as English in OpenOffice): «តើអ្នកចង់ទៅ?»

Any ideas? Thanks for your time!

Nathan
Re: docbook
On 2014-03-22 13:31, Dave Pawson wrote:
> On 22 March 2014 11:56, Marcin Miłkowski list-addr...@wp.pl wrote:
>> And just to come back to your docbook question: I think it should be
>> fairly easy to create a simple parser that would use AnnotatedText to
>> check docbook format. I don't know whether there are any attributes
>> that contain text content in docbook; if not, then writing a parser
>> should be really easy. We could then include it in the next release
>> of LT.
>>
>> Regards,
>> Marcin
>
> Thanks Marcin. fyi, there seems to be no means to grammar check
> docbook xml and I know many 'book length' texts are written in
> Docbook. There is no 'content' information in attributes - and anyway
> Relax NG validation can check that. It is just the XML content that
> needs checking.

Then we could simply create a very simplistic parser that forwards all textual content of all elements to LT and annotates everything else as non-text. The only trick is that Java XML parsers wouldn't allow us to see entities, raw encoding etc., so we might get a mismatch in character positions in those cases. I'd need to see how this is solved in the Okapi toolkit, where raw XML is prepared for translation in XLIFF.

Are there xml:lang attributes on docbook elements? We could use them to set LT to use the proper language. This is a bit more complex but could work.

Regards,
Marcin
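The "very simplistic parser" idea above can be sketched as follows. This is only an illustration in Python, not LanguageTool code: the names split_docbook and plain_text are hypothetical, and the regular expression is a deliberately naive assumption (no CDATA sections, no '>' inside attribute values). Because it works on the raw string rather than through an XML parser, entities stay untouched and every segment keeps its original character offset, which is the position problem mentioned in the message.

```python
import re

# Naive assumption: comments and tags are markup, everything else is text.
# CDATA sections and '>' inside attribute values are not handled.
MARKUP = re.compile(r"<!--.*?-->|<[^>]+>", re.DOTALL)


def split_docbook(xml):
    """Return (kind, start_offset, chunk) triples covering the whole input."""
    segments = []
    pos = 0
    for match in MARKUP.finditer(xml):
        if match.start() > pos:
            # Raw text between two pieces of markup; entities are left as-is.
            segments.append(("text", pos, xml[pos:match.start()]))
        segments.append(("markup", match.start(), match.group()))
        pos = match.end()
    if pos < len(xml):
        segments.append(("text", pos, xml[pos:]))
    return segments


def plain_text(segments):
    """Join only the text chunks, i.e. what a checker would actually see."""
    return "".join(chunk for kind, _, chunk in segments if kind == "text")


sample = '<para xml:lang="en">You have <emphasis>bear</emphasis> feet.</para>'
segments = split_docbook(sample)
print(plain_text(segments))
```

Each "text" chunk would be forwarded to the checker and each "markup" chunk annotated as non-text; the stored offsets allow a match to be mapped back to its position in the original file.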
New grammar tool.
Looking at an English language book I have bear vs bare.

  <!-- this is an example rule: -->
  <rule id="CONFUSION_OF_BARE_BEAR" name="confusion of bare/bear">
    <pattern>
      <token>bare</token>
    </pattern>
    <message>Did you mean <suggestion>bare</suggestion>?</message>
    <example type="incorrect">You have <marker>bear</marker> feet.</example>
    <example type="correct">You have bare feet.</example>
  </rule>

I'd like to say <token>bare</token> followed by any noun?

I'm getting an error:

  There are problems with your rule:
  The rule did not find the expected error in 'You have bear feet.'
  The sentence was analyzed like this:
  <S> You[you/PRP,B-NP-singular|E-NP-singular] have[have/VB,B-VP] bear[bear/NN:UN,bear/NNS,B-NP-plural] feet[foot/NNS,E-NP-plural].[./.,</S>,O]
  The rule found an unexpected error in 'You have bare feet.'

Suggestions please

TiA

--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk
Semantics of rules?
Schematron[1] is a tool to allow subtle checking of XML content.

My previous example I now find is wrong semantically? I *think* that

  <rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
    <pattern>
      <token>bear</token>
    </pattern>
    <message>Did you mean <suggestion>bare</suggestion> feet?</message>
    <example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
    <example type="correct">Sorry for my bare feet.</example>
  </rule>

is correct.

One Schematron check which could be done is to ensure that

  /rule/pattern/token = /example[@type='incorrect']/marker

Just to check that the examples are the right way round? Would this be helpful?

[1] http://www.schematron.com/

--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk
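For illustration, the proposed consistency check can also be sketched outside Schematron. The snippet below is a minimal sketch in Python using the standard xml.etree.ElementTree module; the function name examples_match_pattern is hypothetical, and the sketch assumes a pattern made of exactly one literal token (no regular expressions, POS tags or inflection), which is far narrower than what real rules allow.

```python
import xml.etree.ElementTree as ET

# The rule from the message above, with its XML markup written out.
RULE = """
<rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
  <pattern>
    <token>bear</token>
  </pattern>
  <message>Did you mean <suggestion>bare</suggestion> feet?</message>
  <example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
  <example type="correct">Sorry for my bare feet.</example>
</rule>
"""


def examples_match_pattern(rule_xml):
    """True if every marked word in an 'incorrect' example equals the token.

    Assumption: the pattern consists of exactly one literal token.
    """
    rule = ET.fromstring(rule_xml.strip())
    tokens = [t.text for t in rule.findall("./pattern/token")]
    markers = [
        m.text for m in rule.findall("./example[@type='incorrect']/marker")
    ]
    if len(tokens) != 1 or not markers:
        return False
    return all(marker == tokens[0] for marker in markers)


print(examples_match_pattern(RULE))
```

Swapping the token back to "bare" while the incorrect example still marks "bear" makes the function return False, which is the "examples the wrong way round" situation the check is meant to catch.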
Re: New grammar tool.
On 2014-03-22 14:35, Dave Pawson wrote:
> I'd like to say <token>bare</token> followed by any noun?

I guess you're still using the old editor, as the new one isn't linked yet. Please try the new one at http://community.languagetool.org/ruleEditor2/index?lang=en and let us know if it works for you.

Regards
Daniel
Re: New grammar tool.
On 22 March 2014 14:05, Daniel Naber daniel.na...@languagetool.org wrote:
> I guess you're still using the old editor, as the new one isn't
> linked yet. Please try the new one at
> http://community.languagetool.org/ruleEditor2/index?lang=en and let
> us know if it works for you.

Initial reaction? Scary. I'm not a grammarian. It is intimidating where the XML wasn't (for me).

Who is it for? Any help available? Any less scary version available?

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk
Re: New grammar tool.
On 2014-03-22 14:35, Dave Pawson wrote:
> Looking at an English language book I have bear vs bare.
>
>   <!-- this is an example rule: -->
>   <rule id="CONFUSION_OF_BARE_BEAR" name="confusion of bare/bear">
>     <pattern>
>       <token>bare</token>
>     </pattern>
>     <message>Did you mean <suggestion>bare</suggestion>?</message>
>     <example type="incorrect">You have <marker>bear</marker> feet.</example>
>     <example type="correct">You have bare feet.</example>
>   </rule>
>
> I'd like to say <token>bare</token> followed by any noun?

  <token>bare</token>
  <token postag="NN.*" postag_regexp="yes"/>

(Note this might be slightly unsafe, as we don't have a strong disambiguator, so some words tagged as nouns could be verbs or adjectives.)

> I'm getting an error

Well, you said you expect "bare" but your incorrect example has "bear". No wonder you get no match.

> There are problems with your rule:
> The rule did not find the expected error in 'You have bear feet.'
> The sentence was analyzed like this:
> <S> You[you/PRP,B-NP-singular|E-NP-singular] have[have/VB,B-VP] bear[bear/NN:UN,bear/NNS,B-NP-plural] feet[foot/NNS,E-NP-plural].[./.,</S>,O]
> The rule found an unexpected error in 'You have bare feet.'
>
> Suggestions please

Try our new rule editor, as it will also show unexpected matches.

Best,
MM

> TiA
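To make the mismatch concrete, here is a rough Python sketch of how a two-token pattern (a literal word followed by a token whose POS tag matches a regular expression) behaves on the analysis quoted above. It is not LanguageTool's matcher: the function find_word_before_noun is hypothetical, and the sketch assumes that the tag expression must match a whole tag and that the word comparison ignores case.

```python
import re

# Readings copied from the analysis quoted in the thread: (token, POS tags).
SENTENCE = [
    ("You", ["PRP"]),
    ("have", ["VB"]),
    ("bear", ["NN:UN", "NNS"]),
    ("feet", ["NNS"]),
]


def find_word_before_noun(tokens, word, postag_regex="NN.*"):
    """Return the indexes where `word` is directly followed by a noun reading.

    Assumptions: the tag regex must match an entire tag, and the word
    comparison is case-insensitive.
    """
    tag = re.compile(postag_regex)
    hits = []
    for i in range(len(tokens) - 1):
        surface, _ = tokens[i]
        _, next_tags = tokens[i + 1]
        if surface.lower() == word and any(tag.fullmatch(t) for t in next_tags):
            hits.append(i)
    return hits


print(find_word_before_noun(SENTENCE, "bear"))  # pattern on the wrong word
print(find_word_before_noun(SENTENCE, "bare"))  # pattern as first written
```

With "bare" as the first token nothing in 'You have bear feet.' can match, which corresponds to the "did not find the expected error" message; with "bear" as the first token the pattern matches at the marked word.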
Re: Semantics of rules?
On 2014-03-22 14:45, Dave Pawson wrote:
> Schematron[1] is a tool to allow subtle checking of XML content.
>
> My previous example I now find is wrong semantically? I *think* that
>
>   <rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
>     <pattern>
>       <token>bear</token>
>     </pattern>
>     <message>Did you mean <suggestion>bare</suggestion> feet?</message>
>     <example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
>     <example type="correct">Sorry for my bare feet.</example>
>   </rule>
>
> is correct.
>
> One Schematron check which could be done is to ensure that
>
>   /rule/pattern/token = /example[@type='incorrect']/marker
>
> Just to check that the examples are the right way round? Would this
> be helpful?
>
> [1] http://www.schematron.com/

Well, we already have this check in place in our JUnit tests. I suspect that adding a single Schematron check just to repeat what is already checked would be overkill. Sure, our check is not done at the XML level but at the level of our rule tests, but the rules have to be tested anyway.

Regards,
Marcin
Re: New grammar tool.
On 2014-03-22 15:39, Dave Pawson wrote:
> Initial reaction? Scary. I'm not a grammarian. It is intimidating
> where the XML wasn't (for me).
>
> Who is it for?

It's for the 99% of people who have never edited an XML file.

> Any help available? Any less scary version available?

A new version is online. It includes more help text, some usability fixes and a quick help for regular expressions. To further improve it, I need more detailed feedback.

Regards
Daniel
Re: prototype of new rule editor
I love it! Amazing work.

Minor issue: Is the A example sentence in the 2 text boxes deliberately wrong?

kb

Daniel Naber wrote thus at 12:51 AM 18-03-14:
> Hi,
>
> there's now a prototype of a new rule editor available at
> http://community.languagetool.org/ruleEditor2/. Main features are:
>
> * Checks the example sentence against known errors so nobody wastes
>   time writing a rule that already exists
> * Has text analysis (POS tags, lemmas, chunks) integrated
> * Checks rule against a part of the Wikipedia/Tatoeba corpus to help
>   avoid false alarms
>
> The basic workflow idea is to start with two example sentences, a
> wrong one and its corrected version. A (trivial) pattern is then
> generated automatically, which is just the word(s) that differ
> between the wrong and the corrected sentence. The user then needs to
> add more tokens to make the rule complete. Finally, it is checked
> against Wikipedia/Tatoeba.
>
> Several things are not supported yet, but please give it a try
> anyway.
>
> Regards
> Daniel