Re: Khmer Rule Help

2014-03-22 Thread Nathan Wells
Thanks Daniel,

I hadn't tried the new rule editor - looks nice!

-Nathan


On Sat, Mar 22, 2014 at 3:39 AM, Daniel Naber daniel.na...@languagetool.org
 wrote:

 On 2014-03-21 17:15, Nathan Wells wrote:

  Ok, I think I figured it out.
 
  Does this look right?

 As someone mentioned, the $ isn't necessary. Other than that it looks
 okay, but the tests (testrules.sh or .bat) will tell you if there's a
 problem.

 BTW, have you tried the new rule editor at
 http://community.languagetool.org/ruleEditor2 with Khmer?

 Regards
   Daniel



 --
 Learn Graph Databases - Download FREE O'Reilly Book
 Graph Databases is the definitive new guide to graph databases and their
 applications. Written by three acclaimed leaders in the field,
 this first edition is now available. Download your free book today!
 http://p.sf.net/sfu/13534_NeoTech
 ___
 Languagetool-devel mailing list
 Languagetool-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/languagetool-devel



Inter-language Rule

2014-03-22 Thread Nathan Wells
Is there any way to create a rule that goes across languages?

I am trying to create a rule for consistent Khmer punctuation. Often users
will use English punctuation when they should use Khmer or French
punctuation and I want to correct it, but because the punctuation marks are
tagged as English (in OpenOffice for instance), I can't figure out a way
for LanguageTool to detect them.


Examples:

wrong with English colon: ដូច​នេះ:
correct with Khmer symbol: ដូច​នេះ៖

wrong with English quotes: "តើ​អ្នក​ចង់​ទៅ?"
correct with French Guillemets (though they are tagged as English in
OpenOffice): «តើ​អ្នក​ចង់​ទៅ?»
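
Independent of how the editor tags the language of each character, the colon case can be detected at the character level. The following is a minimal sketch only, assuming that every ASCII colon directly after a character from the Khmer Unicode block (U+1780 to U+17FF), optionally separated by a zero-width space, should become ៖ (U+17D6). The function name is hypothetical and this is plain Python, not a LanguageTool rule.

```python
import re

KHMER = "\u1780-\u17FF"        # Khmer Unicode block, used inside a character class
CAMNUC_PII_KUUH = "\u17D6"     # the Khmer sign that replaces the colon

# An ASCII colon right after a Khmer character (a zero-width space may sit in between).
ASCII_COLON_AFTER_KHMER = re.compile(f"([{KHMER}]\u200B?):")

def fix_colon(text):
    """Replace an ASCII colon that follows Khmer text with U+17D6."""
    return ASCII_COLON_AFTER_KHMER.sub(
        lambda m: m.group(1) + CAMNUC_PII_KUUH, text
    )

# The "wrong" example from above, written with escapes to keep the code ASCII-safe.
wrong = "\u178A\u17BC\u1785\u200B\u1793\u17C1\u17C7:"
fixed = fix_colon(wrong)
```

Text without Khmer characters before the colon, such as "a: b", is left unchanged.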

Any ideas?

Thanks for your time!
Nathan


Re: docbook

2014-03-22 Thread Marcin Miłkowski
On 2014-03-22 13:31, Dave Pawson wrote:
 On 22 March 2014 11:56, Marcin Miłkowski list-addr...@wp.pl wrote:

 And just to come back to your docbook question: I think it should be
 fairly easy to create a simple parser that would use AnnotatedText to
 check docbook format. I don't know whether there are any attributes that
 contain text content in docbook; if not, then writing a parser should be
 really easy. We could then include it in the next release of LT.

 Regards,
Marcin


 Thanks Marcin.
 fyi, there seems to be no means to grammar check docbook xml
 and I know many 'book length' texts are written in Docbook.

 There is no 'content' information in attributes -and anyway
 Relax NG validation can check that. It is just the XML content
 that needs checking.

Then we could simply create a very simplistic parser that forwards all
textual content of all elements to LT and annotates everything else as
non-text. The only trick is that Java XML parsers wouldn't allow us to
see entities, raw encoding etc., so we might get a mismatch in character
positions in those cases. I'd need to see how this is solved in the Okapi
toolkit, where raw XML is prepared for translation in XLIFF.

Are there xml:lang attributes on docbook elements? We could use them to
set LT to use the proper language. This is a bit more complex but could work.
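
The text/non-text split can be pictured with a rough sketch. It scans the raw XML with a regular expression instead of an XML parser, so entities stay unresolved and every extracted character can be mapped back to its raw position. The function names and the regular expression are hypothetical; this is not LanguageTool's AnnotatedText API, and the sketch ignores CDATA sections and DOCTYPE declarations.

```python
import re

# Comments, processing instructions and tags count as markup (a simplification).
MARKUP = re.compile(r"<!--.*?-->|<\?.*?\?>|<[^>]+>", re.DOTALL)

def segment(raw):
    """Return (kind, start, end) spans covering raw; kind is 'text' or 'markup'."""
    spans = []
    pos = 0
    for m in MARKUP.finditer(raw):
        if m.start() > pos:
            spans.append(("text", pos, m.start()))
        spans.append(("markup", m.start(), m.end()))
        pos = m.end()
    if pos < len(raw):
        spans.append(("text", pos, len(raw)))
    return spans

def plain_text_with_offsets(raw):
    """Join the text spans; offsets[i] is the raw position of plain-text char i."""
    chars, offsets = [], []
    for kind, start, end in segment(raw):
        if kind == "text":
            chars.append(raw[start:end])
            offsets.extend(range(start, end))
    return "".join(chars), offsets

doc = "<para>You have <emphasis>bear</emphasis> feet.</para>"
text, offsets = plain_text_with_offsets(doc)
raw_start = offsets[text.index("bear")]   # position of "bear" in the raw XML
```

A checker would see only `text`; a match position reported there is translated back through `offsets`. Because entities such as `&amp;` are not decoded here, they would reach the checker verbatim, which is the other side of the trade-off mentioned above.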

Regards,
Marcin





New grammar tool.

2014-03-22 Thread Dave Pawson
Looking at an English language book I have
bear vs bare.
<!-- this is an example rule: -->
<rule id="CONFUSION_OF_BARE_BEAR" name="confusion of bare/bear">
<pattern>
<token>bare</token>

</pattern>
<message>Did you mean <suggestion>bare</suggestion>?</message>
<example type="incorrect">You have <marker>bear</marker> feet.</example>
<example type="correct">You have bare feet.</example>
</rule>

I'd like to say <token>bare</token>
followed by any noun?

I'm getting an error

There are problems with your rule:

The rule did not find the expected error in 'You have bear feet.'
The sentence was analyzed like this:
<S> You[you/PRP,B-NP-singular|E-NP-singular] have[have/VB,B-VP]
bear[bear/NN:UN,bear/NNS,B-NP-plural]
feet[foot/NNS,E-NP-plural].[./.,</S>,O]
The rule found an unexpected error in 'You have bare feet.'

Suggestions please


TiA


-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk



Semantics of rules?

2014-03-22 Thread Dave Pawson
Schematron[1] is a tool to allow subtle checking of XML content.
My previous example I now find is wrong semantically?

I *think* that

<rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
<pattern>
<token>bear</token>
</pattern>
<message>Did you mean <suggestion>bare</suggestion> feet?</message>
<example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
<example type="correct">Sorry for my bare feet.</example>
</rule>

is correct. One Schematron check which could be done
is to ensure that /rule/pattern/token = /example[@type='incorrect']/marker

Just to check that the examples are the right way round?
Would this be helpful?
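
For illustration, the same comparison can be sketched outside Schematron with Python's standard XML library. This is only a sketch under the assumption of a single-token pattern with a literal word; the function name is hypothetical and not part of LanguageTool.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
  <pattern>
    <token>bear</token>
  </pattern>
  <message>Did you mean <suggestion>bare</suggestion> feet?</message>
  <example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
  <example type="correct">Sorry for my bare feet.</example>
</rule>
"""

def marker_matches_pattern(rule_xml):
    """True if the incorrect example's marker equals the single pattern token."""
    rule = ET.fromstring(rule_xml)
    tokens = [t.text for t in rule.findall("./pattern/token")]
    markers = [m.text for m in rule.findall("./example[@type='incorrect']/marker")]
    # Only meaningful for one literal token and at least one marked example.
    return len(tokens) == 1 and bool(markers) and all(m == tokens[0] for m in markers)

ok = marker_matches_pattern(RULE_XML)
swapped = marker_matches_pattern(
    RULE_XML.replace("<token>bear</token>", "<token>bare</token>")
)
```

With the rule as written the check passes; with the token changed to "bare", as in the earlier attempt, it fails.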

[1] http://www.schematron.com/

-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk



Re: New grammar tool.

2014-03-22 Thread Daniel Naber
On 2014-03-22 14:35, Dave Pawson wrote:

 I'd like to say <token>bare</token>
 followed by any noun?

I guess you're still using the old editor, as the new one isn't linked 
yet. Please try the new one at 
http://community.languagetool.org/ruleEditor2/index?lang=en and let us 
know if it works for you.

Regards
  Daniel




Re: New grammar tool.

2014-03-22 Thread Dave Pawson
On 22 March 2014 14:05, Daniel Naber daniel.na...@languagetool.org wrote:

 I guess you're still using the old editor, as the new one isn't linked
 yet. Please try the new one at
 http://community.languagetool.org/ruleEditor2/index?lang=en and let us
 know if it works for you.


Initial reaction? Scary. I'm not a grammarian.
It is intimidating where the XML wasn't (for me).
Who is it for? Any help available? Any less scary
version available?

regards




-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk



Re: New grammar tool.

2014-03-22 Thread Marcin Miłkowski
On 2014-03-22 14:35, Dave Pawson wrote:
 Looking at an English language book I have
 bear vs bare.
 <!-- this is an example rule: -->
 <rule id="CONFUSION_OF_BARE_BEAR" name="confusion of bare/bear">
  <pattern>
  <token>bare</token>

  </pattern>
  <message>Did you mean <suggestion>bare</suggestion>?</message>
  <example type="incorrect">You have <marker>bear</marker> feet.</example>
  <example type="correct">You have bare feet.</example>
 </rule>

 I'd like to say <token>bare</token>
 followed by any noun?

<token>bare</token>
<token postag="NN.*" postag_regexp="yes"/>

(note this might be slightly unsafe as we don't have a strong 
disambiguator so some words tagged as nouns could be verbs or adjectives).
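
To make the two-token idea concrete, here is a toy matcher in Python. It is a sketch, not LanguageTool's matching code: the token list is copied from the analysis output quoted in this thread, the helper names are hypothetical, and it assumes the POS regular expression must match a whole tag. It uses "bear", the word from the incorrect example, as the first token.

```python
import re

# (surface form, possible POS tags), taken from the analysis of "You have bear feet."
sentence = [
    ("You", ["PRP"]),
    ("have", ["VB"]),
    ("bear", ["NN:UN", "NNS"]),
    ("feet", ["NNS"]),
    (".", ["."]),
]

def word(w):
    """Pattern element matching a literal word, ignoring case."""
    return lambda tok: tok[0].lower() == w

def postag_regexp(rx):
    """Pattern element matching if any reading's tag fully matches the regexp."""
    compiled = re.compile(rx)
    return lambda tok: any(compiled.fullmatch(tag) for tag in tok[1])

def matches(pattern, tokens):
    """Return the start indices where all pattern elements match consecutive tokens."""
    hits = []
    for i in range(len(tokens) - len(pattern) + 1):
        if all(elem(tokens[i + j]) for j, elem in enumerate(pattern)):
            hits.append(i)
    return hits

hits = matches([word("bear"), postag_regexp("NN.*")], sentence)
no_hits = matches([word("bare"), postag_regexp("NN.*")], sentence)
```

Here `hits` contains the index of "bear", because "feet" carries the tag NNS. Note that "bear" itself is listed with noun tags only, which hints at how much such a rule depends on the tagger's readings.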


 I'm getting an error


Well, your pattern says you expect "bare", but your incorrect example has
"bear". No wonder you get no match.
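
The effect can be reproduced with a small simulation of the two example checks. This is a sketch only, with hypothetical function names and a plain word comparison standing in for real rule matching.

```python
import re

def find_matches(pattern_word, sentence):
    """Return the words of the sentence equal to the single-token pattern."""
    return [w for w in re.findall(r"\w+", sentence) if w.lower() == pattern_word]

def check_examples(pattern_word, incorrect, correct):
    """The incorrect example must match the pattern; the correct one must not."""
    problems = []
    if not find_matches(pattern_word, incorrect):
        problems.append(f"The rule did not find the expected error in '{incorrect}'")
    if find_matches(pattern_word, correct):
        problems.append(f"The rule found an unexpected error in '{correct}'")
    return problems

# Pattern token "bare", as in the posted rule: both checks fail.
as_posted = check_examples("bare", "You have bear feet.", "You have bare feet.")
# Pattern token "bear": both checks pass.
swapped = check_examples("bear", "You have bear feet.", "You have bare feet.")
```

With "bare" as the token both messages from the error report appear; with "bear" the list of problems is empty.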



 There are problems with your rule:

 The rule did not find the expected error in 'You have bear feet.'
 The sentence was analyzed like this:
 <S> You[you/PRP,B-NP-singular|E-NP-singular] have[have/VB,B-VP]
 bear[bear/NN:UN,bear/NNS,B-NP-plural]
 feet[foot/NNS,E-NP-plural].[./.,</S>,O]
 The rule found an unexpected error in 'You have bare feet.'

 Suggestions please

Try to use our new rule editor here, as it will also show unexpected 
matches.

Best,
MM



 TiA






Re: Semantics of rules?

2014-03-22 Thread Marcin Miłkowski
On 2014-03-22 14:45, Dave Pawson wrote:
 Schematron[1] is a tool to allow subtle checking of XML content.
 My previous example I now find is wrong semantically?

 I *think* that

 <rule id="CONFUSION_OF_bare_bear" name="confusion of bare/bear">
  <pattern>
  <token>bear</token>
  </pattern>
  <message>Did you mean <suggestion>bare</suggestion> feet?</message>
  <example type="incorrect">Sorry for my <marker>bear</marker> feet.</example>
  <example type="correct">Sorry for my bare feet.</example>
 </rule>

 is correct. One Schematron check which could be done
 is to ensure that /rule/pattern/token = /example[@type='incorrect']/marker

 Just to check that the examples are the right way round?
 Would this be helpful?

 [1] http://www.schematron.com/

Well, we already have this check in place in our JUnit tests. I suspect
that adding a Schematron check just to verify what is already checked
would be overkill. Admittedly, our check runs at the level of the rule
tests rather than on the XML itself, but the rules have to be tested anyway.

Regards,
Marcin



Re: New grammar tool.

2014-03-22 Thread Daniel Naber
On 2014-03-22 15:39, Dave Pawson wrote:

 Initial reaction? Scary. I'm not a grammarian.
 It is intimidating where the XML wasn't (for me).
 Who is it for?

It's for the 99% of people who have never edited an XML file.

 Any help available? Any less scary
 version available?

A new version is online. It includes more help text, some usability 
fixes and a quick help for regular expressions. To further improve it, I 
need more detailed feedback.

Regards
  Daniel




Re: prototype of new rule editor

2014-03-22 Thread Kumara Bhikkhu
I love it! Amazing work.

Minor issue: Is the A example sentence in the 2 text boxes 
deliberately wrong?

kb

Daniel Naber wrote thus at 12:51 AM 18-03-14:
Hi,

there's now a prototype of a new rule editor available at
http://community.languagetool.org/ruleEditor2/. Main features are:

* Checks the example sentence against known errors so nobody wastes time
writing a rule that already exists

* Has text analysis (POS tags, lemmas, chunks) integrated

* Checks rule against a part of the Wikipedia/Tatoeba corpus to help
avoid false alarms

The basic workflow idea is to start with two example sentences, a wrong
one and its corrected version. A (trivial) pattern is then generated
automatically, which is just the word(s) that differ in the wrong and
corrected sentence. The user then needs to add more tokens to make the
rule complete. Finally, it is checked against Wikipedia/Tatoeba.
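
The automatic pattern generation step might look roughly like the following sketch, which diffs the two sentences word by word with Python's difflib. This is an assumption about the idea, not the editor's actual code, and the function name is hypothetical.

```python
import difflib

def differing_words(wrong, corrected):
    """Return (words only in the wrong sentence, their replacements)."""
    a, b = wrong.split(), corrected.split()
    sm = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    pattern, suggestion = [], []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            pattern.extend(a[i1:i2])      # candidate pattern tokens
            suggestion.extend(b[j1:j2])   # candidate suggestion
    return pattern, suggestion

pattern, suggestion = differing_words(
    "Sorry for my bear feet.", "Sorry for my bare feet."
)
```

For this sentence pair the pattern is the single word "bear" and the suggestion is "bare"; the user would then add context tokens before the corpus check.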

Several things are not supported yet, but please give it a try anyway.

Regards
   Daniel

