RE: External rule files

2014-04-07 Thread Mike Unwalla
Thanks Dave.

 I am not an XML expert. I understand the phrase 'define a transform' to
mean 'specify a mapping'. If my understanding is not correct, please tell
me.

There is not a 1:1 mapping between the term checker postags and the LT
postags. Thus, I cannot define a transform for all the postags, but I can
define a transform for some of them. However, there are possible problems as
the examples below show.

Example 1. Ignoring technical verbs that LT does not 'know', a verb that has
the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However,
although the verb 'do' has the LT postag VB, it does not have the postag
STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and
STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a
rule, you cannot map STE_VERB_LEXICAL_BASE to VB.
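
The exclusion in Example 1 can be sketched in Python. This is only an illustration: the function name, the EXCLUDED_WORDS set, and the sample tag lists are assumptions. Only the postag names and the behaviour of 'do' come from the text above.

```python
# Hypothetical sketch: decide whether a rule written for
# STE_VERB_LEXICAL_BASE can be approximated with the LT postag VB.

# Words that have the LT postag VB but not STE_VERB_LEXICAL_BASE.
# Only 'do' is taken from the message; a real list is probably longer.
EXCLUDED_WORDS = {"do"}


def matches_lexical_base(word, lt_postags):
    """Return True if the word can be treated as STE_VERB_LEXICAL_BASE."""
    if word.lower() in EXCLUDED_WORDS:
        return False
    return "VB" in lt_postags


print(matches_lexical_base("remove", ["VB", "VBP"]))  # True
print(matches_lexical_base("do", ["VB", "VBP"]))      # False
```

Each exception must be listed explicitly, which is why a mapping of this type is only partial.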

Example 2. With an approved 2-word plural noun, the first word has the
postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag
STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical Name',
which is a term from the STE specification.) The 3 terms that follow are
approved 2-word nouns. The LT postags that relate to nouns are different for
the first word. The LT postags for nouns are in brackets:
circuit breakers (NN, NNS)
duty cycles (NN:UN, NNS)
operating systems (-, NNS)
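
The three terms can be put in a small Python table to show the problem. This is a sketch only: the name LT_NOUN_POSTAGS is an assumption, and None is used for the '-' entry (no LT noun postag).

```python
# LT noun postags for each word of the three approved 2-word nouns.
# None stands for '-' (no LT noun postag for that word).
LT_NOUN_POSTAGS = {
    "circuit breakers": ("NN", "NNS"),
    "duty cycles": ("NN:UN", "NNS"),
    "operating systems": (None, "NNS"),
}

first_word_tags = {tags[0] for tags in LT_NOUN_POSTAGS.values()}
second_word_tags = {tags[1] for tags in LT_NOUN_POSTAGS.values()}

# More than one value means that no single LT postag can replace
# STE_TN_NOUN_MULTI_WORD_PLURAL_1 for the first word.
print(len(first_word_tags))  # 3
print(second_word_tags)      # {'NNS'}
```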

In a related e-mail, Marcin wrote: Hm, that means I will have to look at
them and manually create a generic version, if that only is possible. That
is already a big help for me, as it's not trivial to find regularities that
create good disambiguation rules.

Marcin, if a partial mapping helps you, let me know, and I will define one.

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 

-----Original Message-----
From: Dave Pawson [mailto:dave.paw...@gmail.com] 
Sent: 05 April 2014 19:50
To: development discussion for LanguageTool
Subject: Re: External rule files

On 5 April 2014 17:11, Mike Unwalla m...@techscribe.co.uk wrote:
<snip>
 Most of the rules that I developed are specifically for STE and contain
 customized postags. Example:
  <token postag_regexp="yes"
  postag="STE_VERB_LEXICAL_BASE|STE_TVb_BASE|STE_TVb_2_WORD_BASE|PROJECT_TVb_BASE|PROJECT_TVb_2_WORD_BASE"></token>

 The STE rules must be 'fail safe'. To develop rules that give correct
 results with all words in the English lexicon is difficult.

If you can define a transform I'll write a stylesheet to do it
(perhaps leaving the extra tags as comments)

HTH

<snip>


--
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees_APR
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel


RE: External rule files

2014-04-07 Thread Mike Unwalla
 and how you want these in the output, we can start from there.

I think that we have a miscommunication. I don't need a mapping from the STE
postags to the LT postags. I created the STE postags for the term checker
because I can't do what I want to do with only the LT postags.

 I need the XML source markup (is the source XML?)

The source is XML. It is available from
www.simplified-english.co.uk/installation.html in the file
term-checker-evaluation--mm-dd.zip (I do not give the current file name
in this e-mail because the .zip file name contains a date, and I put only
the most recent version of the file on the website.)

But, if 'source markup' means a marked up document in which terms are
annotated with a postag, then no, I do not have source markup. 

 I'm not sure I understand this... If you can express the conditions, then
 I can write a transform based on those conditions.

Yes. (But I don't understand why someone would want this transformation.)

 E.g. (guessing)
   input STE_VERB_LEXICAL_BASE -> VB
   input do                    -> VB
 Although that sounds too simple?

In principle, yes. But the mappings are much more complex. Also, there are
verbs that LT does not 'know' as verbs, such as the approved verb 'safety'.
And there is the not-approved verb 'safety-clip', for which there is no LT
postag (except for what it finds with the chunker
[http://wiki.languagetool.org/using-chunks]).

 then maps to ... Again I do not understand the English explanation,
 perhaps an XML example?
 following terms - are these XML children (nested within the parent)
 or siblings?

Sorry, I don't know how to give an XML example. There is no formal XML
specification for the STE postags. I used the method that is in 'Adding only
POS tags or tokens'
(http://wiki.languagetool.org/developing-a-disambiguator#toc8).

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 



-----Original Message-----
From: Dave Pawson [mailto:dave.paw...@gmail.com] 
Sent: 07 April 2014 12:55
To: development discussion for LanguageTool
Subject: Re: External rule files

On 7 April 2014 11:08, Mike Unwalla m...@techscribe.co.uk wrote:
 Thanks Dave.

  I am not an XML expert. I understand the phrase 'define a transform' to
 mean 'specify a mapping'. If my understanding is not correct, please tell
 me.

That's right.
As a trial, if you give me a few examples,
and how you want these in the output, we can start from there.



 There is not a 1:1 mapping between the term checker postags and the LT
 postags. Thus, I cannot define a transform for all the postags, but I can
 define a transform for some of them. However, there are possible problems
 as the examples below show.

I need the XML source markup (is the source XML?)
  XSLT works on XML in and XML out.



 Example 1. Ignoring technical verbs that LT does not 'know', a verb that
 has the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However,
 although the verb 'do' has the LT postag VB, it does not have the postag
 STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and
 STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a
 rule, you cannot map STE_VERB_LEXICAL_BASE to VB.

I'm not sure I understand this... If you can express the conditions, then I
can write a transform based on those conditions.
E.g. (guessing)
  input STE_VERB_LEXICAL_BASE -> VB
  input do                    -> VB
Although that sounds too simple?





 Example 2. With an approved 2-word plural noun, the first word has the
 postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag
 STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical
 Name', which is a term from the STE specification.) The 3 terms that follow
 are approved 2-word nouns. The LT postags that relate to nouns are different
 for the first word. The LT postags for nouns are in brackets:
 circuit breakers (NN, NNS)
 duty cycles (NN:UN, NNS)
 operating systems (-, NNS)

STE_TN_NOUN_MULTI_WORD_PLURAL_1 + STE_TN_NOUN_MULTI_WORD_PLURAL_2
(written as
<xsl:template
match="STE_TN_NOUN_MULTI_WORD_PLURAL_1[following-sibling::STE_TN_NOUN_MULTI_WORD_PLURAL_2[1]]">
)


then maps to ... Again I do not understand the English explanation,
perhaps an XML example?
following terms - are these XML children (nested within the parent)
or siblings?
<p>
  <child/>
</p>
<sibling/>



regards





-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk



Re: External rule files

2014-04-07 Thread Dave Pawson
On 7 April 2014 14:43, Mike Unwalla m...@techscribe.co.uk wrote:
 and how you want these in the output, we can start from there.

 I think that we have a miscommunication. I don't need a mapping from the STE
 postags to the LT postags. I created the STE postags for the term checker
 because I can't do what I want to do with only the LT postags.

Yes, I think we do have a difference of understanding.


 I need the XML source markup (is the source XML?)

 The source is XML. It is available from
 www.simplified-english.co.uk/installation.html in the file
 term-checker-evaluation--mm-dd.zip (I do not give the current file name
 in this e-mail because the .zip file name contains a date, and I put only
 the most recent version of the file on the website.)

 But, if 'source markup' means a marked up document in which terms are
 annotated with a postag, then no, I do not have source markup.

No, I was thinking of a definition of the valid syntax, both for your format
and for the one that is required: either a schema or a DTD.
Examples of marked-up text would suffice, but that would take longer.



 I'm not sure I understand this... If you can express the conditions, then
 I can write a transform based on those conditions.

 Yes. (But I don't understand why someone would want this transformation.)

My assumption (I may be wrong):
you have many files marked up using schema A (or simply a tagset A), and
you want to transform these files to use a more recent LT tagset.

If we can share an understanding of the tagset, and how to get from one
to the other, I can help automate it.





 E.g. (guessing)
   input STE_VERB_LEXICAL_BASE -> VB
   input do                    -> VB
 Although that sounds too simple?

 In principle, yes. But the mappings are much more complex. Also, there are
 verbs that LT does not 'know' as verbs, such as the approved verb 'safety'.
 And there is the not-approved verb 'safety-clip', for which there is no LT
 postag (except for what it finds with the chunker
 [http://wiki.languagetool.org/using-chunks]).

No problem. For 'unknowns' I will mark the items as <unknown original="xxx">,
where xxx is the source markup.
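
The handling of 'unknowns' could look like this in Python. It is a minimal sketch, assuming a small hand-made mapping table: PARTIAL_MAP and convert_postag are hypothetical names, and the single mapping entry is an example only. The element name 'unknown' and the attribute 'original' are from the sentence above.

```python
import xml.etree.ElementTree as ET

# Hypothetical partial mapping from STE postags to LT postags.
PARTIAL_MAP = {"STE_TN_NOUN_MULTI_WORD_PLURAL_2": "NNS"}


def convert_postag(ste_postag):
    """Return a <token> element for a mapped postag, else an <unknown> element."""
    lt_postag = PARTIAL_MAP.get(ste_postag)
    if lt_postag is None:
        # Keep the source markup so that no information is lost.
        return ET.Element("unknown", {"original": ste_postag})
    return ET.Element("token", {"postag": lt_postag})


for tag in ("STE_TN_NOUN_MULTI_WORD_PLURAL_2", "STE_TN_NOUN_MULTI_WORD_PLURAL_1"):
    print(ET.tostring(convert_postag(tag), encoding="unicode"))
```

Because the original postag is kept in the attribute, a later pass can find every item that was not converted.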


 then maps to ... Again I do not understand the English explanation,
 perhaps an XML example?
 following terms - are these XML children (nested within the parent)
 or siblings?

 Sorry, I don't know how to give an XML example. There is no formal XML
 specification for the STE postags. I used the method that is in 'Adding only
 POS tags or tokens'
 (http://wiki.languagetool.org/developing-a-disambiguator#toc8).

The link points to XML? If that is not available, then XSLT will
not help?

regards

(Oh the joys of miscommunication :-)

Dave P






Re: External rule files

2014-04-07 Thread Marcin Miłkowski
On 2014-04-07 15:58, Dave Pawson wrote:
 On 7 April 2014 14:43, Mike Unwalla m...@techscribe.co.uk wrote:
 and how you want these in the output, we can start from there.

 I think that we have a miscommunication. I don't need a mapping from the STE
 postags to the LT postags. I created the STE postags for the term checker
 because I can't do what I want to do with only the LT postags.

 Yes, I think we do have a difference of understanding.


 I need the XML source markup (is the source XML?)

 The source is XML. It is available from
 www.simplified-english.co.uk/installation.html in the file
 term-checker-evaluation--mm-dd.zip (I do not give the current file name
 in this e-mail because the .zip file name contains a date, and I put only
 the most recent version of the file on the website.)

 But, if 'source markup' means a marked up document in which terms are
 annotated with a postag, then no, I do not have source markup.

 No, I was thinking of the valid syntax of your form to that which is required?
 Either a schema or DTD.
Examples of marked up text would suffice, just take longer?



 I'm not sure I understand this... If you can express the conditions, then
 I can write a transform based on those conditions.

 Yes. (But I don't understand why someone would want this transformation.)

 My assumption. I may be wrong.
 You have many files marked up using schema A. (or simply a tagset A)
 You want to transform these files to use a more recent LT tagset.

 If we can share an understanding of the tagset, and how to get from one
 to the other, I can help automate it.


No, Mike does not want to transform or retag his files. He's using a 
specialized tagset, and that's fine. I simply want to steal some of his 
disambiguation rules, but for that, I'll have to use my brain instead of 
my Ctrl+C/Ctrl+V ;)

Best,
Marcin






RE: External rule files

2014-04-05 Thread Mike Unwalla
Hi All,

 But maybe the standard LT would benefit from your rules as well?

I am happy to donate all or some of the rules that I developed for STE issue
3. The most recent version of the rules is on
www.simplified-english.co.uk/installation.html. 

Most of the rules that I developed are specifically for STE and contain
customized postags. Example:
<token postag_regexp="yes"
postag="STE_VERB_LEXICAL_BASE|STE_TVb_BASE|STE_TVb_2_WORD_BASE|PROJECT_TVb_BASE|PROJECT_TVb_2_WORD_BASE"></token>

The STE rules must be 'fail safe'. To develop rules that give correct
results with all words in the English lexicon is difficult. 

 I don't want to make the rule set for the journal part of the standard
 distribution, as they are quite specific. At the same time, I want to use
 standard rules. So I simply want to open the additional rule set before I
 make the check.

This is similar to my situation. Also, when I check a text, I use more than
one rule set. The STE rules that are on the simplified-english website are
the 'core', as defined by the STEMG (www.asd-ste100.org). For each project,
I have a grammar file and a disambiguation file
(www.simplified-english.co.uk/design.html has a picture). When I check a
text, I use both the core STE files and the project files.

Some scenarios for the use of user files are as follows:
* Single-user environment. User wants to use standalone LT and LT in
OpenOffice. Currently, the user must copy/paste the files from the
standalone directory to an OpenOffice directory. (Testrules is available
only with standalone, thus, to develop user rules, that version of LT is
always necessary.)
* Multi-user environment. Grammar and disambiguation files are on a server.
LT accesses these files only.
* Multi-user environment. Grammar and disambiguation files are on a server.
LT simultaneously accesses these files and project-specific grammar files
that are on a user's computer.

Possibly, one option is to split the disambiguation file into 2 parts. (And
similarly with the grammar file.) The first part is only a 'wrapper', which
refers to the default LT disambiguation file:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE doc [
<!ENTITY DefaultLTDisambiguation SYSTEM
"org/languagetool/resource/en/disambiguation-default.xml">
]>
<rules lang="en" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://svn.code.sf.net/p/languagetool/code/trunk/languagetool/languagetool-core/src/main/resources/org/languagetool/resource/disambiguation.xsd">

&DefaultLTDisambiguation; <!-- The content of the current
disambiguation.xml, but without the rules element -->

<!--An explanation of how to add external entities goes here. -->
</rules>

'Out of the box', LT works as usual. However, a user can edit the 'wrapper'
disambiguation file to make LT use other rule sets. 

Possible problem 1: Because the user can install LT anywhere, the path for
DefaultLTDisambiguation must be relative to the installation directory. But,
that can cause a validating XML editor to show an error and not open the
file. If the user wants to use a validating XML editor, the solution is to
edit the file with the full path.

Possible problem 2: Dave Pawson suggested that XInclude is preferable to
entities (http://sourceforge.net/p/languagetool/mailman/message/32177932/).

Possible problem 3: Each time that the user updates LT, the user must edit
the 'wrapper' disambiguation file or copy/paste from the previous LT
version. (But, with the integrate attribute, presumably a user must specify
the location of the user file(s), so the same problem exists with that.)
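
For comparison, the XInclude mechanism from possible problem 2 can be tried with the Python standard library. This is a sketch of the mechanism only; it does not show that LT can read such a file. The file names, the rule ids, and the in-memory FILES table are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical wrapper file: it contains no rules, only references.
WRAPPER = """<rules lang="en" xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="disambiguation-default.xml"/>
  <xi:include href="project-disambiguation.xml"/>
</rules>"""

# Stand-ins for files on disk, so that the example is self-contained.
FILES = {
    "disambiguation-default.xml":
        '<rulegroup id="LT"><rule id="LT_RULE"/></rulegroup>',
    "project-disambiguation.xml":
        '<rulegroup id="PROJECT"><rule id="PROJECT_RULE"/></rulegroup>',
}


def load_from_table(href, parse, encoding=None):
    """Loader for ElementInclude: return the parsed root of the named file."""
    return ET.fromstring(FILES[href])


def expand(wrapper_xml):
    """Parse the wrapper and replace each xi:include with the file content."""
    root = ET.fromstring(wrapper_xml)
    ElementInclude.include(root, loader=load_from_table)
    return root


rule_ids = [rule.get("id") for rule in expand(WRAPPER).iter("rule")]
print(rule_ids)  # ['LT_RULE', 'PROJECT_RULE']
```

Each included file needs one root element, so the sketch uses a rulegroup element for that purpose. As with the entity method, a relative href is resolved by the loader, so the location of the files remains a question.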

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 


-----Original Message-----
From: Marcin Milkowski [mailto:list-addr...@wp.pl] 
Sent: 05 April 2014 08:03
To: languagetool-devel@lists.sourceforge.net
Subject: Re: External rule files

On 2014-04-04 19:24, Mike Unwalla wrote:
 Hi All,

   I'm not sure why Mike Unwalla doesn't want to use our disambiguation
 rules

 I do not have a fundamental objection to using the LT disambiguation file
 with the STE rules. Part of the reason that I now do not use the LT
 disambiguation rules is historical.

 The LT disambiguation rules are not sufficient for the STE term checker.
 Examples:
 * A part-of-speech disambiguator is necessary (primarily for noun/verb
 disambiguation).
 * Each term that is in the STE specification must be specified in the
 disambiguation rules with its approved and not-approved parts of speech.

 When I started to write the STE disambiguation rules, I did not know how
 to add rules to an external file
 (http://wiki.languagetool.org/tips-and-tricks#toc2). Therefore, the
 disambiguation file was in
 <installation path>\org\languagetool\resource\en.

 If I add the STE rules at the end of the LT disambiguation file, each time
 that I update LT, I must copy/paste the STE rules into the new LT
 disambiguation file. If some part of the new LT disambiguation has an
 effect on the STE rules, I must change the STE

Re: External rule files

2014-04-05 Thread Dave Pawson
On 5 April 2014 17:11, Mike Unwalla m...@techscribe.co.uk wrote:
 Hi All,

 But maybe the standard LT would benefit from your rules as well?

 I am happy to donate all or some of the rules that I developed for STE issue
 3. The most recent version of the rules is on
 www.simplified-english.co.uk/installation.html.

 Most of the rules that I developed are specifically for STE and contain
 customized postags. Example:
  <token postag_regexp="yes"
  postag="STE_VERB_LEXICAL_BASE|STE_TVb_BASE|STE_TVb_2_WORD_BASE|PROJECT_TVb_BASE|PROJECT_TVb_2_WORD_BASE"></token>

 The STE rules must be 'fail safe'. To develop rules that give correct
 results with all words in the English lexicon is difficult.

If you can define a transform I'll write a stylesheet to do it
(perhaps leaving the extra tags as comments)

HTH


