Hi All,

> I'm not sure why Mike Unwalla doesn't want to use our disambiguation
> rules

I do not have a fundamental objection to using the LT disambiguation file
with the STE rules. Part of the reason that I now do not use the LT
disambiguation rules is historical.

The LT disambiguation rules are not sufficient for the STE term checker.
Examples:
* A part-of-speech disambiguator is necessary (primarily for noun/verb
disambiguation).
* Each term that is in the STE specification must be specified in the
disambiguation rules with its approved and not-approved parts of speech.

When I started to write the STE disambiguation rules, I did not know how to
add rules to an external file
(http://wiki.languagetool.org/tips-and-tricks#toc2). Therefore, the
disambiguation file was in <installation path>\org\languagetool\resource\en.

If I add the STE rules at the end of the LT disambiguation file, each time
that I update LT, I must copy/paste the STE rules into the new LT
disambiguation file. If some part of the new LT disambiguation has an effect
on the STE rules, I must change the STE rules. Most of the rules in the LT
disambiguation file are not applicable to the STE rules. Therefore, my
easiest option was to write a completely new disambiguation file.

> Maybe there's place for a third value, when you want to use the existing
> language with its tokenization, tagger and all, but you don't want to use
> its rules (integrate="replace_only_rules").

What is the difference between this third option and the second option,
where you replace the LT rules with customized rules?

> I also added a new (now unused) attribute to <rules> element but the idea
> is simple: If you have integrate="add", then rules will be added,

Why is this attribute necessary? What are the problems with an external rule
file (http://wiki.languagetool.org/tips-and-tricks#toc2) that the new
attribute solves?
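
To make sure that I understand the proposal, here is a minimal sketch of an
external rule file that uses the attribute. The attribute name and the values
"add" and "replace" are from Marcin's message. The category, the rule ID, the
token, and the message are placeholders that I made up for illustration. The
attribute is not yet supported by the code, so this is only an assumption
about how it will look.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the integrate attribute is proposed, not implemented. -->
<!-- integrate="add": these rules are added to the existing LT rules. -->
<!-- integrate="replace": these rules replace the existing LT rules. -->
<rules lang="en" integrate="add">
  <!-- Placeholder category and rule, not from the STE rule set. -->
  <category name="Placeholder category">
    <rule id="PLACEHOLDER_RULE" name="Placeholder rule">
      <pattern>
        <token>example</token>
      </pattern>
      <message>Placeholder message.</message>
    </rule>
  </category>
</rules>
```

If this sketch is correct, the only difference from an external rule file
today is the integrate attribute on the <rules> element.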

Thanks and regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 



-----Original Message-----
From: Marcin Milkowski [mailto:list-addr...@wp.pl] 
Sent: 04 April 2014 11:44
To: languagetool-devel
Subject: External rule files
<snip>

I also added a new (now unused) attribute to <rules> element but the 
idea is simple: If you have integrate="add", then rules will be added, 
rather than replace the existing ones (integrate="replace", the default 
value?). Maybe there's place for a third value, when you want to use the 
existing language with its tokenization, tagger and all, but you don't 
want to use its rules (integrate="replace_only_rules"). I will add the 
code that supports the attributes as soon as we're clear which ones we need.

Any ideas? I think this is related to STE term checker, which could 
probably benefit from the third option -- I'm not sure why Mike Unwalla 
doesn't want to use our disambiguation rules, however. But if someone 
needs a special scenario in which disambiguation rules are added, we 
could have a special .zip format with two files: grammar and 
disambiguation. This is however a bit more complex.

Regards,
Marcin

------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
