On 2014-04-05 18:11, Mike Unwalla wrote:
> Hi All,
>
>> But maybe the standard LT would benefit from your rules as well?
>
> I am happy to donate all or some of the rules that I developed for STE issue
> 3. The most recent version of the rules is on
> www.simplified-english.co.uk/installation.html.
>
> Most of the rules that I developed are specifically for STE and contain
> customized postags. Example:
>   <token postag_regexp="yes"
>     postag="STE_VERB_LEXICAL_BASE|STE_TVb_BASE|STE_TVb_2_WORD_BASE|PROJECT_TVb_BASE|PROJECT_TVb_2_WORD_BASE"></token>

Hm, that means I will have to look at them and manually create a generic 
version, if that is possible at all. Even so, that is already a big help 
for me, as it's not trivial to find regularities that make good 
disambiguation rules.

> The STE rules must be 'fail safe'. To develop rules that give correct
> results with all words in the English lexicon is difficult.
>
>> I don't want to make the rule set for the journal part of the standard
>> distribution, as they are quite specific. At the same time, I want to use
>> standard rules. So I simply want to open the additional rule set before I
>> make the check.
>
> This is similar to my situation. Also, when I check a text, I use more than
> one rule set. The STE rules that are on the simplified-english website are
> the 'core', as defined by the STEMG (www.asd-ste100.org). For each project,
> I have a grammar file and a disambiguation file
> (www.simplified-english.co.uk/design.html has a picture). When I check a
> text, I use both the core STE files and the project files.
>
> Some scenarios for the use of user files are as follows:
> * Single-user environment. User wants to use standalone LT and LT in
> OpenOffice. Currently, the user must copy/paste the files from the
> standalone directory to an OpenOffice directory. (Testrules is available
> only with standalone, thus, to develop user rules, that version of LT is
> always necessary.)
> * Multi-user environment. Grammar and disambiguation files are on a server.
> LT accesses these files only.
> * Multi-user environment. Grammar and disambiguation files are on a server.
> LT simultaneously accesses these files and project-specific grammar files
> that are on a user's computer.
>
> Possibly, one option is to split the disambiguation file into 2 parts. (And
> similarly with the grammar file.) The first part is only a 'wrapper', which
> refers to the default LT disambiguation file:
>
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE doc [
> <!ENTITY DefaultLTDisambiguation SYSTEM
> "org/languagetool/resource/en/disambiguation-default.xml">
> ]>
> <rules lang="en" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:noNamespaceSchemaLocation="http://svn.code.sf.net/p/languagetool/code/trunk/languagetool/languagetool-core/src/main/resources/org/languagetool/resource/disambiguation.xsd">
>
> &DefaultLTDisambiguation; <!-- The content of the current
> disambiguation.xml, but without the rules element -->
>
> <!--An explanation of how to add external entities goes here. -->
> </rules>
>
> 'Out of the box', LT works as usual. However, a user can edit the 'wrapper'
> disambiguation file to make LT use other rule sets.
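
For illustration, a wrapper edited in that way might look roughly like 
the sketch below. The entity name ProjectDisambiguation and its file 
path are hypothetical placeholders, disambiguation-default.xml is the 
file proposed above (it does not exist in the current distribution), 
and the sketch assumes that the XML parser used by LT is allowed to 
resolve external entities. The schema attributes from the original 
wrapper are left out for brevity.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE doc [
<!-- Proposed default file: the current rules without the rules element -->
<!ENTITY DefaultLTDisambiguation SYSTEM
  "org/languagetool/resource/en/disambiguation-default.xml">
<!-- Hypothetical user addition: name and path are placeholders -->
<!ENTITY ProjectDisambiguation SYSTEM
  "file:///path/to/project/disambiguation-project.xml">
]>
<rules lang="en">
  <!-- Default LT rules are expanded first -->
  &DefaultLTDisambiguation;
  <!-- Project-specific rules are expanded after them -->
  &ProjectDisambiguation;
</rules>
```

Each referenced file would have to contain only the rule elements, with 
no XML declaration-level DOCTYPE and no enclosing rules element, because 
its content is pasted in place of the entity reference.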

Basically, this is a hack. For your scenarios, it would be much easier 
to be able to:

(1) load a special rule set via the GUI (both in standalone and 
Libre/OpenOffice versions);

(2) save the configuration including the special rule set (i.e., the 
path) and if the file is still found, display it in the user interface 
as one of the available options;

(3) probably make it possible to customize the rule set, which means it 
should rather be a folder than a zip file, including some kind of 
manifest file, and optionally a disambiguation file and a rule file.
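
As a rough sketch of point (3), the manifest in such a folder might look 
something like the following. Everything here is hypothetical: LT has no 
such manifest format, and the element names, attribute names and file 
names are invented only to show the idea of a folder that names an 
optional disambiguation file and an optional rule file.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical manifest for a user rule-set folder; not an LT format -->
<ruleset name="Example project rules" lang="en">
  <!-- Optional: disambiguation rules, path relative to this folder -->
  <disambiguation file="disambiguation.xml"/>
  <!-- Optional: grammar rules, path relative to this folder -->
  <grammar file="grammar.xml"/>
</ruleset>
```

With something like this, the saved configuration would only need to 
store the path of the folder.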

Then there will be no more problems with inclusion, and the file will 
validate fine (once we solve the outstanding problems with some 
attributes defined as IDREFs). As far as I can see, this would be much 
more flexible and would allow LT to be adapted for many purposes in a 
way that is much more user-friendly and less error-prone.

Regards,
Marcin

------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
