Thanks Dave. I am not an XML expert. I understand the phrase 'define a transform' to mean 'specify a mapping'. If my understanding is not correct, please tell me.
There is not a 1:1 mapping between the term checker postags and the LT postags. Thus, I cannot define a transform for all the postags, but I can define a transform for some of them. However, there are possible problems as the examples below show.

Example 1. Ignoring technical verbs that LT does not 'know', a verb that has the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However, although the verb 'do' has the LT postag VB, it does not have the postag STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a rule, you cannot map STE_VERB_LEXICAL_BASE to VB.

Example 2. With an approved 2-word plural noun, the first word has the postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical Name', which is a term from the STE specification.)

The 3 terms that follow are approved 2-word nouns. The LT postags that relate to nouns are different for the first word. The LT postags for nouns are in brackets:

circuit breakers (NN, NNS)
duty cycles (NN:UN, NNS)
operating systems (-, NNS)

In a related e-mail, Marcin wrote:

Hm, that means I will have to look at them and manually create a generic version, if that only is possible. That is already a big help for me, as it's not trivial to find regularities that create good disambiguation rules.

Marcin, if a partial mapping helps you, let me know, and I will define one.

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm

-----Original Message-----
From: Dave Pawson [mailto:dave.paw...@gmail.com]
Sent: 05 April 2014 19:50
To: development discussion for LanguageTool
Subject: Re: External rule files

On 5 April 2014 17:11, Mike Unwalla <m...@techscribe.co.uk> wrote:
<snip>
> Most of the rules that I developed are specifically for STE and contain
> customized postags.
> Example:
> <token postag_regexp="yes"
> postag="STE_VERB_LEXICAL_BASE|STE_TVb_BASE|STE_TVb_2_WORD_BASE|PROJECT_TVb_BASE|PROJECT_TVb_2_WORD_BASE"></token>
>
> The STE rules must be 'fail safe'. To develop rules that give correct
> results with all words in the English lexicon is difficult.

If you can define a transform I'll write a stylesheet to do it
(perhaps leaving the extra tags as comments)

HTH
<snip>
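
For illustration only, the partial mapping discussed in Example 1 of this message could be sketched in Python as below. The names PARTIAL_MAP, EXCLUSIONS and transform are hypothetical and are not part of LanguageTool or the term checker. The only facts used are the ones stated in the message: STE_VERB_LEXICAL_BASE usually corresponds to the LT postag VB, and 'do' must be excluded. A postag that has no entry in the table is reported as unmapped.

```python
from typing import List, Optional, Tuple

# Hypothetical partial mapping: term checker postag -> LT postag.
# Only the pair stated in Example 1 is listed. Postags that have no
# reliable LT counterpart are deliberately left out.
PARTIAL_MAP = {
    "STE_VERB_LEXICAL_BASE": "VB",
}

# Words that have the LT postag but not the term checker postag.
# A rule that uses the LT postag must exclude these words.
EXCLUSIONS = {
    "STE_VERB_LEXICAL_BASE": {"do"},
}


def transform(ste_postag: str) -> Optional[Tuple[str, List[str]]]:
    """Return (LT postag, words to exclude), or None if there is no mapping."""
    lt_postag = PARTIAL_MAP.get(ste_postag)
    if lt_postag is None:
        return None
    return lt_postag, sorted(EXCLUSIONS.get(ste_postag, set()))


if __name__ == "__main__":
    for tag in ("STE_VERB_LEXICAL_BASE", "STE_TN_NOUN_MULTI_WORD_PLURAL_1"):
        print(tag, "->", transform(tag))
```

A stylesheet or script that applies such a table could keep each unmapped postag as a comment instead of deleting it. Thus, no information from the original rule is lost.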