> and how you want these in the output, we can start from there.

I think that we have a miscommunication. I don't need a mapping from the STE postags to the LT postags. I created the STE postags for the term checker because I can't do what I want to do with only the LT postags.

> I need the XML source markup (is the source XML?)

The source is XML. It is available from www.simplified-english.co.uk/installation.html in the file term-checker-evaluation-yyyy-mm-dd.zip (I do not give the current file name in this e-mail because the .zip file name contains a date, and I put only the most recent version of the file on the website.) But, if 'source markup' means a marked up document in which terms are annotated with a postag, then no, I do not have source markup.

> I'm not sure I understand this... If you can express the conditions, then I can
> write a transform based on those conditions.

Yes. (But I don't understand why someone would want this transformation.)

> E.g. (guessing)
> input <STE_VERB_LEXICAL_BASE> -> <VB>
>
> input <do> -> <VB>
> Although that sounds too simple?

In principle, yes. But the mappings are much more complex. Also, there are verbs that LT does not 'know' as verbs, such as the approved verb 'safety'. And there is the not-approved verb 'safety-clip', for which there is no LT postag (except for what it finds with the chunker [http://wiki.languagetool.org/using-chunks]).

> then maps to ... Again I do not understand the English explanation,
> perhaps an XML example?
> "following terms" - are these XML children (nested within the parent)
> or siblings?

Sorry, I don't know how to give an XML example. There is no formal XML specification for the STE postags. I used the method that is in 'Adding only POS tags or tokens' (http://wiki.languagetool.org/developing-a-disambiguator#toc8).

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm

-----Original Message-----
From: Dave Pawson [mailto:dave.paw...@gmail.com]
Sent: 07 April 2014 12:55
To: development discussion for LanguageTool
Subject: Re: External rule files

On 7 April 2014 11:08, Mike Unwalla <m...@techscribe.co.uk> wrote:
> Thanks Dave.
>
> I am not an XML expert. I understand the phrase 'define a transform' to
> mean 'specify a mapping'.
> If my understanding is not correct, please tell
> me.

That's right. As a trial, if you give me a few examples, and how you want these in the output, we can start from there.

>
> There is not a 1:1 mapping between the term checker postags and the LT
> postags. Thus, I cannot define a transform for all the postags, but I can
> define a transform for some of them. However, there are possible problems as
> the examples below show.

I need the XML source markup (is the source XML?)
XSLT works on XML in and XML out.

>
> Example 1. Ignoring technical verbs that LT does not 'know', a verb that has
> the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However,
> although the verb 'do' has the LT postag VB, it does not have the postag
> STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and
> STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a
> rule, you cannot map STE_VERB_LEXICAL_BASE to VB.

I'm not sure I understand this... If you can express the conditions, then I can write a transform based on those conditions.
E.g. (guessing)
input <STE_VERB_LEXICAL_BASE> -> <VB>

input <do> -> <VB>
Although that sounds too simple?

>
> Example 2. With an approved 2-word plural noun, the first word has the
> postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag
> STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical Name',
> which is a term from the STE specification.) The 3 terms that follow are
> approved 2-word nouns. The LT postags that relate to nouns are different for
> the first word. The LT postags for nouns are in brackets:
> circuit breakers (NN, NNS)
> duty cycles (NN:UN, NNS)
> operating systems (-, NNS)

<STE_TN_NOUN_MULTI_WORD_PLURAL_1> + <STE_TN_NOUN_MULTI_WORD_PLURAL_2>
(written as
<xsl:template match="STE_TN_NOUN_MULTI_WORD_PLURAL_1[following-sibling::STE_TN_NOUN_MULTI_WORD_PLURAL_2[1]]">)
then maps to ...

Again I do not understand the English explanation, perhaps an XML example?
"following terms" - are these XML children (nested within the parent) or siblings? <p> <child/> </p> <sibling/> regards -- Dave Pawson XSLT XSL-FO FAQ. Docbook FAQ. http://www.dpawson.co.uk ---------------------------------------------------------------------------- -- Put Bad Developers to Shame Dominate Development with Jenkins Continuous Integration Continuously Automate Build, Test & Deployment Start a new project now. Try Jenkins in the cloud. http://p.sf.net/sfu/13600_Cloudbees_APR _______________________________________________ Languagetool-devel mailing list Languagetool-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/languagetool-devel ------------------------------------------------------------------------------ Put Bad Developers to Shame Dominate Development with Jenkins Continuous Integration Continuously Automate Build, Test & Deployment Start a new project now. Try Jenkins in the cloud. http://p.sf.net/sfu/13600_Cloudbees_APR _______________________________________________ Languagetool-devel mailing list Languagetool-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/languagetool-devel