> and how you want these in the output, we can start from there.

I think that we have a miscommunication. I don't need a mapping from the STE
postags to the LT postags. I created the STE postags for the term checker
because I can't do what I want to do with only the LT postags.

> I need the XML source markup (is the source XML?)

The source is XML. It is available from
www.simplified-english.co.uk/installation.html in the file
term-checker-evaluation-yyyy-mm-dd.zip (I do not give the current file name
in this e-mail because the .zip file name contains a date, and I put only
the most recent version of the file on the website.)

But, if 'source markup' means a marked up document in which terms are
annotated with a postag, then no, I do not have source markup. 

> I'm not sure I understand this... If you can express the conditions, then
> I can write a transform based on those conditions.

Yes. (But I don't understand why someone would want this transformation.)

> E.g. (guessing)
>   input <STE_VERB_LEXICAL_BASE> -> <VB>
> 
> input <do>   -> <VB>
> Although that sounds too simple?

In principle, yes. But the mappings are much more complex. Also, there are
verbs that LT does not 'know' as verbs, such as the approved verb 'safety'.
And there is the not-approved verb 'safety-clip', for which there is no LT
postag (except for what it finds with the chunker
[http://wiki.languagetool.org/using-chunks]).
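
To show why a simple 1:1 rule fails, here is a minimal sketch in Python. The postag names are the ones used in this thread. The two dictionaries and the function name are hypothetical and are only for illustration; they are not part of LanguageTool or of the term checker. The entry for 'remove' and the LT postag NN for 'safety' are assumptions.

```python
# Hypothetical sample data. Only the postag names come from this thread.
STE_TAGS = {
    "do": {"STE_VERB_AUXILIARY_DO", "STE_VERB_AUXILIARY_CAN_DO_MUST_WILL"},
    "remove": {"STE_VERB_LEXICAL_BASE"},   # assumed example of an ordinary verb
    "safety": {"STE_VERB_LEXICAL_BASE"},   # approved verb that LT does not 'know'
}
LT_TAGS = {
    "do": {"VB"},
    "remove": {"VB"},
    "safety": {"NN"},                      # assumption: LT has only a noun reading
}


def mapping_exceptions(ste_tag, lt_tag):
    """Return the words for which 'ste_tag <-> lt_tag' does not hold.

    The first list has words that have ste_tag but not lt_tag.
    The second list has words that have lt_tag but not ste_tag.
    """
    missing_lt = sorted(
        word for word, tags in STE_TAGS.items()
        if ste_tag in tags and lt_tag not in LT_TAGS.get(word, set())
    )
    missing_ste = sorted(
        word for word, tags in LT_TAGS.items()
        if lt_tag in tags and ste_tag not in STE_TAGS.get(word, set())
    )
    return missing_lt, missing_ste


if __name__ == "__main__":
    print(mapping_exceptions("STE_VERB_LEXICAL_BASE", "VB"))
```

With this sample data, 'safety' is in the first list and 'do' is in the second list. Thus, a transform in one direction or the other needs an exception list for each pair of postags.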

> then maps to ... Again I do not understand the English explanation,
> perhaps an XML example?
> "following terms" - are these XML children (nested within the parent)
> or siblings?

Sorry, I don't know how to give an XML example. There is no formal XML
specification for the STE postags. I used the method that is in 'Adding only
POS tags or tokens'
(http://wiki.languagetool.org/developing-a-disambiguator#toc8).

Regards,

Mike Unwalla
Contact: www.techscribe.co.uk/techw/contact.htm 



-----Original Message-----
From: Dave Pawson [mailto:dave.paw...@gmail.com] 
Sent: 07 April 2014 12:55
To: development discussion for LanguageTool
Subject: Re: External rule files

On 7 April 2014 11:08, Mike Unwalla <m...@techscribe.co.uk> wrote:
> Thanks Dave.
>
>  I am not an XML expert. I understand the phrase 'define a transform' to
> mean 'specify a mapping'. If my understanding is not correct, please tell
> me.

That's right.
As a trial, if you give me a few examples,
and how you want these in the output, we can start from there.


>
> There is not a 1:1 mapping between the term checker postags and the LT
> postags. Thus, I cannot define a transform for all the postags, but I can
> define a transform for some of them. However, there are possible problems
> as the examples below show.

I need the XML source markup (is the source XML?)
  XSLT works on XML in and XML out.


>
> Example 1. Ignoring technical verbs that LT does not 'know', a verb that
> has the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However,
> although the verb 'do' has the LT postag VB, it does not have the postag
> STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and
> STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a
> rule, you cannot map STE_VERB_LEXICAL_BASE to VB.

I'm not sure I understand this... If you can express the conditions, then
I can write a transform based on those conditions.
E.g. (guessing)
  input <STE_VERB_LEXICAL_BASE> -> <VB>

input <do>   -> <VB>
 Although that sounds too simple?




>
> Example 2. With an approved 2-word plural noun, the first word has the
> postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag
> STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical
> Name', which is a term from the STE specification.) The 3 terms that follow
> are approved 2-word nouns. The LT postags that relate to nouns are different
> for the first word. The LT postags for nouns are in brackets:
> circuit breakers (NN, NNS)
> duty cycles (NN:UN, NNS)
> operating systems (-, NNS)

<STE_TN_NOUN_MULTI_WORD_PLURAL_1> + <STE_TN_NOUN_MULTI_WORD_PLURAL_2>
(written as
<xsl:template
match="STE_TN_NOUN_MULTI_WORD_PLURAL_1[following-sibling::STE_TN_NOUN_MULTI_WORD_PLURAL_2[1]]">
)

then maps to ... Again I do not understand the English explanation,
perhaps an XML example?
"following terms" - are these XML children (nested within the parent)
or siblings?
<p>
  <child/>
</p>
<sibling/>
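
To make the sibling case concrete, here is a rough sketch in Python rather than XSLT. It assumes a markup in which each word is a <token> element with the postag in an attribute, and all tokens are siblings inside one <sentence>. That markup, the postag value OTHER, and the function name are my guesses for illustration only; they are not a format that LanguageTool or the term checker is known to produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one <token> per word, all siblings under <sentence>.
DOC = """
<sentence>
  <token postag="STE_TN_NOUN_MULTI_WORD_PLURAL_1">circuit</token>
  <token postag="STE_TN_NOUN_MULTI_WORD_PLURAL_2">breakers</token>
  <token postag="OTHER">trip</token>
</sentence>
"""


def two_word_plurals(xml_text):
    """Return (first, second) word pairs where a PLURAL_1 token is
    immediately followed by a PLURAL_2 sibling token."""
    root = ET.fromstring(xml_text)
    tokens = list(root)  # direct children of <sentence>, siblings of each other
    pairs = []
    # Walk over each token together with the sibling that directly follows it.
    for first, second in zip(tokens, tokens[1:]):
        if (first.get("postag") == "STE_TN_NOUN_MULTI_WORD_PLURAL_1"
                and second.get("postag") == "STE_TN_NOUN_MULTI_WORD_PLURAL_2"):
            pairs.append((first.text, second.text))
    return pairs


if __name__ == "__main__":
    print(two_word_plurals(DOC))
```

If the second word were nested inside the first (a child rather than a sibling), the loop over adjacent siblings would not find the pair, which is why I am asking about the structure.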



regards





-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

----------------------------------------------------------------------------
--
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment 
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees_APR
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel


