On 7 April 2014 14:43, Mike Unwalla <m...@techscribe.co.uk> wrote:
>> and how you want these in the output, we can start from there.
>
> I think that we have a miscommunication. I don't need a mapping from the STE
> postags to the LT postags. I created the STE postags for the term checker
> because I can't do what I want to do with only the LT postags.

Yes, I think we do have a difference of understanding.

>
>> I need the XML source markup (is the source XML?)
>
> The source is XML. It is available from
> www.simplified-english.co.uk/installation.html in the file
> term-checker-evaluation-yyyy-mm-dd.zip (I do not give the current file name
> in this e-mail because the .zip file name contains a date, and I put only
> the most recent version of the file on the website.)
>
> But, if 'source markup' means a marked up document in which terms are
> annotated with a postag, then no, I do not have source markup.

No, I meant a definition of the valid syntax of your format:
either a schema or a DTD.
Examples of marked-up text would suffice; it would just take longer.


>
>> I'm not sure I understand this... If you can express the conditions, then
>> I can write a transform based on those conditions.
>
> Yes. (But I don't understand why someone would want this transformation.)

My assumption (I may be wrong):
you have many files marked up using schema A (or simply a tagset A), and
you want to transform these files to use a more recent LT tagset.

If we can share an understanding of the tagset, and how to get from one
to the other, I can help automate it.




>
>> E.g. (guessing)
>>   input <STE_VERB_LEXICAL_BASE> -> <VB>
>>
>> input <do>   -> <VB>
>> Although that sounds too simple?
>
> In principle, yes. But the mappings are much more complex. Also, there are
> verbs that LT does not 'know' as verbs, such as the approved verb 'safety'.
> And there is the not-approved verb 'safety-clip', for which there is no LT
> postag (except for what it finds with the chunker
> [http://wiki.languagetool.org/using-chunks]).

No problem. For 'unknowns' I will mark the items as <unknown original="xxx">
where xxx is the source markup.
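
As a rough sketch only, an XSLT 1.0 stylesheet along these lines could do
it. The element names are my guesses, since there is no formal XML
specification for the STE postags: the single STE_VERB_LEXICAL_BASE -> VB
rule and the pass-through of the root element are hypothetical, and the
real mappings would need the extra conditions you describe.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the document has one wrapper element; copy it as-is. -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy attributes and text through unchanged. -->
  <xsl:template match="@* | text()">
    <xsl:copy/>
  </xsl:template>

  <!-- Guessed mapping: one known source tag to one LT tag. -->
  <xsl:template match="STE_VERB_LEXICAL_BASE">
    <VB>
      <xsl:apply-templates/>
    </VB>
  </xsl:template>

  <!-- Fallback: any other element is wrapped as 'unknown', keeping the
       source element name in the 'original' attribute. -->
  <xsl:template match="*">
    <unknown original="{name()}">
      <xsl:apply-templates/>
    </unknown>
  </xsl:template>

</xsl:stylesheet>
```

The fallback template has the lowest default priority, so it only fires
for elements that no more specific rule matches. Each new mapping is then
just one more template.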

>
>> then maps to ... Again I do not understand the English explanation,
>> perhaps an XML example?
>> "following terms" - are these XML children (nested within the parent)
>> or siblings?
>
> Sorry, I don't know how to give an XML example. There is no formal XML
> specification for the STE postags. I used the method that is in 'Adding only
> POS tags or tokens'
> (http://wiki.languagetool.org/developing-a-disambiguator#toc8).

The link points to XML? If XML is not available, then XSLT will
not help.

regards

(Oh the joys of miscommunication :-)

Dave P




>
> -----Original Message-----
> From: Dave Pawson [mailto:dave.paw...@gmail.com]
> Sent: 07 April 2014 12:55
> To: development discussion for LanguageTool
> Subject: Re: External rule files
>
> On 7 April 2014 11:08, Mike Unwalla <m...@techscribe.co.uk> wrote:
>> Thanks Dave.
>>
>>  I am not an XML expert. I understand the phrase 'define a transform' to
>> mean 'specify a mapping'. If my understanding is not correct, please tell
>> me.
>
> That's right.
> As a trial, if you give me a few examples,
> and how you want these in the output, we can start from there.
>
>
>>
>> There is not a 1:1 mapping between the term checker postags and the LT
>> postags. Thus, I cannot define a transform for all the postags, but I can
>> define a transform for some of them. However, there are possible problems
>> as the examples below show.
>
> I need the XML source markup (is the source XML?)
>   XSLT works on XML in and XML out.
>
>
>>
>> Example 1. Ignoring technical verbs that LT does not 'know', a verb that
>> has the postag STE_VERB_LEXICAL_BASE usually has the LT postag VB. However,
>> although the verb 'do' has the LT postag VB, it does not have the postag
>> STE_VERB_LEXICAL_BASE. (It has the postags STE_VERB_AUXILIARY_DO and
>> STE_VERB_AUXILIARY_CAN_DO_MUST_WILL.) Thus, without excluding 'do' from a
>> rule, you cannot map STE_VERB_LEXICAL_BASE to VB.
>
> I'm not sure I understand this... If you can express the conditions, then I
> can write a transform based on those conditions.
> E.g. (guessing)
>   input <STE_VERB_LEXICAL_BASE> -> <VB>
>
> input <do>   -> <VB>
>  Although that sounds too simple?
>
>
>
>
>>
>> Example 2. With an approved 2-word plural noun, the first word has the
>> postag STE_TN_NOUN_MULTI_WORD_PLURAL_1 and the second word has the postag
>> STE_TN_NOUN_MULTI_WORD_PLURAL_2. (TN is an abbreviation of 'Technical
>> Name', which is a term from the STE specification.) The 3 terms that follow
>> are approved 2-word nouns. The LT postags that relate to nouns are different
>> for the first word. The LT postags for nouns are in brackets:
>> circuit breakers (NN, NNS)
>> duty cycles (NN:UN, NNS)
>> operating systems (-, NNS)
>
> <STE_TN_NOUN_MULTI_WORD_PLURAL_1> + <STE_TN_NOUN_MULTI_WORD_PLURAL_2>
> (written as
> <xsl:template match="STE_TN_NOUN_MULTI_WORD_PLURAL_1[following-sibling::STE_TN_NOUN_MULTI_WORD_PLURAL_2[1]]">)
>
> then maps to ... Again I do not understand the English explanation,
> perhaps an XML example?
> "following terms" - are these XML children (nested within the parent)
> or siblings?
> <p>
>   <child/>
> </p>
> <sibling/>
>
>
>
> regards
>
>
>
>
>
> --
> Dave Pawson
> XSLT XSL-FO FAQ.
> Docbook FAQ.
> http://www.dpawson.co.uk
>



-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment 
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees_APR
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
