Hi there,
> I would like to attach attributes to lemmas. Only a few but maybe there
> could be more, so a kind of introducing an attribute name would be nice,
> instead of having a predefined set of attribute names..
Lemmas as such aren't represented explicitly in Apertium dictionaries. They 
are part of the lexical forms: one could say that the lemma is the 
material from the beginning of the lexical form up to where the first 
part-of-speech tag appears. For instance, for surface form "thought" an 
English dictionary would derive the lexical forms "thought<n><sg>" and 
"think<vblex>...". The lemmas would then be "thought" and "think". There 
is an attribute lm="...." in some entries, but it is optional.
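
To make the "everything before the first tag" rule concrete, here is a 
minimal Python sketch. The function name split_lexical_form is made up 
for illustration and is not part of any Apertium tool; it only assumes 
that tags are written between angle brackets, as in the examples above.

```python
import re


def split_lexical_form(lexical_form):
    """Split a lexical form into (lemma, list of tag names).

    Hypothetical helper: the lemma is taken to be everything before
    the first '<', and each tag is the text between '<' and '>'.
    """
    lemma, sep, rest = lexical_form.partition("<")
    tags = re.findall(r"<([^<>]+)>", sep + rest)
    return lemma, tags


print(split_lexical_form("thought<n><sg>"))  # ('thought', ['n', 'sg'])
print(split_lexical_form("think<vblex>"))    # ('think', ['vblex'])
```

A lexical form without any tag would simply come back as the lemma with 
an empty tag list.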
> I believe there are already lemma attributes, such as the word class of
> the lemma: noun, verb, adjective, adverb etc.
Not for lemmas. Lemma information is encoded as the content of 
the element (see above). Part of speech, as well as other morphological 
information, is encoded as attributes of the <s> (symbol) element.
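
As a rough illustration, the sketch below reads a simplified, 
hand-written entry with Python's standard xml.etree.ElementTree. The 
entry is an assumption made for the example (real dictionaries mostly 
use paradigms rather than spelling out each form), and read_right_side 
is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified entry written for this example, not copied from a real
# dictionary: the lemma is plain text, the tags are <s n="..."/> elements.
ENTRY = (
    '<e lm="thought"><p><l>thought</l>'
    '<r>thought<s n="n"/><s n="sg"/></r></p></e>'
)


def read_right_side(entry_xml):
    """Return (lemma text, tag names) from the <r> side of one entry."""
    entry = ET.fromstring(entry_xml)
    right = entry.find("./p/r")
    lemma = right.text or ""                       # text before the first <s>
    tags = [s.get("n") for s in right.findall("s")]  # the n attribute of each <s>
    return lemma, tags


print(read_right_side(ENTRY))  # ('thought', ['n', 'sg'])
```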
> what I have in mind is to attach data from wordnet, such as sense,
> hypernym, hyponym, holonym, meronym, and also combine it with the
> Swedish SALDO attributes of father and mother relations.
>
> The idea is then to choose a sense of a homonym based on the shortest
> distance to maybe the previous and following five words.
>
> a lemma may have more than one sense. Eg 'nut' may mean several things
> such as the offspring of a plant, nuts and bolts, and testicles.
>
> Is this easy to do? How do I do it?
I think the attribute lm="...." could be stretched a bit to have any 
value, which could be used to identify the lemma in another structure 
which could contain all of these (for instance, giving an XPath to 
another XML file containing all the desired information).
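
Just to sketch what I mean, and assuming a separate file whose layout 
is entirely made up here (the <lexinfo>, <lemma> and <sense> elements, 
their attributes, and the function senses_for_entry are hypothetical, 
and the paradigm name in the entry is only a placeholder), the lm 
value could serve as the lookup key:

```python
import xml.etree.ElementTree as ET

# Hypothetical standoff file; none of these element or attribute names
# exist in Apertium. The glosses echo the 'nut' example in the question.
STANDOFF = """
<lexinfo>
  <lemma id="nut">
    <sense id="nut-1" gloss="offspring of a plant"/>
    <sense id="nut-2" gloss="nuts and bolts"/>
  </lemma>
</lexinfo>
"""

# Dictionary entry whose lm attribute is used as the key.
ENTRY = '<e lm="nut"><i>nut</i><par n="placeholder__n"/></e>'


def senses_for_entry(entry_xml, standoff_root):
    """Look up the senses attached to an entry via its lm attribute."""
    key = ET.fromstring(entry_xml).get("lm")
    if key is None:  # lm is optional, so an entry may have no key
        return []
    node = standoff_root.find(f".//lemma[@id='{key}']")
    if node is None:
        return []
    return [(s.get("id"), s.get("gloss")) for s in node.findall("sense")]


root = ET.fromstring(STANDOFF.strip())
print(senses_for_entry(ENTRY, root))
```

The dictionaries themselves would stay untouched this way; only the 
external file grows as more relations are added.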

Perhaps it would be better to have some kind of new general purpose 
attribute that could be used to attach *standoff* information of this 
kind to any entry <e>.

Fran is working on lexical selection and I'm sure his opinion would be 
interesting to read!

All the best

Mikel
> best regards
> keld
>
> ------------------------------------------------------------------------------
> All of the data generated in your IT infrastructure is seriously valuable.
> Why? It contains a definitive record of application performance, security
> threats, fraudulent activity, and more. Splunk takes this data and makes
> sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-d2d-c2
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff


-- 
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326

