Dear Apertiumers,

My friend Àngel Calpe (copied) is trying to lemmatize pre-normative 
Valencian texts using a modified version of apertium-cat (its morphological 
analyser and tagger). The texts are encoded in TEI [1], an XML 
application.

He would like each analysed word (in particular those marked with 
<emph>...</emph>) to be wrapped in an element that carries the lemma as 
an attribute. I am not sure whether the lemma should be an attribute of 
<emph> itself or of an additional enclosing element, but that is probably 
a detail.
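
To make the request concrete, here is a minimal sketch of the kind of 
output we have in mind. It is only an illustration under assumptions: the 
lemma table is a toy stand-in for the modified apertium-cat analyser and 
tagger, the function names are made up, the TEI namespace is left out, and 
the choice of a <w> element with a lemma attribute inside <emph> is just 
one of the two options mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the analyser + tagger: a real setup would query
# the modified apertium-cat pipeline instead of this toy table.
TOY_LEMMAS = {"cases": "casa", "velles": "vell"}


def lemma_of(word):
    """Return the lemma for a word, or the word itself when it is unknown."""
    return TOY_LEMMAS.get(word.lower(), word)


def annotate_emph(tei_fragment):
    """Wrap the word inside each <emph> in a <w> element carrying its lemma.

    Assumptions: no XML namespace, and only the simple case of an <emph>
    holding exactly one word and no child elements is handled.
    """
    root = ET.fromstring(tei_fragment)
    for emph in root.iter("emph"):
        text = (emph.text or "").strip()
        if text and " " not in text and len(emph) == 0:
            emph.text = None  # the word moves into the new <w> child
            w = ET.SubElement(emph, "w", {"lemma": lemma_of(text)})
            w.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(annotate_emph("<p>Les <emph>cases</emph> eren <emph>velles</emph>.</p>"))
```

Putting the lemma directly on <emph> instead would only mean replacing the 
three lines that build <w> with a single emph.set("lemma", ...) call.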

Can you think of an easy hack that could be used to do this? Do we have 
anything that we could repurpose for that?

Thanks a million

All the best,

Mikel


[1] http://www.tei-c.org/

-- 
Mikel L. Forcada  http://www.dlsi.ua.es/~mlf/
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03690 Sant Vicent del Raspeig
Spain
Office: +34 96 590 9776


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
