Dear Apertiumers,

My friend Àngel Calpe (copied) is trying to lemmatize pre-normative Valencian texts using a modified version of apertium-cat (morphological analyser and tagger). The texts are encoded in TEI [1], an XML application.
He would like the analysed words (in particular those marked with <emph>...</emph>) to be wrapped in an element that carries the lemma as an attribute. I am not sure whether the lemma should be an attribute of <emph> itself or of an additional enclosing element, but that is probably a detail.

Can you think of an easy hack that could be used to do this? Do we have anything that we could repurpose for that?

Thanks a million.

All the best,
Mikel

[1] http://www.tei-c.org/

--
Mikel L. Forcada http://www.dlsi.ua.es/~mlf/
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03690 Sant Vicent del Raspeig
Spain
Office: +34 96 590 9776
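
P.S. To make the request concrete, here is a minimal Python sketch of the kind of transformation I have in mind. It is only an illustration under several assumptions: the `lemmatize` function and its toy dictionary are hypothetical stand-ins for the output of the modified apertium-cat analyser and tagger; the additional element is assumed to be a <w> element with a `lemma` attribute (which, as far as I know, TEI provides); and each <emph> is assumed to contain only plain text, with no nested elements.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the analyser: surface form -> lemma.
# In a real pipeline this would come from the modified apertium-cat.
TOY_LEMMAS = {"cases": "casa", "blanques": "blanc"}

WORD = re.compile(r"\w+")


def lemmatize(word):
    """Return a lemma for `word`; falls back to the lowercased form."""
    return TOY_LEMMAS.get(word.lower(), word.lower())


def wrap_words(emph, lemma_of=lemmatize):
    """Wrap each word of the text directly inside `emph` in <w lemma="...">.

    Assumes `emph` holds only plain text (no child elements).
    """
    # Reuse the namespace of <emph>, if any, for the new <w> elements.
    ns = emph.tag[: -len("emph")]
    text = emph.text or ""
    emph.text = ""
    previous = None
    pos = 0
    for match in WORD.finditer(text):
        gap = text[pos:match.start()]  # spaces/punctuation before the word
        if previous is None:
            emph.text = gap
        else:
            previous.tail = gap
        w = ET.SubElement(emph, ns + "w", {"lemma": lemma_of(match.group())})
        w.text = match.group()
        previous = w
        pos = match.end()
    rest = text[pos:]  # whatever follows the last word
    if previous is None:
        emph.text = rest
    else:
        previous.tail = rest


def annotate(root, lemma_of=lemmatize):
    """Apply wrap_words to every <emph>, with or without a namespace."""
    for el in list(root.iter()):
        if isinstance(el.tag, str) and el.tag.rsplit("}", 1)[-1] == "emph":
            wrap_words(el, lemma_of)


sample = "<p>Les <emph>cases blanques</emph> del poble.</p>"
root = ET.fromstring(sample)
annotate(root)
result = ET.tostring(root, encoding="unicode")
print(result)
# <p>Les <emph><w lemma="casa">cases</w> <w lemma="blanc">blanques</w></emph> del poble.</p>
```

Putting the lemma on <emph> itself would be simpler, but it only works when each <emph> holds a single word, which is why the sketch uses one element per word.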
