No dixes will be harmed during this procedure. Nobody has to touch any existing language files for this work to be incredibly useful. The proposal is to allow the stream to carry secondary information. This secondary information can come from anywhere, and will mostly be dynamic.
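To make the idea concrete, here is a minimal sketch of a stream unit that carries optional secondary information alongside its usual analysis, with the surface form as one such entry. It is not the real Apertium stream format or lttoolbox code: `LexicalUnit`, the `secondary` field, the `"surface"` key, and the dictionary-based `generate` lookup are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexicalUnit:
    """One unit in the stream: the usual analysis plus optional extras."""
    lemma: str
    tags: List[str]
    # Hypothetical container for secondary information; empty by default,
    # so existing data that never sets it keeps working unchanged.
    secondary: Dict[str, str] = field(default_factory=dict)


def generate(unit: LexicalUnit, translations: Dict[str, str]) -> str:
    """Return the translation if there is one, else fall back to the
    surface form carried as secondary information, else the bare lemma."""
    translation = translations.get(unit.lemma)
    if translation is not None:
        return translation
    return unit.secondary.get("surface", unit.lemma)


# Hypothetical toy data: "gato" has a translation, "perro" does not.
translations = {"gato": "cat"}
known = LexicalUnit("gato", ["n", "m", "pl"], {"surface": "gatos"})
unknown = LexicalUnit("perro", ["n", "m", "pl"], {"surface": "Perros"})

print(generate(known, translations))
print(generate(unknown, translations))
```

Because `secondary` defaults to empty, a unit built without it behaves exactly as before, which is the sense in which no existing files need to change.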
Initially, the tokeniser will dynamically append the surface form as secondary information to the output stream, which the generator will use if no translation was available. In the future, this can be used for so many things, including secondary information that authors write into the dixes.

-- Tino Didriksen

On Sun, 29 Mar 2020 at 07:07, Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Hi Tanmai,
>
> I am surprised by this proposal. It involves some very important changes
> that should be better justified. I don't quite understand when should one
> define the "optional secondary information" in addition to the current
> morphological fields. Will it be in the language module (apertium-xxx) or
> in each of the translation modules (apertium-xxx-yyy)? Part of the problem
> may be in the example. I can't imagine why information on case should be
> added to every English word (not much that, say, information about
> belonging, which is common for Turkic languages). Should this kind of
> unnecessary information for everybody, or almost everybody, will be found
> in every language pair using, say, English if someone for his or her
> specific purposes will like to add it? As far as I understand, for the
> given project it is needed to add the surface form of the word. This seems
> quite logical. Moreover, this information may be useful for e.g. lexical
> selection and structural transfer. But more than that seems to me too
> obscure.
>
> Best,
> Hèctor
>
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff