Yes, Mikel, you're right—I was thinking that too. Adding in that morpheme boundary should be trivial.
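In the meantime, here is a rough Python sketch (an illustration only, not part of Apertium or HFST) of how the segmenter output quoted in this thread could be picked apart. It assumes that `>` in the analysis marks a morpheme boundary, that `{D}`, `{A}` and so on are archiphonemes not yet realised on the surface, and that alternative analyses are separated by `/`. The function names are made up for this example.

```python
import re


def parse_segmenter_line(line):
    """Split one '^surface/analysis[/analysis...]$' unit.

    Returns the surface form and, for each analysis, the list of
    pieces between '>' symbols (assumed to be morpheme boundaries).
    """
    match = re.fullmatch(r"\^([^/]*)/(.*)\$", line.strip())
    if match is None:
        raise ValueError(f"not a '^surface/analysis$' unit: {line!r}")
    surface = match.group(1)
    # Assumption: multiple analyses, if any, are separated by '/'.
    analyses = [a.split(">") for a in match.group(2).split("/")]
    return surface, analyses


def archiphonemes(piece):
    """Return the names of the '{X}' symbols found in one piece."""
    return re.findall(r"\{([^}]*)\}", piece)


if __name__ == "__main__":
    surface, analyses = parse_segmenter_line("^щеткадағы/щетка>{D}{A}{G}{I}$")
    print(surface)
    for pieces in analyses:
        print(pieces, [archiphonemes(p) for p in pieces])
```

This only shows where the boundaries sit in the analysis string. It cannot recover the surface segmentation, because the suffix is still given as unrealised symbols rather than surface letters.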
--
Jonathan

On Wed, 6 Mar 2019 at 16:28, Mikel L. Forcada <m...@dlsi.ua.es> wrote:

> Cool!
>
> Mikel
>
> P.S. Actually, why not щетка- -да- -ғы (щетка+locative+adjectivizer)?
>
> On 6/3/19 at 17:32, Francis Tyers wrote:
> > On 2019-03-06 14:40, Antonio Toral wrote:
> >> Dear apertiumers,
> >>
> >> I would like to do morph segmentation for Kazakh and I've seen that
> >> this is possible with Apertium [1].
> >>
> >> However, in the example shown in that webpage the output doesn't seem
> >> to be pure segmentation:
> >>
> >> $ echo "щеткадағы" | hfst-proc kaz.segmenter
> >> ^щеткадағы/щетка>{D}{A}{G}{I}$
> >>
> >> Is it possible to obtain segmentation instead? I.e.
> >> c
> >
> > Hi Antonio,
> >
> > Thanks for your email! :D
> >
> > You're right that it isn't pure segmentation. There is some good news
> > and some bad news.
> >
> > The good news is that getting the 'pure' segmentation is definitely
> > possible and without too much effort.
> >
> > Essentially the problem is that the way the phonological rules are
> > defined, some of them depend on 0 (empty) symbols on the surface side
> > of the string. The morpheme boundary currently always goes to empty,
> > so if we set it to not go to empty, then some of those rules will
> > break.
> >
> > Fixing that means editing the rules to change the relevant contexts
> > to ask for 0 aside from the morpheme boundary on the surface. This
> > shouldn't take too long.
> >
> > The bad news is that it isn't done yet, but given the fact that
> > Kazakh is in WMT this year, it is definitely something we are
> > planning to implement. Hopefully in the next couple of days.
> >
> > Regards,
> >
> > Fran
> >
> > _______________________________________________
> > Apertium-stuff mailing list
> > Apertium-stuff@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
> --
> Mikel L. Forcada http://www.dlsi.ua.es/~mlf/
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03690 Sant Vicent del Raspeig
> Spain
> Office: +34 96 590 9776