Hi Antonio, I have something mostly working, but have a few questions about what specifically you're after.
I guess to start things out, for поезд<n><loc>:поезд>{D}{A}:поезде, do you prefer поез>де or поезд>де, or something else?

-- Jonathan

On Wed, 6 Mar 2019 at 11:33, Francis Tyers <fty...@prompsit.com> wrote:
> On 2019-03-06 14:40, Antonio Toral wrote:
> > Dear apertiumers,
> >
> > I would like to do morph segmentation for Kazakh and I've seen that
> > this is possible with Apertium [1].
> >
> > However, in the example shown in that webpage the output doesn't seem
> > to be pure segmentation:
> >
> > $ echo "щеткадағы" | hfst-proc kaz.segmenter
> > ^щеткадағы/щетка>{D}{A}{G}{I}$
> >
> > Is it possible to obtain segmentation instead? I.e.
> > щетка>дағы
>
> Hi Antonio,
>
> Thanks for your email! :D
>
> You're right that it isn't pure segmentation. There is some good news
> and some bad news.
>
> The good news is that getting the 'pure' segmentation is definitely
> possible, and without too much effort.
>
> Essentially the problem is that, the way the phonological rules are
> defined, some of them depend on 0 (empty) symbols on the surface side
> of the string. The morpheme boundary currently always goes to empty,
> so if we set it to not go to empty, then some of those rules will break.
>
> Fixing that means editing the rules to change the relevant contexts to
> ask for 0 aside from the morpheme boundary on the surface. This
> shouldn't take too long.
>
> The bad news is that it isn't done yet, but given the fact that
> Kazakh is in WMT this year, it is definitely something we are planning
> to implement. Hopefully in the next couple of days.
>
> Regards,
>
> Fran
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
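
To make the segmentation question concrete, here is a rough Python sketch of a naive post-processing approach, not what the segmenter itself does. It reads the `^surface/analysis$` output quoted above and cuts the surface form right after the stem. The helper names `parse_unit` and `naive_split` are made up for illustration. It assumes a single `>` boundary, a stem that surfaces unchanged, and no escaped characters in the stream.

```python
import re

# One lexical unit as printed by hfst-proc in the thread: ^surface/analysis[/analysis...]$
# Assumption: no backslash-escaped characters inside the unit.
LEXICAL_UNIT = re.compile(r"\^([^/$]+)/([^$]+)\$")


def parse_unit(line):
    """Return (surface, [analyses]) for the first lexical unit found in line."""
    match = LEXICAL_UNIT.search(line)
    if match is None:
        raise ValueError(f"no lexical unit found in {line!r}")
    surface, analyses = match.groups()
    return surface, analyses.split("/")


def naive_split(surface, analysis):
    """Hypothetical helper: mark the boundary after the stem on the surface form.

    Only handles the first '>' in the analysis, and only when the stem is an
    unchanged prefix of the surface form; otherwise the surface is returned as-is.
    """
    if ">" not in analysis:
        return surface
    stem = analysis.split(">", 1)[0]
    if not surface.startswith(stem):
        return surface
    return stem + ">" + surface[len(stem):]


surface, analyses = parse_unit("^щеткадағы/щетка>{D}{A}{G}{I}$")
print(naive_split(surface, analyses[0]))       # щетка>дағы
print(naive_split("поезде", "поезд>{D}{A}"))   # поезд>е
```

For щеткадағы this gives the щетка>дағы that Antonio asked for. For поезде it gives поезд>е, which is neither поез>де nor поезд>де, so a plain stem-prefix cut does not settle which side of the boundary the shared д belongs to.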