Yes, Mikel, you're right—I was thinking that too.  Adding in that morpheme
boundary should be trivial.
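
For illustration only, here is a minimal sketch of what can already be
pulled out of the segmenter output quoted below. It assumes that `>` in
that output is the morpheme boundary and that each lexical unit has a
single analysis with no escaped characters; the helper name
split_on_boundary is hypothetical and not part of Apertium or HFST.

```python
import re

# One lexical unit in the Apertium stream format: ^surface/analysis$
# Assumption: a single analysis and no escaped characters.
LEXICAL_UNIT = re.compile(r"\^([^/$]+)/([^/$]+)\$")


def split_on_boundary(line, boundary=">"):
    """Return (surface, pieces) for the first lexical unit in `line`.

    `pieces` is the analysis side split on the assumed boundary symbol.
    """
    match = LEXICAL_UNIT.search(line)
    if match is None:
        raise ValueError("no lexical unit found in input")
    surface, analysis = match.groups()
    return surface, analysis.split(boundary)


surface, pieces = split_on_boundary("^щеткадағы/щетка>{D}{A}{G}{I}$")
print(surface, pieces)
```

The second piece still comes back as {D}{A}{G}{I} rather than surface
letters, so post-processing like this does not give a pure segmentation;
the boundary has to survive on the surface side of the transducer.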

--
Jonathan

Wed, 6 Mar 2019 at 16:28, Mikel L. Forcada <m...@dlsi.ua.es>:

> Cool!
>
> Mikel
>
> P.S. Actually, why not щетка- -да- -ғы (щетка+locative+adjectivizer)?
>
> On 6/3/19 at 17:32, Francis Tyers wrote:
> > On 2019-03-06 14:40, Antonio Toral wrote:
> >> Dear apertiumers,
> >>
> >> I would like to do morph segmentation for Kazakh and I've seen that
> >> this is possible with Apertium [1].
> >>
> >> However, in the example shown in that webpage the output doesn't seem
> >> to be pure segmentation:
> >>
> >> $ echo "щеткадағы" | hfst-proc kaz.segmenter
> >> ^щеткадағы/щетка>{D}{A}{G}{I}$
> >>
> >> Is it possible to obtain segmentation instead? I.e.
> >> c
> >
> > Hi Antonio,
> >
> > Thanks for your email! :D
> >
> > You're right that it isn't pure segmentation. There is some good news
> > and some bad news.
> >
> > The good news is that getting the 'pure' segmentation is definitely
> > possible and without too much effort.
> >
> > Essentially the problem is that the way the
> > phonological rules are defined, some of them depend on 0 (empty) symbols
> > on the surface side of the string. The morpheme boundary currently always
> > goes to empty, so if we set it to not go to empty, then some of those
> > rules will break.
> >
> > Fixing that means editing the rules so that the relevant contexts ask
> > for 0 aside from the morpheme boundary on the surface. This shouldn't
> > take too long.
> >
> > The bad news is that it isn't done yet, but given that Kazakh is in
> > WMT this year, it is definitely something we are planning to
> > implement, hopefully in the next couple of days.
> >
> > Regards,
> >
> > Fran
> >
> >
> >
> >
> > _______________________________________________
> > Apertium-stuff mailing list
> > Apertium-stuff@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
> --
> Mikel L. Forcada  http://www.dlsi.ua.es/~mlf/
> Departament de Llenguatges i Sistemes Informàtics
> Universitat d'Alacant
> E-03690 Sant Vicent del Raspeig
> Spain
> Office: +34 96 590 9776
>
