Yeah, Fran, there are at least hundreds (maybe thousands) of them in the
corpus...

I don't know how or where to file an issue for that. I was going to
create an issue in apertium-tat, but it is a global thing, not just for
Tatar...
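
To get a rough count on a given corpus, a small script along these lines
might help. This is only a minimal sketch, assuming Python 3 and its
standard library; it is not part of Apertium, and the function name and
sample text are hypothetical. It tallies characters whose Unicode general
category is a symbol (S*) or punctuation (P*) class, which is one possible
approximation of "symbols the tagger may not know".

```python
import unicodedata
from collections import Counter


def count_symbol_chars(text):
    """Count characters whose Unicode general category starts with S or P.

    Hypothetical helper for illustration only; Apertium itself may use a
    different definition of "symbol".
    """
    counts = Counter()
    for ch in text:
        category = unicodedata.category(ch)  # e.g. 'Sm', 'Po', 'Pc', 'Ll'
        if category[0] in ("S", "P"):
            counts[ch] += 1
    return counts


# Made-up sample text; in practice this would be read from a corpus file.
sample = "user_name@example.org: 50% ~ 10 | 20"
symbol_counts = count_symbol_chars(sample)

# List each symbol with its Unicode category and frequency.
for ch, n in sorted(symbol_counts.items()):
    print(repr(ch), unicodedata.category(ch), n)
```

Running this over a whole corpus and comparing the result with what the
analyser already covers would show how many distinct symbols are missing,
and so whether listing them by hand in a .dix file is realistic.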

On Tue, 13 Nov 2018 at 20:09, Francis Tyers <fty...@prompsit.com> wrote:

> On 2018-11-13 16:29, Kevin Brubeck Unhammer wrote:
> > mansur <6688...@gmail.com> wrote:
> >
> >> Hello!
> >>
> >> There are so many symbols that are not recognized by Apertium's tagger
> >> and not marked in any way. For example, apertium-tat does not
> >> recognize the following symbols:
> >> _ @ % ~ |
> >> and many others.
> >>
> >> Is it possible to use some special tag (^_/_<unknown>$) for such
> >> cases?
> >
> > Yes, just give them analyses in tat.dix, e.g.:
> >
> > <e><re>[_@%~|]</re><p><l/><r><s n="symb"/></r></p></e>
> >
> > (untested)
>
> I generally use <sym> for that, but there are a lot of Unicode symbols
> and it's impossible to list them all in the .dix file. There should be
> some kind of builtin for that, I think.
>
> Fran
>
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
