On Tue, Mar 7, 2023 at 6:07 AM Kevin Brubeck Unhammer <unham...@fsfe.org> wrote:
>
> Daniel Swanson
> <awesomeevildu...@gmail.com> wrote:
>
> > Greetings Apertiumers!
> >
> > This morning I set out to change the Ancient Hebrew analyzer from
> > Latin script to Hebrew script (a task I don't wish upon anyone) and in
> > the process produced a search-and-replace tool that understands the
> > structure of several of our source files:
> > https://github.com/mr-martian/apertium-grep
>
> Awesome!
>
> > This script could, without too much trouble, be expanded to cover the
> > rest of our source files, at which point I would like to propose that
> > we move towards greater standardization of our tagset:
> > https://wiki.apertium.org/wiki/List_of_symbols
> >
> > At minimum, I would like to deal with some of the duplicate tags, like
> > impf/imperf, rec/res, v/vblex, pass/pasv, etc.
>
> That would be great! I'll put in a vote for pasv right now.
>
> > My preference would be that we also consider splitting compound tags,
> > like the tense+mood (fti, fts, pii, pis) and maybe possessor and
> > subject tags (px1sg, s_1sg).
>
> It makes sense to split tense and mood, as well as number and person,
> but I doubt it can be done automatically – it will require careful
> changes to CG and transfer. Might make sense to try it on one language
> pair along with the maintainer and see how it goes.
>
> It would be very dangerous to turn <pxsg> into <px><sg> – that would
> break lots of CG and transfer rules and possibly lead to more complexity
> in tag matching since you now have to always check for the existence of
> <px> wherever you check for <sg> etc.

To be clear, I meant splitting <px1sg> into <px1><pxsg>.
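
As a rough illustration of that kind of split (a sketch only, not taken
from apertium-grep; the mapping table, the function name, and the example
lemma are hypothetical), a compound tag can be rewritten into its parts
while every other tag is left alone:

```python
import re

# Hypothetical mapping from a compound tag to its split form.
# Only the px1sg example from this thread is listed.
SPLIT = {
    "px1sg": ["px1", "pxsg"],
}

# Matches one Apertium-style tag such as <n> or <px1sg>.
TAG = re.compile(r"<([^<>]+)>")


def split_compound_tags(analysis: str) -> str:
    """Replace each known compound tag with its split tags."""

    def repl(match: re.Match) -> str:
        parts = SPLIT.get(match.group(1))
        if parts is None:
            # Unknown tags are passed through unchanged.
            return match.group(0)
        return "".join(f"<{part}>" for part in parts)

    return TAG.sub(repl, analysis)


print(split_compound_tags("house<n><sg><px1sg>"))
```

Because the possessor's number becomes <pxsg> rather than a bare <sg>,
a rule that tests for <sg> still only sees the noun's own number.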

One of my ideals for the tagset is that every tag be
position-independent, so that the only reason I need to care about
order is because of FST topology (and maybe not even then).
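
One way to picture position independence (again a hypothetical sketch;
the helper names and example analyses are assumptions, not existing
Apertium code) is that an analysis can be treated as an unordered set
of tags without losing information:

```python
import re

# Matches one Apertium-style tag such as <n> or <pxsg>.
TAG = re.compile(r"<([^<>]+)>")


def tag_set(analysis: str) -> frozenset:
    """Return the tags of an analysis as an unordered set."""
    return frozenset(TAG.findall(analysis))


def has_tags(analysis: str, *required: str) -> bool:
    """True if every required tag occurs somewhere in the analysis."""
    return set(required) <= tag_set(analysis)


# With <px1><pxsg>, tag order does not change what a check sees.
a = "house<n><pl><px1><pxsg>"
b = "house<n><px1><pxsg><pl>"
print(tag_set(a) == tag_set(b))
print(has_tags(a, "sg"))

# With <px><sg>, the same unordered check can no longer tell the
# noun's number from the possessor's number.
print(has_tags("house<n><pl><px><sg>", "sg"))
```

In the last case the check for <sg> succeeds on a plural noun, which is
the ambiguity that a position-dependent <px><sg> encoding would introduce.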

Daniel


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
