Re: [Apertium-stuff] Apertium's Wider Use & Secondary Tags

Jonathan Washington Sat, 13 Jun 2020 15:19:07 -0700

On Sat, Jun 13, 2020, 16:05 Francis Tyers <fty...@prompsit.com> wrote:


> On 2020-06-13 19:31, Xavi Ivars wrote:
> > Before anything else, let me say that I like the proposal to enhance the
> > pipeline with more data (including, but not limited to, the surface
> > forms), so that we can properly do things that we are currently doing
> > in veeeery hacky (to me) and definitely non-linguistic ways.
> >
> >> xavi@dell:~/src/apertium-spa$ echo "El mango" | apertium -d . spa-morph
> >> ^El/el<det><def><m><sg>$ ^mango/mango<n><m><sg>/mangar<vblex><pri><p1><sg>/MANGO_FRUTA<N><M><SG>$^./.<sent>$
> >
> > In this example, we "add" semantic information to the pipeline (and
> > disambiguate via CG3) by creating a "fake lemma" needed for SPA-CAT,
> > because "mango<n>" (pan handle) and "mango_fruta<n>" (the fruit) are
> > translated differently in Catalan. But this, in turn, forces every other
> > language pair using Spanish to know about "mango_fruta<n>", even if the
> > translation is the same as for "mango<n>".
> >
>
> What is the problem here? That "mango" has two possible lemmas and
> paradigms in Spanish?
>
> The way that I've treated that is to have mango¹ and mango², like in a
> traditional dictionary. I don't think that this requires any further
> information.
>

I think Xavi's point is that there are a number of ways to approach this,
and having the option of another stream in which to put this extra
information could be one of them. IMHO, it is nicer in many ways than having
(very arbitrary) superscripts, which aren't really any better to have in a
morphological analysis than _fruta.
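
To make the idea concrete, here is a rough Python sketch (not anything that exists in Apertium) of what pulling the sense label out of the lemma and carrying it separately might look like. It reads the analyser output quoted above and splits each analysis into a lemma, an optional sense label, and the tags. The function names and the "everything after the first underscore is a sense label" convention are assumptions made purely for illustration, and the simple regex ignores escaped characters in the stream format.

```python
import re

# One lexical unit of the stream format: ^surface/analysis/analysis...$
# (simplified: escaped characters such as \/ or \^ are not handled)
LU_RE = re.compile(r"\^(.*?)\$")
TAG_RE = re.compile(r"<([^<>]+)>")


def parse_stream(text):
    """Return a list of (surface, [analysis, ...]) pairs, one per lexical unit."""
    units = []
    for body in LU_RE.findall(text):
        surface, *analyses = body.split("/")
        units.append((surface, analyses))
    return units


def split_sense(analysis):
    """Split one analysis into (lemma, sense, tags).

    Hypothetical convention: a "fake lemma" such as mango_fruta carries its
    sense label after the first underscore; sense is None when there is none.
    """
    lemma = analysis.split("<", 1)[0]
    tags = TAG_RE.findall(analysis)
    base, sep, sense = lemma.partition("_")
    return base, (sense if sep else None), tags


if __name__ == "__main__":
    line = ("^El/el<det><def><m><sg>$ "
            "^mango/mango<n><m><sg>/mangar<vblex><pri><p1><sg>"
            "/MANGO_FRUTA<N><M><SG>$^./.<sent>$")
    for surface, analyses in parse_stream(line):
        for analysis in analyses:
            print(surface, split_sense(analysis))
```

With something like this, a pair that does not care about the distinction could simply drop the sense field, while SPA-CAT could keep using it.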

--
Jonathan

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
