Hi Fran,
The problem here is that the tsx file already contains forbid rules for
DET + DET:
<label-sequence>
<label-item label="DETM"/>
<label-item label="DETM"/>
</label-sequence>
<label-sequence>
<label-item label="DETM"/>
<label-item label="DETF"/>
</label-sequence>
<label-sequence>
<label-item label="DETM"/>
<label-item label="DETMF"/>
</label-sequence>
<label-sequence>
<label-item label="DETF"/>
<label-item label="DETM"/>
</label-sequence>
(etc.)
That's why the tagger doesn't choose 'qualsevol' (determiner). Even if
you add the forbids you propose, I don't know if it will work, because
these existing forbids still rule out the DET + DET reading.
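
To make the effect of those rules concrete, here is a minimal Python sketch that reads a fragment like the one above and reports which label pairs it rules out. It contains only the four sequences quoted here (the real file has more, as the "(etc.)" indicates). The <forbid> wrapper element and the helper names forbidden_pairs and is_forbidden are assumptions made for illustration; they are not copied from the actual tsx file or from the tagger's code.

```python
import xml.etree.ElementTree as ET

# Only the four DET + DET sequences quoted in this mail; the real tsx file
# lists more. The <forbid> wrapper is assumed so the fragment parses.
FORBID_XML = """
<forbid>
  <label-sequence>
    <label-item label="DETM"/>
    <label-item label="DETM"/>
  </label-sequence>
  <label-sequence>
    <label-item label="DETM"/>
    <label-item label="DETF"/>
  </label-sequence>
  <label-sequence>
    <label-item label="DETM"/>
    <label-item label="DETMF"/>
  </label-sequence>
  <label-sequence>
    <label-item label="DETF"/>
    <label-item label="DETM"/>
  </label-sequence>
</forbid>
"""


def forbidden_pairs(xml_text):
    """Collect every two-label sequence in the fragment as a (first, second) tuple."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for seq in root.iter("label-sequence"):
        labels = [item.get("label") for item in seq.iter("label-item")]
        if len(labels) == 2:
            pairs.add(tuple(labels))
    return pairs


def is_forbidden(first, second, pairs):
    """True if the tag bigram (first, second) appears in the collected pairs."""
    return (first, second) in pairs


PAIRS = forbidden_pairs(FORBID_XML)

if __name__ == "__main__":
    for first, second in sorted(PAIRS):
        print(first, "+", second, "is forbidden")
```

A call such as is_forbidden("DETM", "DETF", PAIRS) only tells you what this fragment lists; to check the pair that matters for 'qualsevol altre' you would have to load the complete forbid section from the real file.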
On the other hand, in the tagger we don't have a different category for
det.ind or adj.ind, only categories for all adjs and dets,
differentiating between genders (ADJM, ADJF, ADJMF, DETM, etc.).
I once tried training the tagger with new categories for dets, because
I had a similar problem in es-en. I divided the determiners into two
categories (the ones like "qualsevol" that can precede another
determiner, and the others) and didn't take gender into account. The
results were not good :-(.
About this:
> FORBID prn.tn + adj.ind
> FORBID prn.tn + det.ind
>
> I can't think of any examples of this ... but maybe someone else can :)
>
"arribaré a ser com tu algun dia?" (prn.tn + det.ind; "will I get to be like you some day?")
"jo parlo amb ell cada dia" (prn.tn + det.ind; "I talk with him every day")
The combination prn.tn + adj.ind is more unusual, so you could add this
forbid, although some examples do exist:
"ha guanyat ell altre cop"
Mireia
On Wed, 26 Jan 2011 at 11:59 +0000, Francis Tyers wrote:
> Hey all,
>
> Translating some text from Catalan to Spanish I get a tagging error:
>
> --
>
> o qualsevol altre traductor automàtic
>
> $ echo "o qualsevol altre traductor automàtic" | apertium -d . ca-es-anmor
> ^o/o<cnjcoo>$
> ^qualsevol/qualsevol<adj><mf><sg>/qualsevol<prn><tn><mf><sg>/qualsevol<det><ind><mf><sg>$
> ^altre/altre<adj><ind><m><sg>/altre<det><ind><m><sg>$
> ^traductor/traductor<n><m><sg>$ ^automàtic/automàtic<adj><m><sg>$^./.<sent>$
>
> $ echo "o qualsevol altre traductor automàtic" | apertium -d . ca-es-tagger
> ^o<cnjcoo>$ ^qualsevol<prn><tn><mf><sg>$ ^altre<det><ind><m><sg>$
> ^traductor<n><m><sg>$ ^automàtic<adj><m><sg>$^.<sent>$
>
> o cualquiera otro traductor automático
>
> --
>
> I think here it should choose 'qualsevol' (determiner) as opposed to the
> pronoun. But it could also be that I have an error in my Catalan. Could
> someone who knows Catalan/Spanish well check this out ?
>
> A couple of rules might be
>
> FORBID prn.tn + adj.ind
> FORBID prn.tn + det.ind
>
> I can't think of any examples of this ... but maybe someone else can :)
>
> Thanks !
>
> Fran
>
>
> ------------------------------------------------------------------------------
> Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
> Finally, a world-class log management solution at an even better price-free!
> Download using promo code Free_Logger_4_Dev2Dev. Offer expires
> February 28th, so secure your free ArcSight Logger TODAY!
> http://p.sf.net/sfu/arcsight-sfd2d
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff