Re: [Apertium-stuff] tagging error es-ca

Mikel Forcada Thu, 27 Jan 2011 00:47:12 -0800

Alternative: instead of fiddling with the part of speech of "qualsevol", 
there is another solution. If the problem only arises with

"qualsevol altre"
"qualsevol altra"
"qualssevol altres"

just add them as multiwords and tag them "det", as we do with "el meu", 
"la meva", etc. How does that sound?

By the way, whatever is done, it should be done the same way for Spanish 
"cualquier otro", "cualesquier otros", etc.

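A minimal sketch of the multiword idea, in Python rather than in the actual
dictionary format: if the lookup prefers the longest matching entry, then
"qualsevol altre" comes out as one unit with a single "det" reading, and the
tagger no longer has to choose between prn and det for "qualsevol". The
single-word readings are the ones from the ca-es-anmor output quoted below;
the multiword entries, their tags, and the names LEXICON and analyse are
assumptions made up for illustration.

```python
# Toy lexicon: surface form -> candidate tag strings.
# Single-word readings are copied from the analyser output in this thread;
# the three multiword entries and their tags are hypothetical.
LEXICON = {
    "o": ["cnjcoo"],
    "qualsevol": ["adj.mf.sg", "prn.tn.mf.sg", "det.ind.mf.sg"],
    "altre": ["adj.ind.m.sg", "det.ind.m.sg"],
    "traductor": ["n.m.sg"],
    "automàtic": ["adj.m.sg"],
    "qualsevol altre": ["det.ind.m.sg"],
    "qualsevol altra": ["det.ind.f.sg"],
    "qualssevol altres": ["det.ind.mf.pl"],
}


def analyse(words, max_len=2):
    """Greedy longest-match lookup: multiword entries win over single words."""
    units = []
    i = 0
    while i < len(words):
        # Try the longest span first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            form = " ".join(words[i:i + n])
            if form in LEXICON:
                units.append((form, LEXICON[form]))
                i += n
                break
        else:
            # No entry of any length starts here: mark the word as unknown.
            units.append((words[i], ["unknown"]))
            i += 1
    return units


units = analyse("o qualsevol altre traductor automàtic".split())
for form, tags in units:
    print(form, tags)
```

With these toy entries every unit in the example sentence has exactly one
reading, so nothing is left for the tagger to decide.
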
Mikel


On 01/26/2011 10:46 PM, Jimmy O'Regan wrote:
> On 26 January 2011 21:23, Francis Tyers<[email protected]>  wrote:
>> El dc 26 de 01 de 2011 a les 21:19 +0000, en/na Jimmy O'Regan va
>> escriure:
>>> On 26 January 2011 11:59, Francis Tyers<[email protected]>  wrote:
>>>> Hey all,
>>>>
>>>> Translating some text from Catalan to Spanish I get a tagging error:
>>>>
>>>> --
>>>>
>>>> o qualsevol altre traductor automàtic
>>>>
>>>> $ echo "o qualsevol altre traductor automàtic" | apertium -d . ca-es-anmor
>>>> ^o/o<cnjcoo>$
>>>> ^qualsevol/qualsevol<adj><mf><sg>/qualsevol<prn><tn><mf><sg>/qualsevol<det><ind><mf><sg>$
>>>>  ^altre/altre<adj><ind><m><sg>/altre<det><ind><m><sg>$ 
>>>> ^traductor/traductor<n><m><sg>$ 
>>>> ^automàtic/automàtic<adj><m><sg>$^./.<sent>$
>>>>
>>>> $ echo "o qualsevol altre traductor automàtic" | apertium -d . ca-es-tagger
>>>> ^o<cnjcoo>$ ^qualsevol<prn><tn><mf><sg>$ ^altre<det><ind><m><sg>$
>>>> ^traductor<n><m><sg>$ ^automàtic<adj><m><sg>$^.<sent>$
>>>>
>>>> o cualquiera otro traductor automático
>>>>
>>>> --
>>>>
>>>> I think here it should choose 'qualsevol' (determiner) as opposed to the
>>>> pronoun. But it could also be that I have an error in my Catalan. Could
>>>> someone who knows Catalan/Spanish well check this out ?
>>>>
>>>> A couple of rules might be
>>>>
>>>>   FORBID prn.tn + adj.ind
>>>>   FORBID prn.tn + det.ind
>>>>
>>> Can't work. The forbid rules are not rules, per se, they just insert a
>>> number approaching 0 as the probability of that bigram (which is
>>> P(w2|w1), while you're talking about P(w1|w2)... FWIW, in the cs-pl
>>> draft, I'd put something along the lines of 'the Markov assumption
>>> that a word can be disambiguated solely in terms of left context does
>>> not always hold true', but I was told that was a 'bold statement' and
>>> left it out).
>> Eckhard says stuff like that all the time, maybe you need to move to
>> Denmark ?
>>
> Ah... ok, now I see why it could sound 'bold'. No, in a bigram
> setting, P(w2|w1) is much more reasonable, and for languages like
> English trigrams based on P(w3|w1,w2) are fairly reasonable too, but
> for Czech (etc.) P(w2|w1,w3) is much better (there are many
> situations, especially with soft-stemmed adjectives, where the
> following word is often the only disambiguating context). Hunpos, btw,
> is configurable for either.
>
>> Also, what do you think of adding 'qualsevol' as a predet ?
>>
> Seems reasonable. I'm relatively sure that would not be a new
> ambiguity class, but it'd be worth checking.
>

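On the FORBID point quoted above, a rough sketch of the mechanism as Jimmy
describes it: the rule does not delete anything, it just replaces the
probability of one tag bigram with a number approaching 0. All probabilities
here are invented, emission probabilities are left out, and EPSILON, forbid
and sequence_probability are hypothetical names, not the tagger's own. This
only illustrates the mechanism; it is not a claim about what the real tagger
would choose for this sentence.

```python
from math import prod

# Stand-in for "a number approaching 0"; the real value is an assumption.
EPSILON = 1e-10

# Invented tag-bigram probabilities P(t2 | t1), for illustration only.
transitions = {
    ("cnjcoo", "prn.tn"): 0.3,
    ("cnjcoo", "det.ind"): 0.3,
    ("prn.tn", "det.ind"): 0.2,
    ("det.ind", "det.ind"): 0.1,
    ("det.ind", "n"): 0.6,
}


def forbid(table, t1, t2):
    """Return a copy of the table in which the bigram t1 -> t2 is near-impossible."""
    updated = dict(table)
    updated[(t1, t2)] = EPSILON
    return updated


def sequence_probability(table, tags):
    """Product of the transition probabilities along a tag sequence."""
    return prod(table.get(pair, EPSILON) for pair in zip(tags, tags[1:]))


# Two candidate readings of "o qualsevol altre traductor".
with_prn = ["cnjcoo", "prn.tn", "det.ind", "n"]
with_det = ["cnjcoo", "det.ind", "det.ind", "n"]

forbidden = forbid(transitions, "prn.tn", "det.ind")
for name, table in (("before", transitions), ("after", forbidden)):
    print(name,
          sequence_probability(table, with_prn),
          sequence_probability(table, with_det))
```
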

-- 
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326


