Mireia,
I see. The problem is that "altre" should also be an adjective, so that
an adjective can follow a determiner. "l'altre", etc. could be
determiners. It makes sense: only one determiner per noun phrase (as in
most languages).
qualsevol altre: det adj
l'altre: det
el meu altre: det adj
How does that sound?
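
To make the "one determiner per noun phrase" idea checkable, here is a rough sketch in Python. It reads disambiguated tagger output in the stream format quoted further down (^lemma<tag><tag>$) and reports adjacent pairs whose first tags are det + det or prn + det. The names parse_tagged, suspect_sequences and SUSPECT_PAIRS are made up for illustration, and the choice of suspect pairs is an assumption based only on the examples in this thread, not a general rule of Catalan.

```python
import re

# One lexical unit of tagger output: ^lemma<tag1><tag2>...$
UNIT_RE = re.compile(r"\^([^<$]+)((?:<[^>]+>)+)\$")

# Assumption: these first-tag pairs are the ones worth reporting.
SUSPECT_PAIRS = {("det", "det"), ("prn", "det")}


def parse_tagged(stream):
    """Return a list of (lemma, [tags]) from disambiguated output."""
    units = []
    for lemma, tags in UNIT_RE.findall(stream):
        units.append((lemma, re.findall(r"<([^>]+)>", tags)))
    return units


def suspect_sequences(units, pairs=SUSPECT_PAIRS):
    """Return (lemma1, lemma2) for adjacent units whose first tags are suspect."""
    found = []
    for (lemma1, tags1), (lemma2, tags2) in zip(units, units[1:]):
        if (tags1[0], tags2[0]) in pairs:
            found.append((lemma1, lemma2))
    return found


if __name__ == "__main__":
    # Tagger output reported by Francis in the quoted message.
    line = ("^o<cnjcoo>$ ^qualsevol<prn><tn><mf><sg>$ ^altre<det><ind><m><sg>$ "
            "^traductor<n><m><sg>$ ^automàtic<adj><m><sg>$^.<sent>$")
    print(suspect_sequences(parse_tagged(line)))
```

With the proposed analysis (qualsevol as det, altre as adj) the same sentence would produce no report.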
Making decisions about parts of speech is not easy, but it is worth the
effort. For instance, I remember the turmoil when in interNOSTRUM we
decided that "como" would not be a "conjunction" but rather a
"preposition", as "like" is in the Oxford Advanced Learner's Dictionary.
Calling it a conjunction didn't help much, but calling it a preposition
was important when tagging and deciding its translation (como can be
prep, rel.adv and a verb).
For instance, I don't know what we decided about "fa" in "fa tres anys
que vinc" ("I come since three years ago"). One could say it is the verb
"fer" (to do), as is the case with "enguany Mikel fa 48 anys" ("Mikel
turns 48 this year"), and that "que vinc" (that I come) is the subject, but
what about "va venir fa dos anys" ("he came two years ago") or "canta des
de fa dos anys" ("he sings since two years ago") or "fins fa dos anys"
("up to two years ago")? A personal verb cannot follow a preposition, so
I think "des de fa" and "fins fa" were christened multiword prepositions.
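
A small sketch of what treating "des de fa" and "fins fa" as multiword prepositions amounts to, assuming a greedy longest-match pass over the tokens. In Apertium the multiwords would be dictionary entries handled by the analyser; the MULTIWORDS table and merge_multiwords below are hypothetical and only illustrate the effect. Words that are not part of a multiword get None as their tag here, since the sketch does no morphological analysis of its own.

```python
# Hypothetical multiword table; "pr" is used here as the preposition tag.
MULTIWORDS = {
    ("des", "de", "fa"): "pr",
    ("fins", "fa"): "pr",
}


def merge_multiwords(tokens, table=MULTIWORDS):
    """Greedily join the longest known multiword starting at each position."""
    max_len = max(len(key) for key in table)
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in table:
                merged.append((" ".join(tokens[i:i + n]), table[key]))
                i += n
                break
        else:
            # No multiword starts here: keep the single word, untagged.
            merged.append((tokens[i], None))
            i += 1
    return merged


if __name__ == "__main__":
    for sentence in ("canta des de fa dos anys",
                     "fins fa dos anys",
                     "va venir fa dos anys"):
        print(merge_multiwords(sentence.split()))
```

In "va venir fa dos anys" nothing is merged, so "fa" is left as a single word and its part of speech still has to be decided separately.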
Cheers
Mikel
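
P.S. On Jimmy's remark further down that a forbid rule just inserts a number approaching 0 as the probability of a tag bigram: a toy sketch of that mechanism in Python, under loud assumptions. The transition table TOY and its numbers are invented, emission probabilities are left out, and only two hand-picked tag paths are scored instead of searching for the best path as a real tagger would. The functions forbid and path_score are hypothetical and are not taken from the Apertium tagger, so this does not settle whether the proposed FORBID rules would fix the sentence.

```python
def forbid(trans, prev_tag, next_tag, eps=1e-6):
    """Return a copy of trans with P(next_tag | prev_tag) pushed close to 0."""
    row = dict(trans[prev_tag])
    row[next_tag] = eps
    total = sum(row.values())
    updated = dict(trans)
    # Renormalise so the row is still a probability distribution.
    updated[prev_tag] = {tag: p / total for tag, p in row.items()}
    return updated


def path_score(trans, tags):
    """Product of the transition probabilities along one tag path."""
    score = 1.0
    for prev_tag, next_tag in zip(tags, tags[1:]):
        score *= trans[prev_tag].get(next_tag, 0.0)
    return score


# Invented transition probabilities P(next | previous), for illustration only.
TOY = {
    "cnjcoo": {"prn": 0.6, "det": 0.4},
    "prn": {"det": 0.7, "adj": 0.3},
    "det": {"adj": 0.3, "n": 0.7},
    "adj": {"n": 1.0},
}

PRN_DET = ["cnjcoo", "prn", "det", "n"]  # the reading the tagger chose
DET_ADJ = ["cnjcoo", "det", "adj", "n"]  # the reading that translates well

if __name__ == "__main__":
    forbidden = forbid(TOY, "prn", "det")
    for name, table in (("before", TOY), ("after", forbidden)):
        print(name, path_score(table, PRN_DET), path_score(table, DET_ADJ))
```

With these invented numbers the prn + det path scores higher before the forbid and lower after it, which is all the sketch is meant to show.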
On 01/27/2011 05:11 PM, Mireia wrote:
> Hi,
> On Thu, 27 Jan 2011 at 08:03 +0100, Mikel Forcada wrote:
>> Alternative: Instead of fiddling with "qualsevol"'s part of speech,
>> there is another solution. If the problem is only with
>>
>> "qualsevol altre"
>> "qualsevol altra"
>> "qualssevol altres"
>>
>> Just add them as multiwords and name them "det" as we do with "el meu",
>> "la meva", etc. How does that sound?
> It sounds good. But there are some other cases:
> echo "Els meus altres projectes. El meu altre projecte" | apertium ca-es
> Mis otros proyectos
> El mío otro proyecto
>
> apertium -d . ca-es-anmor
> ^Els meus/El meu<det><pos><m><pl>/El meu<prn><tn><pos><m><pl>$
> ^altres/altre<adj><ind><mf><pl>/altre<det><ind><mf><pl>$
> ^projectes/projecte<n><m><pl>/projectar<vblex><pri><p2><sg>/projectar<vblex><prs><p2><sg>$
> ^El meu/El meu<det><pos><m><sg>/El meu<prn><tn><pos><m><sg>$
> ^altre/altre<adj><ind><m><sg>/altre<det><ind><m><sg>$
> ^projecte/projecte<n><m><sg>/projectar<vblex><pri><p1><sg>/projectar<vblex><prs><p3><sg>/projectar<vblex><prs><p1><sg>/projectar<vblex><imp><p3><sg>$^./.<sent>$
>
> apertium -d . ca-es-tagger
> ^El meu<det><pos><m><pl>$ ^altre<adj><ind><mf><pl>$ ^projecte<n><m><pl>$
> ^El meu<prn><tn><pos><m><sg>$ ^altre<det><ind><m><sg>$
> ^projecte<n><m><sg>$^.<sent>$
>
>
> In some cases the tagger chooses the combination det + adj.ind, whereas
> in others it chooses prn + det.ind. The first one works in the
> translation into Spanish, the second one doesn't.
>
> Another example:
> apertium ca-es
> cap altre dia
> ninguno otro día
>
> It seems the problem arises only when "altre" appears as a second determiner.
> The dictionary already has some combinations with "altre": "un altre"
> and "molts altres" are multiwords, with two entries, one for pronoun and
> the other for determiner.
> So maybe it would be good to enter "el meu altre" (det), "qualsevol
> altre" (det and prn) and "cap altre" (det and prn), with all the
> variations:
>
> "qualsevol altre"
> "qualsevol altra"
> "qualssevol altres"
>
> "el meu altre" - the only one with tagger error
> "la meva altra" (no tagging error)
> "els meus altres" (no tagging error)
> "les meves altres" (no tagging error)
>
> "cap altre"
> "cap altra" (no tagging error)
>
> What do you think?
>
>
>
>> By the way, whatever is done, it should be done the same way for Spanish
>> "cualquier otro", "cualesquier otros", etc.
>>
>> Mikel
>>
>>
>> On 01/26/2011 10:46 PM, Jimmy O'Regan wrote:
>>> On 26 January 2011 21:23, Francis Tyers<[email protected]> wrote:
>>>> On Wed, 26 Jan 2011 at 21:19 +0000, Jimmy O'Regan wrote:
>>>>> On 26 January 2011 11:59, Francis Tyers<[email protected]> wrote:
>>>>>> Hey all,
>>>>>>
>>>>>> Translating some text from Catalan to Spanish I get a tagging error:
>>>>>>
>>>>>> --
>>>>>>
>>>>>> o qualsevol altre traductor automàtic
>>>>>>
>>>>>> $ echo "o qualsevol altre traductor automàtic" | apertium -d .
>>>>>> ca-es-anmor
>>>>>> ^o/o<cnjcoo>$
>>>>>> ^qualsevol/qualsevol<adj><mf><sg>/qualsevol<prn><tn><mf><sg>/qualsevol<det><ind><mf><sg>$
>>>>>> ^altre/altre<adj><ind><m><sg>/altre<det><ind><m><sg>$
>>>>>> ^traductor/traductor<n><m><sg>$
>>>>>> ^automàtic/automàtic<adj><m><sg>$^./.<sent>$
>>>>>>
>>>>>> $ echo "o qualsevol altre traductor automàtic" | apertium -d .
>>>>>> ca-es-tagger
>>>>>> ^o<cnjcoo>$ ^qualsevol<prn><tn><mf><sg>$ ^altre<det><ind><m><sg>$
>>>>>> ^traductor<n><m><sg>$ ^automàtic<adj><m><sg>$^.<sent>$
>>>>>>
>>>>>> o cualquiera otro traductor automático
>>>>>>
>>>>>> --
>>>>>>
>>>>>> I think here it should choose 'qualsevol' (determiner) as opposed to the
>>>>>> pronoun. But it could also be that I have an error in my Catalan. Could
>>>>>> someone who knows Catalan/Spanish well check this out ?
>>>>>>
>>>>>> A couple of rules might be
>>>>>>
>>>>>> FORBID prn.tn + adj.ind
>>>>>> FORBID prn.tn + det.ind
>>>>>>
>>>>> Can't work. The forbid rules are not rules, per se, they just insert a
>>>>> number approaching 0 as the probability of that bigram (which is
>>>>> P(w2|w1), while you're talking about P(w1|w2)... FWIW, in the cs-pl
>>>>> draft, I'd put something along the lines of 'the Markov assumption
>>>>> that a word can be disambiguated solely in terms of left context does
>>>>> not always hold true', but I was told that was a 'bold statement' and
>>>>> left it out).
>>>> Eckhard says stuff like that all the time, maybe you need to move to
>>>> Denmark ?
>>>>
>>> Ah... ok, now I see why it could sound 'bold'. No, in a bigram
>>> setting, P(w2|w1) is much more reasonable, and for languages like
>>> English trigrams based on P(w3|w1,w2) are fairly reasonable too, but
>>> for Czech (etc.) P(w2|w1,w3) is much better (there are many
>>> situations, especially with soft-stemmed adjectives, where the
>>> following word is often the only disambiguating context). Hunpos, btw,
>>> is configurable for either.
>>>
>>>> Also, what do you think of adding 'qualsevol' as a predet ?
>>>>
>>> Seems reasonable. I'm relatively sure that would not be a new
>>> ambiguity class, but it'd be worth checking.
>>>
>>
>
>
> ------------------------------------------------------------------------------
> Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
> Finally, a world-class log management solution at an even better price-free!
> Download using promo code Free_Logger_4_Dev2Dev. Offer expires
> February 28th, so secure your free ArcSight Logger TODAY!
> http://p.sf.net/sfu/arcsight-sfd2d
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326