On Sat, 16 Jun 2012 at 16:48 +0200, Mikel Forcada wrote:
> Thanks Fran!
> >
> > http://www.lrec-conf.org/proceedings/lrec2012/pdf/1075_Paper.pdf
> Interesting! But see below for a critical view.
> >
> > 20 hours (very little time!) writing disambiguation rules gives
> > substantial improvements.
> I have added the reference to page 
> http://wiki.apertium.org/wiki/Constraint_Grammar (External Links).
> 
> I just want to call attention to the fact that some of the rules 
> used by these authors could be written in "canonical", CG3-free Apertium 
> as "forbid" rules in .tsx files.
> 
> For instance, the rule
> 
> REMOVE (DET) IF (1C (VFIN));
> 
> corresponds to forbid rules we use in .tsx files (see, e.g. 
> apertium-es-ca.es.tsx) such as:
> 
> <forbid>
> <!-- ... -->
> <label-sequence>
> <label-item label="DETM"/>
> <label-item label="VLEXPFCI"/>
> </label-sequence>
> <!-- ... -->
> </forbid>
> 
> We have also (historically) found that investing some time in .tsx 
> rules improves taggers measurably.
> > Might help us get around tagging errors like:
> >
> > $ echo "Avui no veig el sol." | apertium -d . ca-en-tagger
> > ^Avui<adv>$ ^no<adv>$ ^veure<vblex><pri><p1><sg>$ ^el<det><def><m><sg>$
> > ^sol<adj><m><sg>$^.<sent>$^.<sent>$
> Fran, what would be a reasonable "forbidding" rule here that repairs 
> this error but does not break things somewhere else?
> >
> > $ echo "Why does she do that?" | apertium -d . en-ca-tagger
> > ^Why<adv><itg>$ ^do<vbdo><pri><p3><sg>$ ^prpers<prn><subj><p3><f><sg>$
> > ^do<vbdo><pres>$ ^that<cnjsub>$^?<sent>$^.<sent>$
> 
> I think this could easily be dealt with in "pure", "canonical" Apertium 
> using a simple forbid rule in the .tsx file. The fact that booboos like 
> this one pass on to the transfer file is a clear indication that the 
> .tsx file in apertium-en-ca needs love, rather than justifying the need 
> for introducing a non-canonical CG3 module. I have also added a quick 
> section in http://wiki.apertium.org/wiki/Constraint_Grammar to that effect.

I'll reply in general to the post tomorrow, but wanted to highlight this
one quickly: 

Yes, the cnjsub + sent sequence could be made into a forbid/enforce rule,
but the pri + prn.subj + pres rule could not, since it spans three tags.

Also: I'm not arguing for the inclusion of Constraint Grammar (CG3) as
such, but rather for the inclusion of a rule-based disambiguation module
for Apertium (like the one in Hrvoje's project) to cover rules which
cannot be taken care of with bigrams.
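
To make the bigram limitation concrete, here is a rough sketch in Python. It is only an illustration and not how the tagger works internally: the real tagger is an HMM and forbid rules constrain its tag-to-tag transitions. The tag names are abbreviated by hand from the en-ca-tagger output quoted above, and the rule sets, the `violations` helper and the "They do." tagging are all hypothetical.

```python
# Simplified illustration only; not Apertium code. Tags are abbreviated
# by hand from the quoted en-ca-tagger output.

# Hypothetical forbid rule over adjacent tag pairs (bigrams).
FORBIDDEN_BIGRAMS = {("cnjsub", "sent")}

# Hypothetical rule that needs a window of three tags.
FORBIDDEN_TRIGRAMS = {("vbdo.pri", "prn.subj", "vbdo.pres")}

# A bigram stand-in for the trigram rule (assumption: the closest
# two-tag approximation one could write).
OVERBROAD_BIGRAMS = {("prn.subj", "vbdo.pres")}


def violations(tags, forbidden, n):
    """Return every forbidden n-gram of adjacent tags found in `tags`."""
    windows = [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]
    return [w for w in windows if w in forbidden]


# "Why does she do that?" as tagged in the quoted output.
wrong = ["adv.itg", "vbdo.pri", "prn.subj", "vbdo.pres", "cnjsub", "sent"]
# "They do." (assumed tagging), where pronoun + present "do" is correct.
fine = ["prn.subj", "vbdo.pres", "sent"]

print(violations(wrong, FORBIDDEN_BIGRAMS, 2))   # cnjsub + sent is caught
print(violations(wrong, FORBIDDEN_TRIGRAMS, 3))  # the three-tag error is caught
print(violations(fine, FORBIDDEN_TRIGRAMS, 3))   # the good sentence is untouched
print(violations(fine, OVERBROAD_BIGRAMS, 2))    # the bigram version rejects it
```

The last call shows the problem: squeezing the three-tag pattern into a bigram also forbids a perfectly good sequence, which is why a wider context is needed for that rule.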

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff