Hi,

Just a thought: couldn't this kind of rule just as well be implemented in the TSX-file that's used to train the tagger? In that case, retraining the tagger might do the trick as well.

Yours,
Per Tunedal

On Fri, Mar 4, 2016, at 09:41, Kevin Brubeck Unhammer wrote:
> Per Tunedal <[email protected]> wrote:
>
> > 'ta en blå kon' (= take a blue cone) to Danish. 'kon' might be the
> > indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
> > cow). We have:
> >
> > (kon → kon<n>/ko<n>)
> >
> > Translating the whole sentence would give us:
> >
> > tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
> > cow)
> >
> > Wouldn't that be quite revealing in many cases? In this case e.g. a
> > statistical language model could easily separate the wheat from the
> > chaff.
>
> That example argues against your point – here the source language has
> two analyses of "kon", with different ind/def taggings (as it should).
>
> This is not a lexical selection problem, but a morphological
> disambiguation problem.
>
> It took me all of five minutes to write a CG rule to select indefinite
> for nouns after indefinite determiners:
>
>     LIST IndA = (adj ind) (adj comp) ;
>     SET NotIndA = (*) - IndA ;
>     REMOVE:en-blå-kon N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA) ;
>
> and a quick corpus diff seems to show it generalises well:
>
> http://sprunge.us/hhbf?diff
>
> --
> Kevin Brubeck Unhammer
>
> GPG: 0x766AC60C
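
For anyone less used to CG, the REMOVE rule quoted above can be read roughly as the Python sketch below. It is only an illustration under stated assumptions, not how vislcg3 works internally: the reading representation, the tag names and the helper names (has, is_ind_a, ind_det_to_left, remove_def) are all made up for the example, and I assume the scan tests the Det + Ind target before it tests the barrier.

```python
# Rough Python reading of the quoted CG rule; an illustration, not vislcg3.
# A reading is (lemma, set of tags); a cohort is (surface form, list of readings).
# Tag names are illustrative, not copied from a real Apertium dictionary.

def has(reading, *tags):
    """True if the reading carries all of the given tags."""
    return set(tags) <= reading[1]

def is_ind_a(reading):
    # LIST IndA = (adj ind) (adj comp)
    return has(reading, "adj", "ind") or has(reading, "adj", "comp")

def ind_det_to_left(cohorts, i):
    # (*-1 Det + Ind CBARRIER NotIndA): scan leftwards from position i - 1.
    for _, readings in reversed(cohorts[:i]):
        # Assumption: the target is tested before the barrier.
        if any(has(r, "det", "ind") for r in readings):
            return True
        # Careful barrier: stop only if every reading is NotIndA.
        if all(not is_ind_a(r) for r in readings):
            return False
    return False

def remove_def(cohorts):
    out = []
    for i, (form, readings) in enumerate(cohorts):
        # (0 N + Ind): the cohort has at least one indefinite noun reading.
        if any(has(r, "n", "ind") for r in readings) and ind_det_to_left(cohorts, i):
            # REMOVE N + Def: drop the definite noun readings.
            readings = [r for r in readings if not has(r, "n", "def")]
        out.append((form, readings))
    return out

sentence = [
    ("ta", [("ta", {"vblex", "imp"})]),
    ("en", [("en", {"det", "ind"})]),
    ("blå", [("blå", {"adj", "ind"})]),
    ("kon", [("kon", {"n", "ind"}), ("ko", {"n", "def"})]),
]
result = remove_def(sentence)
```

With this input, only the 'kon' (cone) reading is left on the last word, because the adjective 'blå' is an IndA reading and so does not stop the leftward scan. A verb between the determiner and the noun would act as a barrier, and both readings would then be kept.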
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
