Hi again,
Obviously, CG would be quite helpful for disambiguation when doing
lemmatisation. Would it be complicated to add an option to use CG (if
present)? Using the CG rules for the language would probably remove some
more ambiguity.

I was looking at the page http://wiki.apertium.org/wiki/Lemmatisation
and wonder what this command actually does:

     $ echo "Den här är en test." | apertium -d . swe-tagger \
         | cg-proc guesser.bin | sed 's/<[^>]\+>//g' \
         | cg-proc -n guesser.bin

     Will give lemmatised output where the tokens are encased in ^ and
     $, and ambiguous stems/lemmas are given separated by '/' 
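
My guess is that the sed step in the middle only strips the
morphological tags (everything in angle brackets) and leaves the bare
lemmas between ^ and $. A minimal sketch to check that reading, assuming
GNU sed (\+ is a GNU extension); strip_tags is a name I made up, and the
input line is an invented example, not real output from swe-tagger:

```shell
# Hypothetical helper: the sed step from the wiki command, in isolation.
# It deletes every <...> tag and leaves the rest of the stream untouched.
strip_tags() {
    sed 's/<[^>]\+>//g'
}

# Invented stream-format line for "kon"; the tag names are illustrative only.
printf '%s\n' '^kon/kon<n><ut><sg><ind>/ko<n><ut><sg><def>$' | strip_tags
```

With GNU sed this prints ^kon/kon/ko$, i.e. the surface form followed by
the two candidate lemmas, which matches the description above.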

Yours,
Per Tunedal

On Fri, Mar 4, 2016, at 09:41, Kevin Brubeck Unhammer wrote:
> Per Tunedal <[email protected]> wrote:
> 
> > 'ta en blå kon' (= take a blue cone) to Danish. 'kon' might be the
> > indefinite form of 'kon' (= cone) or the definite form of 'ko' (= the
> > cow). We have:
> >
> >  (kon→ kon<n>/ko<n>)
> >
> > Translating the whole sentence would give us:
> >
> > tag en blå kegle / tag en blå koen (= take a blue cone / take a blue the
> > cow)
> >
> > Wouldn't that be quite revealing in many cases? In this case e.g. a
> > statistical language model could easily separate the wheat from the
> > chaff.
> 
> That example argues against your point – here the source language has
> two analyses of "kon", with different ind/def taggings (as it should).
> 
> This is not a lexical selection problem, but a morphological
> disambiguation problem.
> 
> It took me all of five minutes to write a CG rule to select indefinite
> for nouns after indefinite determiners:
> 
> LIST IndA = (adj ind) (adj comp) ;
> SET NotIndA = (*) - IndA ;
> REMOVE:en-blå-kon N + Def IF (0 N + Ind) (*-1 Det + Ind CBARRIER NotIndA)
> ;
> 
> and a quick corpus diff seems to show it generalises well:
> 
> http://sprunge.us/hhbf?diff
> 
> -- 
> Kevin Brubeck Unhammer
> 
> GPG: 0x766AC60C
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
