On Sun, Jul 03, 2011 at 10:21:55PM +0100, Jimmy O'Regan wrote:
> 2011/7/3 Keld Jørn Simonsen:
> > So that person actually understood what I meant the first time - good to
> > know that there is at least one person (plus my mother) that understands
> > me - although the understanding may crumble over time.
> 
> Context is wonderful. I did say it wouldn't be done in a hurry, and
> nobody else has expressed an interest in it since then. If you want to
> try yourself, take a look at TaggerWord::discardOnAmbiguity in
> tagger_word.cc, otherwise you'll have to continue waiting.

Yes, you said:

> Without retraining the tagger, there's no way to do that. There are
> preference rules, but those only filter on tags. I think it might be
> useful to extend the tagger to have a mechanism to make certain tag
> choices for specific lemmas, and not too difficult to implement, based
> on the existing preference rules, but it's not going to be done in a
> hurry.

I put emphasis on "not too difficult to implement". What are your
thoughts? Then I could have a look. I was actually thinking of some more
complex things also, and if they would be almost as easy to implement,
then I would go for the full monty.

My further ideas were:
- discardOnAmbiguity based on allowed grammatical rules
- discardOnAmbiguity based on number of appearances
- discardOnAmbiguity based on shortest distance in a WordNet-like graph
  for the surrounding, say, 10 words.
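
To make the second idea concrete, here is a minimal sketch of choosing
among ambiguous analyses by number of appearances. Everything in it is
an assumption for illustration: the Analysis struct, the Counts table
and pickMostFrequent are hypothetical names, not part of Apertium, and
the real TaggerWord::discardOnAmbiguity may work quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one analysis of an ambiguous word;
// this is not Apertium's TaggerWord.
struct Analysis {
    std::string lemma;
    std::string tags;
};

// Hypothetical corpus counts, keyed by lemma followed by its tags.
using Counts = std::unordered_map<std::string, int>;

// Return the index of the candidate that appears most often in counts.
// Candidates missing from the table count as 0; ties keep the earliest one.
std::size_t pickMostFrequent(const std::vector<Analysis>& candidates,
                             const Counts& counts) {
    assert(!candidates.empty());  // precondition: at least one analysis
    std::size_t best = 0;
    int bestCount = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const auto it = counts.find(candidates[i].lemma + candidates[i].tags);
        const int c = (it == counts.end()) ? 0 : it->second;
        if (c > bestCount) {
            bestCount = c;
            best = i;
        }
    }
    return best;
}
```

Keying the table on lemma plus tags means the same lookup could also
carry hand-written preferences for specific lemmas, simply by giving
the preferred analysis a larger count.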

Best regards
keld
