On Sun, Jul 03, 2011 at 10:21:55PM +0100, Jimmy O'Regan wrote:
> 2011/7/3 Keld Jørn Simonsen <[email protected]>:
> > So that person actually understood what I meant the first time - good to
> > know that there is at least one person (plus my mother) that understands
> > me - although the understanding may crumble over time.
>
> Context is wonderful. I did say it wouldn't be done in a hurry, and
> nobody else has expressed an interest in it since then. If you want to
> try yourself, take a look at TaggerWord::discardOnAmbiguity in
> tagger_word.cc, otherwise you'll have to continue waiting.

Yes, you said:

> Without retraining the tagger, there's no way to do that. There are
> preference rules, but those only filter on tags. I think it might be
> useful to extend the tagger to have a mechanism to make certain tag
> choices for specific lemmas, and not too difficult to implement, based
> on the existing preference rules, but it's not going to be done in a
> hurry.

I put the emphasis on "not too difficult to implement". What are your
thoughts on how to do it? Then I could have a look.

I was actually also thinking of some more complex things, and if they
would be almost as easy to implement, then I would go for the full monty.

My further ideas were:

- discardOnAmbiguity based on allowed grammatical rules
- discardOnAmbiguity based on number of appearances
- discardOnAmbiguity based on the shortest distance in a WordNet-like
  graph to the surrounding, say, 10 words

Best regards
keld
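
To make the lemma-specific mechanism quoted above a bit more concrete, here is a minimal sketch of what "certain tag choices for specific lemmas" could look like. It is only an illustration under stated assumptions: the Analysis struct, the LemmaPreferences table and the discardByLemma function are hypothetical names invented for this example, and none of them reflect the actual TaggerWord code in tagger_word.cc.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one analysis of a surface form.
// This is NOT the data structure used by Apertium's TaggerWord.
struct Analysis {
    std::string lemma;
    std::string tag;  // e.g. "n" or "vblex"
};

// Hypothetical rule table: lemma -> tag to prefer when that lemma is ambiguous.
using LemmaPreferences = std::map<std::string, std::string>;

// If the word is ambiguous and at least one analysis matches a
// lemma-specific preference, keep only the matching analyses.
// Otherwise return the analyses unchanged, so nothing is lost.
std::vector<Analysis> discardByLemma(const std::vector<Analysis>& analyses,
                                     const LemmaPreferences& prefs) {
    if (analyses.size() < 2) {
        return analyses;  // not ambiguous, nothing to discard
    }
    std::vector<Analysis> kept;
    for (const Analysis& a : analyses) {
        auto rule = prefs.find(a.lemma);
        if (rule != prefs.end() && rule->second == a.tag) {
            kept.push_back(a);
        }
    }
    if (kept.empty()) {
        return analyses;  // no rule applied, leave the ambiguity in place
    }
    return kept;
}
```

With a rule mapping the lemma "wind" to the tag "n", a word analysed as both ("wind", "n") and ("wind", "vblex") would be reduced to the noun reading, while a word with no matching rule keeps all of its analyses. The other ideas in the list (frequency counts, graph distance) would need a different scoring step in place of the simple table lookup.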
