On Tue, Jul 05, 2011 at 02:02:24PM +0100, Jimmy O'Regan wrote:
> 2011/7/5 Keld Jørn Simonsen <[email protected]>:
> > Hi
> >
> > I would like to attach attributes to lemmas. Only a few, but maybe there
> > could be more, so some way of introducing an attribute name would be nice,
> > instead of having a predefined set of attribute names.
> >
> > I believe there are already lemma attributes, such as the word class of
> > the lemma: noun, verb, adjective, adverb etc.
> >
> > What I have in mind is to attach data from wordnet, such as sense,
> > hypernym, hyponym, holonym, meronym, and also combine it with the
> > Swedish SALDO attributes of father and mother relations.
> >
> > The idea is then to choose a sense of a homonym based on the shortest
> > distance to maybe the previous and following five words.
> >
> > A lemma may have more than one sense. E.g. 'nut' may mean several things,
> > such as the offspring of a plant, nuts and bolts, and testicles.
> >
> > Is this easy to do? How do I do it?
>
> The question you're not asking is 'is it worth doing?'

I think I have talked about some of these things before. I then did some
googling, and what I seem to remember is a reported improvement of 60 %.
IMHO worth investigating.

> Mixing Wordnet into MT was quite popular in the late 90s. The only
> mention I've seen of wordnet in relation to MT published in the past
> 10 years was a call for a round table discussion on the topic of 'we
> all know wordnet is not useful for MT, but is it because we're not
> using it right?'.
>
> Wordnet was mostly built by lexicographers and ontologists, and
> continues to be an excellent resource for those uses. Aside from that,
> in NLP, the only place I'm aware of its use is in Word Sense
> Disambiguation, and even there, it's mainly used as a standard index
> of senses, and not for its word relationships. And even there, it's
> starting to be overtaken by Wikipedia.
>
> (As I have a nasty habit of coming across as mocking, whether I mean
> to be or not, I'd like to add that the reason I know this is because I
> had similar ideas and wasted a few months on it.)

What did you do? Was it with Apertium? And did you produce code?
What were the meagre results?

Best regards
keld

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
