Hi

I am a computer scientist, not a linguist, so I am not sure I fully understand
the implications of the mother and father relations in the wordnet data, such
as which of them is the better one to use.
I also do not have links to the linguistics literature at hand.

I only understand them as relations. The idea is then: the shorter the
distance between two words, the more closely related they are.

My first go at an algorithm is then as follows. Say I have found a homonym.
Each candidate homonym has a place in the meaning tree via its father and
mother relations.
Each term above it in the tree is recorded together with its distance, in
links, from that specific homonym. If a term has been visited before, its link
count is lowered when the new link count is smaller, and the counts of all
terms above that node are lowered accordingly.
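
As a rough illustration of the bookkeeping described above, here is a minimal
Python sketch. The parents table and all term names are made-up toy data, not
taken from SALDO or any real wordnet, and ancestor_distances is a hypothetical
helper name. It walks upwards breadth-first, so the first time a term is
reached its link count is already the smallest one, and the step of lowering
counts afterwards should not be needed in this variant.

```python
from collections import deque

def ancestor_distances(start, parents):
    """Return {term: fewest links from start up to term}.

    Every mother and father link is followed, so all branches are covered.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, ()):
            if parent not in dist:  # first visit in a breadth-first walk is the shortest
                dist[parent] = dist[term] + 1
                queue.append(parent)
    return dist

# Hypothetical toy hierarchy: each term lists its mother and father terms.
PARENTS = {
    "bank_money": ["institution", "money"],
    "institution": ["organisation"],
    "organisation": ["entity"],
    "money": ["entity"],
    "entity": [],
}

print(ancestor_distances("bank_money", PARENTS))
```

In this toy data "entity" can be reached in three links through "institution"
but in two links through "money", and the table keeps the two.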

I then take a number of surrounding words, say the 5 preceding and the 5
following words. For each of these words I travel up the hierarchy along both
the father and the mother branches. We need to follow all branches, because
one branch may give a shorter path than a previous match, on the homonym side
as well as on the surrounding-word side. The shortest distance between a
specific homonym and a surrounding word is then the link count from the
homonym to a common term plus the link count from the surrounding word to
that common term, taking the common term that gives the smallest sum. It is
not OK to stop at the first match, as there may be shorter matches. (Hmm, I
think one could travel the two trees so as to always secure the shortest
match.)
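
To show why stopping at the first match can go wrong, here is a small Python
sketch under my own assumptions: the parents table is invented toy data, and
up_distances and link_distance are hypothetical names. It collects the upward
link counts from both sides and takes the common term with the smallest sum.

```python
from collections import deque

def up_distances(start, parents):
    """Fewest links from start up to each term above it (breadth-first)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, ()):
            if parent not in dist:
                dist[parent] = dist[term] + 1
                queue.append(parent)
    return dist

def link_distance(a, b, parents):
    """Smallest (links from a to a common term) + (links from b to it), or None."""
    da = up_distances(a, parents)
    db = up_distances(b, parents)
    common = da.keys() & db.keys()
    if not common:
        return None  # the two terms share no term above them
    return min(da[t] + db[t] for t in common)

# Invented toy data: "x" is the nearest common term seen from "a" (1 link),
# but it is 3 links away from "b". Going through "y" costs 2 + 1 links.
PARENTS = {
    "a": ["x", "p"],
    "p": ["y"],
    "b": ["y", "q"],
    "q": ["r"],
    "r": ["x"],
    "x": [],
    "y": [],
}

print(link_distance("a", "b", PARENTS))
```

Here the first common term met when climbing from "a" gives 4 links in total,
while the best common term gives 3, so all common terms have to be compared.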
 
Then add up the link counts for all surrounding words, and pick the homonym
that has the smallest total.
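
The selection step could be sketched in Python roughly as below. Everything
named here is an assumption of mine for illustration: context_window,
pick_sense and toy_distance are hypothetical names, the distance table is
invented, and the fixed penalty for words with no common term is a choice the
description above does not cover.

```python
UNRELATED = 20  # assumed penalty when there is no common term (my own choice)

# Invented link counts between candidate homonyms and surrounding words.
TOY_DISTANCE = {
    ("bank_money", "loan"): 2,
    ("bank_river", "loan"): 7,
    ("bank_river", "water"): 2,
}

def toy_distance(sense, word):
    """Look up a link count, falling back to the assumed penalty."""
    return TOY_DISTANCE.get((sense, word), UNRELATED)

def context_window(tokens, index, size=5):
    """The `size` words before and the `size` words after tokens[index]."""
    return tokens[max(0, index - size):index] + tokens[index + 1:index + 1 + size]

def pick_sense(senses, context, distance):
    """Return (best sense, totals), where best has the smallest summed link count."""
    totals = {s: sum(distance(s, w) for w in context) for s in senses}
    return min(totals, key=totals.get), totals

tokens = "he asked the bank for a loan with low interest".split()
context = context_window(tokens, tokens.index("bank"))
best, totals = pick_sense(["bank_money", "bank_river"], context, toy_distance)
print(best, totals)
```

With a real wordnet the distance argument would be a function that counts
links through the mother and father relations instead of a lookup table.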

I think this is much like finding the shortest route on a map.
On a map you have coordinates, longitude and latitude, to guide the search,
and I think we do not have similar helper data for our search.

I don't know how wild the searches can grow. Maybe we should exclude some very
common words.
The tree of course grows, but it becomes narrower as we approach the root.
There are probably not that many homonyms in a text, so it is probably OK
to spend some time resolving them.

best regards
keld


On Mon, Oct 08, 2012 at 02:32:24PM +0200, Per Tunedal wrote:
> Hi,
> Interesting. I've done some google-ing on word sense disambiguation but
> I still don't fully understand what is "shortest distance to surrounding
> words". Please explain or give me some links!
> Yours,
> Per Tunedal
> 
> On Mon, Oct 8, 2012, at 12:48, [email protected] wrote:
> > Oh, well, we have open source wordnet resources for Danish, Swedish and
> > English, at least (these are the languages that I am primarily
> > interested in).
> > 
> > And yes, there are different opinions on how effective wordnet resources
> > are - but anyway some of us are just here for the fun of it, and willing
> > to try out some experiments. My problem is just that I do not know the
> > internals of Apertium, so I don't know where to start to build
> > an extra module. And then I have problems finding the time...
> > 
> > I don't think it would require big resources. The wordnet data is there
> > already for Danish and Swedish (da and sv - I use ISO 639 codes here).
> > So it is just a matter of making the monodixes with an adequate
> > representation of the father and mother relations from the Swedish and
> > Danish wordnets, and then making a module that, for homonyms, chooses the
> > best fit from some criteria, say the shortest distance to the surrounding
> > words. It should take less than 100 lines for such a module.
> > 
> > Best regards
> > keld
> > 
> > On Sun, Oct 07, 2012 at 11:02:51PM +0000, Francis Tyers wrote:
> > > I'm not working on anything regarding WordNet. The problem with WordNet
> > > is that few languages have one -- at least not a free one, and it takes
> > > quite a bit of effort to make, and is of doubtful use anyway (the sense
> > > distinctions tend to be too fine for good lexical selection). -- I don't
> > > remember the last time I read a paper where they actually got WordNet to
> > > help improve an MT system.
> > > 
> > > I'm more interested in methods requiring few resources. For example,
> > > monolingual corpora, small parallel corpora.
> > > 
> > > Fran
> > > 
> > > On Mon, 08 Oct 2012 at 00:53 +0200, [email protected] wrote:
> > > > Hmm, I could certainly do something.
> > > > 
> > > > But I would like to retain some of the extra info from the
> > > > wordnet-like term relations in the SALDO project, such as the mother
> > > > and father relations.
> > > > 
> > > > I would then later like to make a selection module based on these
> > > > relations, something like: for homonyms, choose the homonym with the
> > > > shortest distance to the surrounding words, maybe 10 of them. I think
> > > > we could reduce errors to 1/3 of the current level.
> > > > 
> > > > I don't know if this is related to Fran's Ph.D. but it could look like 
> > > > it.
> > > > 
> > > > Best regards
> > > > keld
> > > > 
> > > > 
> > > > On Sun, Oct 07, 2012 at 10:20:31PM +0200, Per Tunedal wrote:
> > > > > Hi again,
> > > > > maybe this is feasible now, in the light of the possibility to trim
> > > > > dictionaries?
> > > > > 
> > > > > Otherwise you might add the words to a copy of the Swedish dictionary
> > > > > and give it an explanatory suffix. And I might somehow find a way
> > > > > to adjust it.
> > > > > Actually it might be better to add them to the enlarged Swedish
> > > > > dictionary from the pair Icelandic (is) - Swedish (se) (or was it
> > > > > se-is?), instead. I suppose it wouldn't do any harm if I tried that
> > > > > dictionary for Swedish and Danish. The main problem right now is that
> > > > > there are many more words in the Danish than in the Swedish
> > > > > dictionary.
> > > > > 
> > > > > Yours,
> > > > > Per Tunedal
> > > > > 
> > > > > On Tue, Sep 11, 2012, at 10:03, [email protected] wrote:
> > > > > > Hi Per
> > > > > > 
> > > > > > I actually have about 49,000 Swedish nouns from the SALDO project
> > > > > > to add to the Swedish dix. I would just like some way to avoid
> > > > > > overwriting already existing, working relations for homonyms.
> > > > > > 
> > > > > --snip--
> > > > > > 
> > > > > > Best regards
> > > > > > keld
> > > > > > 
> > > > > > 
> > > > > --snip__
> > > > > > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > > > 
> > > > > ------------------------------------------------------------------------------
> > > > > Don't let slow site performance ruin your business. Deploy New Relic 
> > > > > APM
> > > > > Deploy New Relic app performance management and know exactly
> > > > > what is happening inside your Ruby, Python, PHP, Java, and .NET app
> > > > > Try New Relic at no cost today and get our sweet Data Nerd shirt too!
> > > > > http://p.sf.net/sfu/newrelic-dev2dev
> > > > > _______________________________________________
> > > > > Apertium-stuff mailing list
> > > > > [email protected]
> > > > > https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > > 
> > > 
> > > 
> > > 
> > > 
> > 
> 

