Hi Keld,
I liked your algo but had to think it over. After sleeping on it, a
few things came to mind:

"My initial go on an algorithm is then: I found a homonym. 
Each of the homonyms have a placement in the meaning tree via its father
and mother relations."

Unfortunately, I have no idea what the father relation is. Maybe you
should follow only the mother relations?

"Each of the above terms in the trees will then be recorded with the
distance in links 
to the specific homonym."

Seems alright, so far.

"If a term has been visited before then link count is modified
if the new link count is less - and also modified for all links above
this node."

I don't understand this.
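
Perhaps it amounts to keeping, for every term above the homonym, the
smallest number of links needed to reach it. A minimal Python sketch of
that reading follows. The toy hierarchy `PARENTS` and the function name
`ancestor_distances` are made up for illustration, and I assume each
term simply lists its parent terms (mother, father, or both). Walking
upward breadth-first reaches every term by its shortest route first, so
under that assumption no count has to be corrected afterwards.

```python
from collections import deque

# Hypothetical toy hierarchy (not real wordnet data): term -> parent terms.
PARENTS = {
    "bank_money": ["institution", "building"],
    "institution": ["organization"],
    "organization": ["entity"],
    "building": ["entity"],
    "entity": [],
}


def ancestor_distances(term, parents):
    """Return {ancestor: fewest links from term}; the term itself is at 0."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:  # first visit in a breadth-first walk is the shortest
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


# "entity" is 3 links away via "institution" but only 2 via "building".
print(ancestor_distances("bank_money", PARENTS))
```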

"I then take a number of preceding and following words - say 5 preceding
and 5 following words." 

1. You would only benefit from words loaded with sense, primarily nouns,
secondly verbs. The rest should be ignored.

2. It might be useful to search in adjacent sentences when looking for
related words. Look for 3 nouns before and 3 nouns after the ambiguous
word? If you find fewer in one direction, look for more in the other,
until you have the desired total (or as many as you can find).
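
The two points above could be sketched like this in Python. I assume
the text is already tagged as (word, tag) pairs, that nouns carry the
tag "n", and that the list may span adjacent sentences; the function
name `pick_context_words` and its parameters are my own inventions.

```python
def pick_context_words(tagged, target_index, per_side=3, wanted=("n",)):
    """Pick content words around tagged[target_index].

    Takes up to per_side words on each side and borrows from the other
    side when one side falls short of its share.
    """
    before = [w for w, tag in tagged[:target_index] if tag in wanted]
    after = [w for w, tag in tagged[target_index + 1:] if tag in wanted]

    take_before = min(len(before), per_side)
    take_after = min(len(after), per_side)

    # Whatever is still missing from the desired total is borrowed.
    spare = 2 * per_side - take_before - take_after
    extra_after = min(spare, len(after) - take_after)
    take_after += extra_after
    spare -= extra_after
    take_before += min(spare, len(before) - take_before)

    # Keep the words closest to the ambiguous word on each side.
    return before[len(before) - take_before:] + after[:take_after]
```

Passing `wanted=("n", "v")` would let verbs count as well.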

"For each of these words I travel up in the hierarchy both of the
father and mother branches."

As I've said, maybe you could skip the father branch.

"We need to follow all branches - there may be one branch
that is shorter than a previous match, both on the homonym
side and the surrounding word side."

Seems OK.

"The shortest distance between the specific homonym
and the surrounding word is then the link count from the specific
homonym to the common term plus the surrounding word distance in links
to the common term."

OK
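
As I read it, the distance is the smallest sum over all terms the two
words have in common. A small Python sketch, assuming the link counts
for each side are already available as dictionaries (term -> links);
the name `link_distance` and the example counts are invented:

```python
def link_distance(dist_homonym, dist_word):
    """Smallest (homonym -> common term) + (word -> common term), or None."""
    common = dist_homonym.keys() & dist_word.keys()
    if not common:
        return None  # the two words share no term at all
    return min(dist_homonym[t] + dist_word[t] for t in common)


# Invented link counts for one sense of "bank" and the context word "shore".
bank_river = {"bank_river": 0, "slope": 1, "landform": 2, "object": 3}
shore = {"shore": 0, "landform": 1, "object": 2}
print(link_distance(bank_river, shore))  # 3, via "landform" (2 + 1)
```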

"It is not OK to stop at the first match, there may be
shorter matches.... (Hmm, one could travel
the two trees to always secure shortest match - I think)"

Maybe there is a way to skip non-fruitful links before they are
completed, to speed things up? Compare to the shortest link found so
far?
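
One way to do that comparison, sketched in Python: walk upward from the
surrounding word level by level and stop as soon as the level number
alone is no better than the shortest link found so far. I assume the
homonym's link counts are already computed as a dictionary and that
`parents` maps each term to its parent terms; the name
`pruned_distance` and the toy data are hypothetical.

```python
from collections import deque


def pruned_distance(word, parents, dist_homonym):
    """Shortest link between word and the homonym, or None if unrelated."""
    best = None
    seen = {word}
    queue = deque([(word, 0)])
    while queue:
        node, depth = queue.popleft()
        if best is not None and depth >= best:
            break  # every remaining term is at least this deep: cannot improve
        if node in dist_homonym:
            total = depth + dist_homonym[node]
            if best is None or total < best:
                best = total
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return best


toy_parents = {"shore": ["landform"], "landform": ["object"],
               "object": ["entity"], "entity": []}
toy_homonym = {"bank_river": 0, "slope": 1, "landform": 2, "object": 3, "entity": 4}
print(pruned_distance("shore", toy_parents, toy_homonym))  # 3; "entity" is never examined
```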

Maybe there are other ways too. If you test your algo you might find a
pattern of links likely to be less fruitful than others. Or someone
might come up with a theoretical way to figure it out?

Maybe there are similar algos around for related applications, like
spell checking, statistical translation (tree model or even factored
translation, look at Moses), speech recognition or even artificial
intelligence? And you mentioned finding the shortest way. Someone might
have an idea of where to look for algos? There might be some open source
code to copy or be inspired by.
 
"Then add up all link counts for all surrounding words, and pick the
homonym
that has the smallest total counts."

OK. What if you instead skipped all links that are not shorter than the
shortest link already found? There isn't any need to save a table of all
counts, except for research or debugging.
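
The adding-up step itself might look like this in Python. Everything
named here is hypothetical: `pick_sense`, the example counts, and above
all `missing_penalty`, which is my own assumption for what to add when
a surrounding word shares no term with a homonym (your description
doesn't say). The totals are returned only for debugging.

```python
def pick_sense(sense_dists, context_dists, missing_penalty=10):
    """Return (best sense, totals) for the sense with the smallest total."""
    totals = {}
    for sense, d_sense in sense_dists.items():
        total = 0
        for d_word in context_dists:
            common = d_sense.keys() & d_word.keys()
            # Shortest link to this word, or the assumed penalty if none exists.
            total += min((d_sense[t] + d_word[t] for t in common),
                         default=missing_penalty)
        totals[sense] = total
    return min(totals, key=totals.get), totals


senses = {
    "bank_river": {"bank_river": 0, "slope": 1, "landform": 2, "object": 3},
    "bank_money": {"bank_money": 0, "institution": 1, "organization": 2, "object": 3},
}
context = [
    {"shore": 0, "landform": 1, "object": 2},
    {"water": 0, "liquid": 1, "object": 2},
]
print(pick_sense(senses, context))
# ('bank_river', {'bank_river': 8, 'bank_money': 10})
```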

Further: Maybe it isn't necessary to find the shortest link; a
sufficiently short link might do, i.e. a link shorter than X. You might
be able to tune your algo when testing to find a suitable X-factor.
Maybe the factor should be set differently for different languages or
corpora? Your module might be shipped with a default factor that the
developer of a language pair can adjust for the best fit. (Rationale:
If the probability of finding a shorter link is very low, don't try to
find one.)
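
A sketch of the "short enough" idea in Python, again with invented
names (`first_short_link`, the parameter `x`) and invented counts. It
looks at the common terms nearest the surrounding word first and
returns the first link shorter than `x`; if none is that short, it
falls back to the shortest one it saw.

```python
def first_short_link(dist_homonym, dist_word, x):
    """Return the first link shorter than x, else the shortest link found."""
    best = None
    # Examine terms closest to the surrounding word first.
    for term, depth in sorted(dist_word.items(), key=lambda item: item[1]):
        if term not in dist_homonym:
            continue
        total = depth + dist_homonym[term]
        if total < x:
            return total  # good enough, stop looking
        if best is None or total < best:
            best = total
    return best


homonym_counts = {"bank": 0, "slope": 1, "landform": 2, "object": 4}
word_counts = {"shore": 0, "object": 1, "landform": 2}
print(first_short_link(homonym_counts, word_counts, x=6))  # 5: accepted early
print(first_short_link(homonym_counts, word_counts, x=5))  # 4: the true shortest
```

The first call shows the price of a generous X: it settles for 5
although a link of 4 exists.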

BTW, just like you, I'm into this just for the fun of it. I will only
work with things that are of great interest to me. Primarily, I like to
solve problems. Or help others to solve theirs.

Yours,
Per Tunedal

On Mon, Oct 8, 2012, at 20:51, [email protected] wrote:
> Hi Per
> 
> What do you think of my algorithm on the WordNet data?
> 
> Regards
> keld
> 
--snip---
> > 
> > 
> > 
> > 2012/10/7 Per Tunedal <[1][email protected]>
> > 
> >   Hi again,
> >   maybe this is feasible now, in the light of the possibility to trim
> >   dictionaries?
> >   Otherwise you might add the words to a copy of the Swedish dictionary
> >   and give it an explanatory suffix. And I might somehow find a way to
> >   adjust it.
> >   Actually it might be better to add it to the enlarged Swedish
> >   dictionary from the pair Icelandic (is) - Swedish (se) (or was it
> >   se-is?), instead. I suppose it wouldn't do any harm if I tried that
> >   dictionary for Swedish and Danish. The main problem right now is that
> >   there are many more words in the Danish than in the Swedish
> >   dictionary.
> >   Yours,
> >   Per Tunedal
> >   On Tue, Sep 11, 2012, at 10:03, [2][email protected] wrote:
> >   > Hi Per
> >   >
> >   > I actually have about 49,000 Swedish nouns from the SALDO project to add
> >   > to the Swedish dix. I would just like some way to suppress overwriting
> >   > already existing working relations for homonyms.
> >   >
> >   --snip--
> >   >
> >   > Best regards
> >   > keld
> >   >
> >   >
> >   --snip--
> >   > [3]https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> >   --------------------------------------------------------------------
> >   ----------
> >   Don't let slow site performance ruin your business. Deploy New Relic
> >   APM
> >   Deploy New Relic app performance management and know exactly
> >   what is happening inside your Ruby, Python, PHP, Java, and .NET app
> >   Try New Relic at no cost today and get our sweet Data Nerd shirt
> >   too!
> >   [4]http://p.sf.net/sfu/newrelic-dev2dev
> >   _______________________________________________
> >   Apertium-stuff mailing list
> >   [5][email protected]
> >   [6]https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > 
> > 
> > --
> > [7]Jacob Nordfalk
> > [8]javabog.dk
> > Android developer and instructor at [9]IHK and [10]Lund&Bendsen
> > 
> > 
> > References
> > 
> > 1. mailto:[email protected]
> > 2. mailto:[email protected]
> > 3. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > 4. http://p.sf.net/sfu/newrelic-dev2dev
> > 5. mailto:[email protected]
> > 6. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > 7. http://profiles.google.com/jacob.nordfalk
> > 8. http://javabog.dk/
> > 9. http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU
> > 10. https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/
> > 11. http://p.sf.net/sfu/newrelic-dev2dev
> > 12. mailto:[email protected]
> > 13. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> 
> 

