Hi again, and good luck!
Just some more comments below:

On Tue, Oct 9, 2012, at 15:14, [email protected] wrote:
> On Tue, Oct 09, 2012 at 09:41:41AM +0200, Per Tunedal wrote:
> > Hi Keld,
> > I liked your algo but had to think it over. After sleeping on it, a
> > few things came to mind:
> > 
> > "My initial go at an algorithm is then: I found a homonym.
> > Each of the homonyms has a placement in the meaning tree via its father
> > and mother relations."
> > 
> > Unfortunately, I have no idea what the father relation is. Maybe you
> > should follow only the mother relations?
> 
> The father relation is meant to discriminate between homonyms with the
> same mother relation, so maybe it can be of help. I don't know. I take it
> into account in order to generalize to wordnet-like structures, where
> there may be more than one relation from a given homonym. A general
> Apertium wordnet module and algorithm should be able to handle more than
> one upward relation. In the monodix markup this could then be marked with
> a "rel" tag, and more than one "rel" tag may be present. I need input from
> people more in the know on whether this would be the recommended way to
> mark up such meaning relations in the monodix.

You might test this on a representative corpus and see if it adds much
to the accuracy of the selection. If not, skip it!

> 
> > "Each of the above terms in the trees will then be recorded with the
> > distance in links 
> > to the specific homonym."
> > 
> > Seems alright, so far.
> > 
> > "If a term has been visited before then link count is modified
> > if the new link count is less - and also modified for all links above
> > this node."
> > 
> > I don't understand this.
> 
> There may be different paths from the homonym to a specific term, and one
> path may be shorter than another. If we find a path that is shorter than a
> previously found path, then the path to all terms above the found term is
> also shorter, and thus the link counts to those terms need to be adjusted
> too.
> 

OK
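
To check that I have understood, here is a minimal Python sketch of the
distance recording. It assumes the upward relations (mother, father, or
any other "rel") are available as a simple mapping from a term to the
terms directly above it. The function name and the toy hierarchy are my
own inventions, not anything that exists in Apertium.

```python
from collections import deque

def upward_distances(start, parents):
    """Return {term: fewest links from `start`}, following every upward relation.

    `parents` maps a term to the list of terms directly above it
    (e.g. its "mother" and "father"); this layout is an assumption.
    """
    dist = {start: 0}
    todo = deque([start])
    while todo:
        term = todo.popleft()
        for parent in parents.get(term, []):
            candidate = dist[term] + 1
            # First visit, or a shorter path than the one recorded earlier.
            if candidate < dist.get(parent, float("inf")):
                dist[parent] = candidate
                # Re-queue so every term above `parent` is adjusted as well.
                todo.append(parent)
    return dist

# Toy hierarchy, invented for illustration only.
parents = {
    "bank#1": ["financial_institution"],
    "financial_institution": ["institution", "organization"],
    "institution": ["organization"],
    "organization": ["entity"],
}
distances = upward_distances("bank#1", parents)
```

With a first-in-first-out queue the first visit to a term is already the
shortest one, so the adjustment never fires here. The comparison keeps
the result correct if the terms are visited in some other order.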

> 
> > "I then take a number of preceding and following words - say 5 preceding
> > and 5 following words." 
> > 
> > 1. You would only benefit from words loaded with sense, primarily nouns,
> > secondly verbs. The rest should be ignored.
> 
> Yes, that could be an optimization. I would include adjectives and
> adverbs too. At the least, exclude some very common words.

Again, test and see how much this info helps. Intuitively, I think you
could skip it. But I might be wrong!

> 
> > 2. It might be useful to search in adjacent sentences when looking for
> > related words. Look for 3 nouns before and 3 nouns after the ambiguous
> > word? If you find fewer in one direction, look for more in the other,
> > until you have the desired total (or as many as you can find).
> > 
--snip---
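
The context selection in points 1 and 2 could be sketched roughly like
this in Python. It assumes the text is already tagged and available as
(word, tag) pairs running across sentence boundaries. The function name
and the tag names ("n", "vblex" and so on) are assumptions for the
example; use whatever your tagger produces.

```python
def pick_context(tokens, index, per_side=3, content_tags=("n",)):
    """Pick up to 2 * per_side content words around tokens[index].

    tokens: list of (word, tag) pairs for the running text, which may span
    several sentences. The tag names are assumptions; adapt to your tagger.
    """
    def content(pairs):
        return [word for word, tag in pairs if tag in content_tags]

    before = content(reversed(tokens[:index]))  # nearest word first
    after = content(tokens[index + 1:])

    take_before = min(per_side, len(before))
    take_after = min(per_side, len(after))
    # Whatever one side could not supply is requested from the other side.
    spare = 2 * per_side - take_before - take_after
    extra = min(spare, len(before) - take_before)
    take_before += extra
    take_after += min(spare - extra, len(after) - take_after)

    # Return the words in their original text order.
    return before[:take_before][::-1] + after[:take_after]

# Toy sentence, invented for illustration; "bank" is the ambiguous word.
tokens = [
    ("the", "det"), ("river", "n"), ("flows", "vblex"), ("past", "pr"),
    ("the", "det"), ("bank", "n"), ("where", "adv"), ("fish", "n"),
    ("and", "cnjcoo"), ("boats", "n"), ("and", "cnjcoo"), ("nets", "n"),
    ("and", "cnjcoo"), ("reeds", "n"),
]
context = pick_context(tokens, 5)
```

Passing content_tags=("n", "vblex") would include verbs as well, which
makes it easy to test whether the extra word classes help.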
> > 
> > "It is not OK to stop at the first match, there may be
> > shorter matches.... (Hmm, one could travel
> > the two trees to always secure shortest match - I think)"
> > 
> > Maybe there is a way to skip non-fruitful links before they are
> > completed, to speed things up? Compare to the shortest link found so
> > far?
> 
> Yes, you could stop the search if you are sure that further links would
> be longer than the one already found.

I believe you'll find that the shortest link count normally falls
within a small range, and further that you rarely get a value below a
certain threshold. If you test on a sufficiently large corpus, you can
then calculate the probabilities. You will get a list of diminishing
probabilities, where you can easily see where the very small values
begin. Then you can stop searching when you reach that value.

That is:

1. Stop searching if the link already is too long (that link cannot
possibly turn out to be the shortest, anyway).

2. Stop searching when you find a link short enough (it's unlikely that
you would find any shorter).
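
The two stopping rules could look something like this in Python. This
is only a sketch under my own assumptions: the upward relations are a
mapping from a term to the terms above it, the distances from the
context word up to its ancestors are already computed, and the
parameter names too_long and good_enough (with their default values)
are invented for the example.

```python
from collections import deque

def shortest_link(sense, context_dist, parents, too_long=8, good_enough=2):
    """Fewest links from `sense` up to a shared term and on to a context word.

    context_dist: {term: links from the context word up to that term}.
    Returns None if no link of at most `too_long` links is found.
    """
    best = None
    seen = {sense}
    todo = deque([(sense, 0)])
    while todo:
        term, depth = todo.popleft()
        # Rule 1: a path this long can no longer beat the limit or the best so far.
        limit = too_long if best is None else min(too_long, best - 1)
        if depth > limit:
            break  # FIFO order: everything still queued is at least as deep
        if term in context_dist:
            link = depth + context_dist[term]
            if link <= too_long and (best is None or link < best):
                best = link
                # Rule 2: short enough, a shorter link is considered unlikely.
                if best <= good_enough:
                    return best
        for parent in parents.get(term, []):
            if parent not in seen:
                seen.add(parent)
                todo.append((parent, depth + 1))
    return best

# Toy data, invented for illustration only.
parents = {
    "bank#1": ["financial_institution"],
    "financial_institution": ["organization"],
    "organization": ["entity"],
    "bank#2": ["slope"],
    "slope": ["landform"],
    "landform": ["object"],
    "object": ["entity"],
}
river_dist = {"river": 0, "body_of_water": 1, "object": 2, "entity": 3}
link_money_sense = shortest_link("bank#1", river_dist, parents)
link_shore_sense = shortest_link("bank#2", river_dist, parents)
```

In this toy data the shore sense gets the shorter link to "river", so it
would be the one selected. Setting good_enough to 0 disables rule 2, so
the search then always looks for the true shortest link.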

--snip--
> 
> Yes. Anyway, I think the algo we are making here is sufficiently simple
> and effective to give some experience of how well it could work.

Yes, absolutely. But maybe someone has tried something similar and has
some experience to share? What worked well and what didn't? What was
fruitful and what was not?

I recall someone writing to the list that this has been tried before, and
that it didn't help. Exactly what was tried, and exactly what were the
results? What was the problem that sank the project?

--snip--
> 
> > Further: Maybe it isn't necessary to find the shortest link; a
> > sufficiently short link might do? A link shorter than X. You might be
> > able to tune your algo when testing to find a suitable X-factor. Maybe
> > the factor should be set differently for different languages or
> > corpora? Your module might be shipped with a default factor that can be
> > adjusted by the developer of a language pair for the best fit.
> > (Rationale: If the probability of finding a shorter link is very low,
> > don't try to find one.)
> 
> I think you really should find the shortest link, to find the best match.

It might not be necessary, as I explained above. You need a "good
enough" algorithm; you cannot resolve 100% of the ambiguities anyway. But
give it a try! It might not cost that much in processor time. (In that
case, skip my suggestion 2 above.)

To sum up my opinion: test and see what contributes much to the
accuracy and what doesn't. Once you know, it's possible to fine-tune the
algo for speed, skipping demanding steps that don't contribute much.
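
As an example of such testing, the "short enough" value could be read
off the shortest link counts observed on a test corpus. The Python
sketch below is just one way to do it: the function name, the tail
parameter and the sample numbers are all made up for illustration.

```python
from collections import Counter

def suggest_cutoff(observed_lengths, tail=0.05):
    """Largest link count X such that at most `tail` of the observed
    shortest links are strictly shorter than X."""
    if not observed_lengths:
        raise ValueError("need at least one observed link count")
    counts = Counter(observed_lengths)
    total = len(observed_lengths)
    below = 0  # observations strictly shorter than the current candidate
    cutoff = min(counts)
    for length in sorted(counts):
        if below / total > tail:
            break
        cutoff = length
        below += counts[length]
    return cutoff

# Made-up observations: 100 shortest link counts from an imagined test run.
sample = [2] * 1 + [3] * 3 + [4] * 40 + [5] * 36 + [6] * 20
cutoff = suggest_cutoff(sample)
```

For this sample the suggestion is 4, since only 4 of the 100 observed
links are shorter than that. A language pair developer could then adjust
tail to trade accuracy against speed.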
 
--snip--

> keld
> > 
> > Yours,
> > Per Tunedal

Yours,
Per Tunedal

> > 
--snip--
> > > >   On Tue, Sep 11, 2012, at 10:03, [2][email protected] wrote:
> > > >   > Hi Per
> > > >   >
> > > >   > I actually have about 49,000 Swedish nouns from the SALDO project
> > > >   > to add to the Swedish dix. I would just like some way to suppress
> > > >   > overwriting already existing working relations for homonyms.
> > > >   >
> > > >   --snip--
> > > >   >
> > > >   > Best regards
> > > >   > keld
> > > >   >
> > > >   >
> > > >   --snip__
> > > >   > [3]https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > >   --------------------------------------------------------------------
> > > >   ----------
> > > >   Don't let slow site performance ruin your business. Deploy New Relic
> > > >   APM
> > > >   Deploy New Relic app performance management and know exactly
> > > >   what is happening inside your Ruby, Python, PHP, Java, and .NET app
> > > >   Try New Relic at no cost today and get our sweet Data Nerd shirt
> > > >   too!
> > > >   [4]http://p.sf.net/sfu/newrelic-dev2dev
> > > >   _______________________________________________
> > > >   Apertium-stuff mailing list
> > > >   [5][email protected]
> > > >   [6]https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > > 
> > > > 
> > > > --
> > > > [7]Jacob Nordfalk
> > > > [8]javabog.dk
> > > > Android developer and instructor at [9]IHK and [10]Lund&Bendsen
> > > > 
> > > > References
> > > > 
> > > > 1. mailto:[email protected]
> > > > 2. mailto:[email protected]
> > > > 3. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > > 4. http://p.sf.net/sfu/newrelic-dev2dev
> > > > 5. mailto:[email protected]
> > > > 6. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > > 7. http://profiles.google.com/jacob.nordfalk
> > > > 8. http://javabog.dk/
> > > > 9. http://cv.ihk.dk/diplomuddannelser/itd/vf/MAU
> > > > 10. https://www.lundogbendsen.dk/undervisning/beskrivelse/LB1809/
> > > > 11. http://p.sf.net/sfu/newrelic-dev2dev
> > > > 12. mailto:[email protected]
> > > > 13. https://lists.sourceforge.net/lists/listinfo/apertium-stuff
> > > 
> > 
> 
