On 9 October 2012 14:14,  <[email protected]> wrote:
> On Tue, Oct 09, 2012 at 09:41:41AM +0200, Per Tunedal wrote:
>> Hej Keld,
>> I liked your algo but had to think it over. After I'd slept on it, a
>> few things came to mind:
>>
>> "My initial go on an algorithm is then: I found a homonym.
>> Each of the homonyms have a placement in the meaning tree via its father
>> and mother relations."
>>
>> Unfortunately, I've no idea what's the father relation. Maybe you should
>> follow only the mother relations?
>
> The father relation is meant to discriminate between entries that share
> the same mother relation.
> So maybe it can be of help. I don't know. I take it into account in order
> to generalize to wordnet-like structures, where there may be more than one
> relation from a given homonym.

Saldo is not a WordNet (and its creators don't claim that it is, only
that it is equivalent for some purposes), and this is one of the major
differences. WordNet synsets can have an unlimited number of typed
references, whereas Saldo has at most two untyped references (what the
type is depends on the pair, and does not seem to be encoded anywhere
that's publicly available).
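
A minimal sketch of that structural difference, in Python. The class
and field names (Synset, SaldoEntry, mother, father) are made up for
illustration and are not any real WordNet or Saldo API, and the sample
entries are toy data rather than actual lexicon content.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Synset:
    """WordNet-style node: any number of typed relations."""
    name: str
    # Each relation is a (type, target) pair, e.g. ("hypernym", "big_cat").
    relations: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class SaldoEntry:
    """Saldo-style node: at most two references, neither of them typed."""
    name: str
    mother: Optional[str] = None  # hypothetical field name
    father: Optional[str] = None  # hypothetical field name

    def references(self) -> List[str]:
        """Return the references that are present, mother first."""
        return [r for r in (self.mother, self.father) if r is not None]


# Toy data: the typed side can keep growing, the untyped side cannot.
leopard = Synset("leopard", [("hypernym", "big_cat"), ("part_meronym", "spot")])
leopard.relations.append(("member_holonym", "Panthera"))

leopard_sv = SaldoEntry("leopard", mother="kattdjur")
```

The point of the sketch is only that nothing on the Saldo side says
what kind of link "mother" or "father" is; that has to come from
somewhere else.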

On the plus side, there is a relatively complete set of mappings
between the English WordNet and Saldo, so WordNet types could be
inferred from those alignments, though how accurate the results
would be remains to be seen.
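
As a rough sketch of what that inference could look like: map both
ends of an untyped Saldo reference to their aligned synsets, then check
whether WordNet has a typed relation between those two synsets. The
tables, identifiers, and the function name infer_type below are all
invented for illustration; the real mappings would have to be loaded
from the published alignment data, whose format I am not assuming here.

```python
from typing import Dict, Optional, Tuple

# Toy data, invented for illustration only.
# Saldo entry -> aligned WordNet synset (hypothetical identifiers)
ALIGNMENT: Dict[str, str] = {
    "leopard": "wn:leopard",
    "kattdjur": "wn:feline",
}

# (source synset, target synset) -> relation type on the WordNet side
WORDNET_RELATIONS: Dict[Tuple[str, str], str] = {
    ("wn:leopard", "wn:feline"): "hypernym",
}


def infer_type(entry: str, reference: str) -> Optional[str]:
    """Guess the type of an untyped Saldo reference via aligned synsets.

    Returns None when either side is unaligned, or when WordNet has no
    direct relation between the two aligned synsets.
    """
    source = ALIGNMENT.get(entry)
    target = ALIGNMENT.get(reference)
    if source is None or target is None:
        return None
    return WORDNET_RELATIONS.get((source, target))


guess = infer_type("leopard", "kattdjur")
```

This only looks at direct relations. Any pair whose synsets are related
through an intermediate node comes back as None, which is one reason
the accuracy is an open question.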

> And a general Apertium wordnet module and algorithm should be able
> to handle more than one upwards relation. In the monodix markup
> this could then be marked with a "rel" tag, and more
> "rel" tags may be present. I need input from people more in the know on
> whether this could be the recommended way to mark up such meaning
> relations in the monodix.
>

The problem with using WordNet is that, for MT purposes, the synsets
are simultaneously too fine-grained and too coarse-grained. Too
fine-grained, because they draw distinctions that make no difference
to translation, such as 'tree' the plant vs. 'tree' meaning a
tree-like structure (parse tree, family tree, etc.). Too
coarse-grained, because synsets are conceptual rather than lexical:
'panther' and 'leopard' are the same animal, but we can never say
'black leopard' or 'a panther never changes its spots'. In addition,
there is no indication of the relative importance of a sense, so
there is no way to tell which senses are too obscure for inclusion in
a translation lexicon (e.g., 'torpedo' meaning 'hitman' is a sense of
that word that I have only seen in WordNet).

If you were to give some thought to how you might split, merge, and
prune WordNet synsets into something that's useful for translation,
then you might be able to generate some interest.
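
To make the merge and prune steps concrete, here is one possible
sketch, not a worked-out proposal. It assumes each sense already
carries a target-language translation and some importance weight; both
fields, the Sense type, the threshold, and the sample data (with German
as an arbitrary target language) are hypothetical. Splitting
(the 'panther' vs. 'leopard' case) is not covered.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Sense(NamedTuple):
    word: str
    gloss: str
    translation: str  # assumed to be supplied by some other resource
    weight: float     # hypothetical importance score


def merge_and_prune(
    senses: Iterable[Sense], min_weight: float
) -> Dict[Tuple[str, str], List[str]]:
    """Drop low-weight senses, then merge senses that translate alike."""
    merged: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for sense in senses:
        if sense.weight < min_weight:
            continue  # prune: too obscure for a translation lexicon
        # Merge: senses with the same word and translation share one entry.
        merged[(sense.word, sense.translation)].append(sense.gloss)
    return dict(merged)


# Toy data, invented for illustration.
senses = [
    Sense("tree", "plant", "Baum", 0.9),
    Sense("tree", "tree-like structure", "Baum", 0.4),
    Sense("torpedo", "weapon", "Torpedo", 0.6),
    Sense("torpedo", "hitman", "Auftragskiller", 0.01),
]
lexicon = merge_and_prune(senses, min_weight=0.05)
```

With this toy data the two 'tree' senses collapse into one entry and
the 'hitman' sense of 'torpedo' is dropped. Where the weights would
come from is exactly the part that needs thought.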

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff