On 9 October 2012 14:14, <[email protected]> wrote:
> On Tue, Oct 09, 2012 at 09:41:41AM +0200, Per Tunedal wrote:
>> Hi Keld,
>> I liked your algo but had to think it over. After I slept on it, a
>> few things came to mind:
>>
>> "My initial go on an algorithm is then: I found a homonym.
>> Each of the homonyms have a placement in the meaning tree via its father
>> and mother relations."
>>
>> Unfortunately, I have no idea what the father relation is. Maybe you should
>> follow only the mother relations?
>
> The father relation is meant to discriminate between the same mother
> relations.
> So maybe it can be of help. I don't know. I take it into account in order to
> generalize to wordnet-like structures: there may be more than one relation
> from a given homonym
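To make "more than one relation from a given homonym" concrete, here is a minimal Python sketch of a sense that carries several upward relations, each optionally typed. It is only an illustration under my own assumptions: the `Sense` class, the `ancestors` function, the relation labels, and the toy entries are all hypothetical, and none of them come from Saldo, WordNet, or Apertium.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a homonym (hypothetical structure, for illustration only)."""
    ident: str
    # Upward relations as (type, target id) pairs. The type may be None
    # when the resource does not say what kind of relation it is.
    relations: list = field(default_factory=list)


def ancestors(sense_id, senses, rel_type=None):
    """Collect every sense reachable by following upward relations.

    If rel_type is given, only relations with that label are followed.
    """
    seen = []
    stack = [sense_id]
    while stack:
        current = stack.pop()
        for kind, target in senses[current].relations:
            if rel_type is not None and kind != rel_type:
                continue
            if target not in seen:
                seen.append(target)
                if target in senses:
                    stack.append(target)
    return seen


# Made-up toy data: two senses of 'tree', one of which has two upward relations.
senses = {
    "tree..1": Sense("tree..1", [("mother", "plant..1")]),
    "tree..2": Sense("tree..2", [("mother", "structure..1"), ("father", "tree..1")]),
    "plant..1": Sense("plant..1"),
    "structure..1": Sense("structure..1"),
}

all_up = ancestors("tree..2", senses)
mother_only = ancestors("tree..2", senses, rel_type="mother")
```

Following only the mother relations gives a single chain per sense, while following every relation gives a wider set of ancestors that might help tell two senses apart. Whether that extra context helps or just adds noise is exactly the open question.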
Saldo is not a WordNet (and its creators don't claim that it is, only
that it is equivalent for some purposes), and this is one of the major
differences. WordNet synsets can have an unlimited number of typed
references, whereas a Saldo entry has a maximum of two untyped references
(what the type is depends on the pair, and does not seem to be encoded
anywhere that's publicly available). On the plus side, there is a
relatively complete set of mappings between the English WordNet and
Saldo, so WordNet types could be inferred from those alignments, though
how accurate the results would be remains to be seen.

> And a general Apertium wordnet module and algorithm should be able
> to handle more than one upwards relation. In the monodix markup
> this could then be marked with a "rel" tag, and more
> "rel" tags may be present. I need input from people more in the know if this
> could be
> the recommended way to mark up such meaning relations in the monodix.
>

The problem with using WordNet for MT is that the synsets are
simultaneously too fine-grained and too coarse-grained. They are too
fine-grained because they represent a distinction without a difference
when it comes to translation, such as 'tree' the plant vs. 'tree'
meaning a tree-like structure (parse tree, family tree, etc.). They are
too coarse-grained because synsets are conceptual rather than lexical:
while 'panther' and 'leopard' are the same animal, we can never say
'black leopard' or 'a panther never changes its spots'. In addition,
there is no indication of the relative importance of a sense, which may
be too obscure for inclusion in a translation lexicon (e.g., 'torpedo'
meaning 'hitman' is a sense of that word that I have only seen in
WordNet).

If you were to give some thought to how you might split, merge, and
prune WordNet synsets into something that's useful for translation, then
you might be able to generate some interest.

--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
