Thanks, I see your point. Yes, it's arguable whether one should treat the disambiguation page as a sense inventory for WSD. The other approach clearly has certain advantages, but possibly also extensive computational overhead. In my case I am considering tasks other than WSD, say determining the degree of synonymy of two words, where no context is given for disambiguation. Typically one evaluates this by looking at the possible senses of each term. In theory I could also take the whole knowledge base as the candidate sense inventory, regardless of what is listed on the "disambiguation" page, but that seems computationally expensive and impractical.
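To make the cost argument concrete, here is a minimal sketch of what I mean by scoring two terms over their candidate senses when no context is available. Everything in it is an assumption for illustration: the toy INVENTORY, the feature sets attached to each sense, and the choice of Jaccard overlap as the sense-to-sense similarity are hypothetical and not taken from DBpedia, Spotlight or Wikipedia Miner.

```python
from itertools import product

# Hypothetical toy sense inventory: term -> {sense name: feature set}.
# In practice the senses would come from a disambiguation page or a KB.
INVENTORY = {
    "cat": {
        "Cat": {"animal", "feline", "pet"},
        "Caterpillar_Inc.": {"company", "machinery"},
    },
    "kitty": {
        "Kitten": {"animal", "feline", "pet", "young"},
        "Kitty_(poker)": {"card", "game", "pot"},
    },
}


def jaccard(a, b):
    """Jaccard overlap of two feature sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_relatedness(term_a, term_b, inventory=INVENTORY, sim=jaccard):
    """Score two terms by their best-matching pair of candidate senses.

    Returns ((sense_a, sense_b), score), or (None, 0.0) if either term
    has no candidate senses. Needs len(senses_a) * len(senses_b) calls
    to `sim`, which is why the size of the inventory matters.
    """
    senses_a = inventory.get(term_a, {})
    senses_b = inventory.get(term_b, {})
    best_pair, best_score = None, 0.0
    for (name_a, feat_a), (name_b, feat_b) in product(
        senses_a.items(), senses_b.items()
    ):
        score = sim(feat_a, feat_b)
        if best_pair is None or score > best_score:
            best_pair, best_score = (name_a, name_b), score
    return best_pair, best_score


pair, score = term_relatedness("cat", "kitty")
print(pair, score)
```

With a disambiguation page as the inventory each term has a handful of senses, so the pairwise loop is cheap. With the whole knowledge base as the inventory, both candidate lists grow to the size of the KB and the number of comparisons per word pair grows with their product.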
Still, thanks for your input!

On 23/05/2012 16:28, Aleksander Pohl wrote:
> On 23.05.2012 16:41, Ziqi Zhang wrote:
>> Thanks for your quick reply and the pointers.
>>
>> I suppose you do use the Wikipedia disambiguation page to look for
>> candidate senses of a term, considering it as a sense inventory in
>> general. Whether you should use the dbpedia disambiguation page as an
>> alternative I dont know, since it is related to my question and my
>> observation that it is not a 100% mirroring of the wikipedia version.
> No, I do not use these pages to perform disambiguation. DBpedia spotlight
> might use these pages - but I don't think they are the primary source of
> interpretations. Wikipedia Miner doesn't use them, since the the
> algorithm requires statistical data about the different senses, which
> are not available on the disambiguation pages.
>
>> While both [1] and [2] deals with sense disambiguation, it is not clear
>> how they select "candidate senses". in [1] "We use the DBpedia
>> Lexicalization dataset for determining candidate disambiguations for each
>> surface form." and by refering to the spotlight webpage ".... (Wikipedia)
>> Disambiguations provide ambiguous surface forms that are 'confusable'
>> with all resources they link to. Their labels become surface forms for
>> all target resources in the disambiguation page." - which suggests that
>> dbpedia also uses the Wikipedia "disambiguation page" to look for
>> candidate senses of a term, but perhaps some "filtering" strategies are
>> used; in [2] candidate spotting is not discusssed.
> This is a bold statement, citing "Learning to link with Wikipedia" (page
> 2 - this passage covers the related work of Mihalcea and Csomai):
>
> """
> The next phase, disambiguation, ensures that the detected phrases
> link to the appropriate article. For most anchors, there are several
> destinations to choose from. The term plane, for example, usually
> links to an article about fixed wing aircraft. Sometimes, however, it
> points to a page describing a theoretical surface of infinite area and
> zero depth, or a tool for flattening wooden surfaces. To choose the
> most appropriate destination, Wikify’s best approach extracts
> features from the phrase and its surrounding words (the terms
> themselves and their parts of speech), and compares this to training
> examples obtained from the entire Wikipedia. When run over
> anchors obtained from Wikipedia articles, this is able to match the
> manually defined destinations with a precision of 93% and a recall
> of 83%. However, it requires enormous preprocessing effort,
> because the entire Wikipedia must be parsed.
> """
>
>> As to my question, I am curious in how a "disambiguation page" in
>> wikipedia is converted to a dbpedia page, such that in a lot of cases,
>> many candidate links on the wikipedia page (e.g.,
>> "wikipedia/wiki/Cat_(disambiguation)") are not included as
>> disambiguation candidates on the corresponding dbpedia "disambiguation"
>> page (i.e., "dbpedia.org/page/Cat_(disambiguation)".
> I know, I haven't answered for that question, since I do not know how
> this is done. Still I would suggest to ignore this problem, assuming you
> wish to build a decent Wikipedia-based disambiguation algorithm.
>
> Cheers,
> Aleksander

--
Ziqi Zhang
