On 17 Dec 2009, at 12:25, Nguyen Thanh Tu wrote:
> I'm wondering whether the disambiguation resource (e.g.
> Apple_%28disambiguation%29) has all the disambiguation links or not.
> Because I've tried 2 disambiguation resources "Apple" and "Amphibian",
> and I see the results are uncompleted (explained below).
>
> http://dbpedia.org/resource/Apple_%28disambiguation%29
> http://dbpedia.org/resource/Amphibian_%28disambiguation%29
>
> With the first link, the results include the disambiguation link
> "http://dbpedia.org/resource/apple" (not
> "http://dbpedia.org/resource/Apple"), therefore, I couldn't go from
> Apple_%28disambiguation%29 to http://dbpedia.org/resource/Apple.

The link in Wikipedia is [[apple]], so DBpedia extracts a lowercase
link. In Wikipedia, the first letter of a page name is
case-insensitive. In DBpedia, the entire URI is case-sensitive (that's
prescribed by RDF). So the lower-case link works in Wikipedia but not
in DBpedia. I wonder whether it would make sense to normalize the first
character of page names to upper-case in DBpedia.

> With the second link, the results don't include the link
> http://dbpedia.org/resource/Amphibian, and include the link to
> "Amphibious_vehicle" which doesn't go to anywhere else (in fact, it
> should).

The algorithm for extracting disambiguation links is fairly naïve: it
extracts only links that include the original page name (in this case,
"Amphibian"). So it would extract "Amphibian (song)" and "Amphibian
aircraft", but not "Amphibious vehicle". That's a limitation of the
algorithm. To always do the right thing, it would have to analyse the
layout and semantics of the wiki text, which is a bit beyond the
DBpedia project's means. I'm sure that the DBpedia team would accept
patches that improve the algorithm though; all the code is open!

(Currently, "Amphibian aircraft" is not in the extracted RDF data
either. That's because of a bug that was fixed after the last
extraction.)

Best,
Richard

> Do you know what I should do to get all the information of
> disambiguation for one word?
> Thank you,
> Cheers,
> Thanh-Tu

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
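
For illustration only, here is a rough Python sketch of the two behaviours discussed in this thread: upper-casing the first character of a link target before building a resource URI, and the naïve "link must contain the page name" filter. This is not the DBpedia extraction code. The function names are hypothetical, and the case-insensitive comparison in the filter is an assumption on my part.

```python
from urllib.parse import quote

RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def normalize_title(title):
    """Upper-case only the first character of a wiki page title.

    Spaces become underscores. The rest of the title is left untouched,
    because only the first letter is case-insensitive in Wikipedia.
    """
    title = title.strip().replace(" ", "_")
    return title[:1].upper() + title[1:]


def resource_uri(title):
    """Build a DBpedia-style resource URI from a wiki link target."""
    # quote() percent-encodes parentheses, matching the
    # Apple_%28disambiguation%29 form of the URIs quoted in this thread.
    return RESOURCE_PREFIX + quote(normalize_title(title), safe="_")


def naive_disambiguation_links(page_name, link_targets):
    """Keep only the link targets that contain the original page name.

    Assumption: the comparison ignores case and treats underscores as
    spaces. The real extractor may behave differently.
    """
    base = page_name.replace("_", " ")
    suffix = " (disambiguation)"
    if base.endswith(suffix):
        base = base[: -len(suffix)]
    base = base.lower()
    return [t for t in link_targets if base in t.replace("_", " ").lower()]
```

With this sketch, resource_uri("apple") and resource_uri("Apple") produce the same URI, and filtering the targets "Amphibian (song)", "Amphibian aircraft" and "Amphibious vehicle" for the page "Amphibian (disambiguation)" keeps the first two and drops the third, which is the limitation described above.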
