On 17 Dec 2009, at 12:25, Nguyen Thanh Tu wrote:
> I'm wondering whether the disambiguation resource (e.g.
> Apple_%28disambiguation%29 <http://dbpedia.org/resource/Apple_%28disambiguation%29>)
> has all the disambiguation links or not. Because I've tried 2
> disambiguation resources "Apple" and "Amphibian", and I see the
> results are uncompleted (explained below).
>
> http://dbpedia.org/resource/Apple_%28disambiguation%29
> http://dbpedia.org/resource/Amphibian_%28disambiguation%29
>
> With the first link, the results include the disambiguation link
> "http://dbpedia.org/resource/apple"
> (not "http://dbpedia.org/resource/Apple"),
> therefore, I couldn't go from Apple_%28disambiguation%29
> to http://dbpedia.org/resource/Apple.

The link in Wikipedia is [[apple]], so DBpedia extracts a lowercase
link. In Wikipedia, the first letter of page names is
case-insensitive. In DBpedia, the entire URI is case-sensitive (that's
prescribed by RDF). So the lowercase link works in Wikipedia but not
in DBpedia.

I wonder whether it would make sense to normalize the first character
in page names to upper-case in DBpedia.
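
To make the idea concrete, here is a rough sketch of what such a
normalization could look like. It is not existing DBpedia code; the
function name and the DBPEDIA_PREFIX constant are made up for
illustration, and it only touches the first character of the page
name, leaving the rest of the URI alone.

```python
# Hypothetical sketch, not part of the DBpedia extraction framework.
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def normalize_resource_uri(uri: str) -> str:
    """Upper-case the first character of the page name in a DBpedia URI."""
    if not uri.startswith(DBPEDIA_PREFIX):
        return uri  # not a DBpedia resource URI, leave it untouched
    name = uri[len(DBPEDIA_PREFIX):]
    if not name:
        return uri  # nothing after the prefix, nothing to normalize
    # Only the first character is changed; the rest stays case-sensitive.
    return DBPEDIA_PREFIX + name[0].upper() + name[1:]


print(normalize_resource_uri("http://dbpedia.org/resource/apple"))
```

With this, the lowercase link from the disambiguation page and the
upper-case resource URI would end up as the same URI. A page name that
starts with a percent-encoded character would pass through unchanged.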

> With the second link, the results don't include the link
> http://dbpedia.org/resource/Amphibian, and
> include the link to "Amphibious_vehicle" which doesn't go to
> anywhere else (in fact, it should).

The algorithm for extracting disambiguation links is fairly naïve: it
extracts only links that include the original page name (in this case,
"Amphibian"). So it would extract "Amphibian (song)" and "Amphibian
aircraft", but not "Amphibious vehicle". That's a limitation of the
algorithm. To always do the right thing, it would have to analyse
the layout and semantics of the wiki text, which is a bit beyond the
DBpedia project's means. I'm sure that the DBpedia team would accept
patches that improve the algorithm, though. All the code is open!
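
For illustration, the name-containment rule described above could be
sketched roughly like this. This is not the actual extractor code; the
regular expression, the function name and the case-insensitive
comparison are all assumptions made for the example.

```python
import re

# Hypothetical sketch of the "link must contain the page name" heuristic.
# Matches [[Target]], [[Target|label]] and [[Target#section]] wiki links
# and captures only the target page name.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def disambiguation_links(page_name, wiki_text):
    """Return link targets in wiki_text whose name contains page_name."""
    base = page_name.lower()  # assumption: comparison ignores case
    targets = []
    for match in WIKI_LINK.finditer(wiki_text):
        target = match.group(1).strip()
        if base in target.lower() and target not in targets:
            targets.append(target)
    return targets


text = """
* [[Amphibian (song)]]
* [[Amphibian aircraft]]
* [[Amphibious vehicle]]
"""
print(disambiguation_links("Amphibian", text))
```

On this sample, "Amphibious vehicle" is dropped because its name does
not contain "Amphibian", which is the behaviour reported in this
thread.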

(Currently, "Amphibian aircraft" is not in the extracted RDF data
either. That's because of a bug that was fixed after the last
extraction.)

Best,
Richard



>
> Do you know what I should do to get all the information of
> disambiguation for one word?
> Thank you,
> Cheers,
> Thanh-Tu
> ------------------------------------------------------------------------------
> This SF.Net email is sponsored by the Verizon Developer Community
> Take advantage of Verizon's best-in-class app development support
> A streamlined, 14 day to market process makes app distribution fast  
> and easy
> Join now and get one step closer to millions of Verizon customers
> http://p.sf.net/sfu/verizon-dev2dev  
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

