I'm not sure, but there are two possible explanations:
* DBpedia has YAGO categories in addition to Wikipedia categories.
* Have you run your own DBpedia extractor? If not, you may have been
searching DBpedia data that is several months old!
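
To narrow this down, the comparison described in the quoted message below can be reproduced on a small sample. What follows is only a rough Python sketch, not Gregor's actual tool: the function names are hypothetical, the regex is a simplified N-Triples reader (no blank nodes, no escape handling), and it assumes categories are attached via the dcterms:subject predicate and the Category: URI prefix seen in the example triple.

```python
import re

# Simplified N-Triples line: IRI subject, IRI predicate, then an IRI or literal object.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.$')
CATEGORY_PREFIX = "http://dbpedia.org/resource/Category:"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"


def parse(lines):
    """Yield (subject, predicate, object_iri); object_iri is None for literals."""
    for line in lines:
        m = TRIPLE.match(line.strip())
        if not m:
            continue  # blank lines, comments, or anything this sketch does not handle
        s, p, o = m.groups()
        yield s, p, (o[1:-1] if o.startswith("<") and o.endswith(">") else None)


def categories_of_mapped_resources(mapping_lines, article_category_lines):
    """Categories linked to any subject or object of the mapping-based statements."""
    mapped = set()
    for s, _, o in parse(mapping_lines):
        mapped.add(s)
        if o is not None:
            mapped.add(o)
    return {o for s, p, o in parse(article_category_lines)
            if p == DCT_SUBJECT and o is not None and s in mapped}


def categories_in_page_links(page_link_lines):
    """Category URIs that occur as subject or object of any page-link triple."""
    found = set()
    for s, _, o in parse(page_link_lines):
        for iri in (s, o):
            if iri is not None and iri.startswith(CATEGORY_PREFIX):
                found.add(iri)
    return found


def compare(mapping_lines, article_category_lines, page_link_lines):
    """Return (only_in_dbpedia, only_in_page_links) as two sets of category URIs."""
    dbpedia = categories_of_mapped_resources(mapping_lines, article_category_lines)
    wikipedia = categories_in_page_links(page_link_lines)
    return dbpedia - wikipedia, wikipedia - dbpedia
```

Each argument can be any iterable of lines, so the real dump files could be streamed with open("article_categories_en.nt", encoding="utf-8") rather than loaded into memory. Printing a few members of the first returned set would show whether the 127 unexpected categories share a pattern.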
-----
Yury Katkov




On Mon, Feb 13, 2012 at 5:01 PM, Gregor Trefs
<gtr...@rumms.uni-mannheim.de> wrote:

>  Hi DBPedia-Community,
>
> I'm currently writing my master's thesis in the field of DBPedia and SPARQL.
> One of my subgoals is to find out how many categories are present in both
> Wikipedia and DBPedia. To that end, I wrote a little tool which identifies
> all categories having at least one resource in the unspecific mapping-based
> part of DBPedia (when I refer to DBPedia in this mail, I usually mean this
> part of DBPedia, not the whole dataset). It scans the file
> mapping_based_properties_en.nt and checks whether the subject and
> object of each statement are linked to a category in the file
> article_categories_en.nt. If there is a link, the tool considers the
> corresponding category to be 'present' in DBPedia.
>
> On the other hand, the same tool searches the page_links_en.nt file to
> find all categories of Wikipedia. That is, all triples which relate a
> resource to a category or (if present at all) a category to any object.
> According to the description of the 'Page Links Extractor' it 'Extracts
> internal links between DBpedia instances from the internal pagelinks
> between Wikipedia articles.'. As Wikipedia pages normally link to their
> categories, I assumed that these links are also included and, thus, all
> categories in Wikipedia are captured.
>
> Unfortunately, this is only true for almost all categories. I found 127
> categories which are present in DBPedia but not in Wikipedia, compared to
> 59099 categories present in Wikipedia and not in DBPedia. This is strange,
> as the set of DBPedia categories must be a subset of the Wikipedia categories.
> Otherwise, some magic added new categories during extraction, and I
> doubt that. I made sure it was not my fault and had a look at the data.
> One of the suddenly appeared categories is
> http://dbpedia.org/resource/Category:Alaska_elections,_1996. On the
> DBPedia side, there is a triple (
> <http://dbpedia.org/resource/United_States_Senate_election_in_Alaska,_1996>
> <http://purl.org/dc/terms/subject>
> <http://dbpedia.org/resource/Category:Alaska_elections,_1996>.)
> which relates this category to the United States Senate election in
> Alaska in 1996. The resource itself is the subject of two statements in
> mapping_based_properties_en.nt. On the Wikipedia side, I did not find
> any triple in page_links_en.nt that contains the category, but I did find
> the resource for the United States Senate election in Alaska in 1996. The
> corresponding Wikipedia page also links to the category, and that link has
> been present since the page was created.
>
> What is the reason for this?
> * Is my assumption wrong that the internal pagelinks also include links
> to categories?
>     * If yes:
>         * Why were almost all categories captured?
>         * Should I use the article_categories_en.nt file for Wikipedia,
> too?
> * Did the Pagelinks Extractor skip the corresponding LinkNode during traversal
> of the AST?
> * Does the extraction source miss this information?
>
> I'm looking forward to your answers.
>
> Regards,
> Gregor Trefs
>
>
> ------------------------------------------------------------------------------
> Try before you buy = See our experts in action!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-dev2
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>