Hi Jörn,

We computed some graph measures based on the Wikipedia link structure. From a
user's perspective, the actual link structure seemed more appropriate than the
mapped properties. For that, we first had to clean up the page_links dataset,
and then computed PageRank, HITS, and the in-link and out-link degree of each
article in Wikipedia. You can find the datasets for EN and DE, for DBpedia 3.9
and 2014, at [1].
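
For anyone who wants to try such measures on a small scale, here is a minimal
sketch of in/out degree and a plain power-iteration PageRank over a list of
(source, target) links. It is only an illustration, not the code used to
produce the datasets at [1]: the function names, the damping factor of 0.85,
the iteration count and the toy graph are all assumptions, and HITS is left
out.

```python
from collections import Counter


def degrees(links):
    """Count in-links and out-links per page from (source, target) pairs."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in links:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; pages without out-links spread rank evenly."""
    nodes = sorted({page for link in links for page in link})
    targets = {page: [] for page in nodes}
    for src, dst in links:
        targets[src].append(dst)
    n = len(nodes)
    rank = {page: 1.0 / n for page in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is redistributed to all pages.
        dangling = sum(rank[p] for p in nodes if not targets[p])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {page: base for page in nodes}
        for src in nodes:
            for dst in targets[src]:
                new_rank[dst] += damping * rank[src] / len(targets[src])
        rank = new_rank
    return rank


# Hypothetical toy link graph, not taken from page_links.
toy_links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
in_deg, out_deg = degrees(toy_links)
ranks = pagerank(toy_links)
print(in_deg["C"], out_deg["A"])   # in-degree of C, out-degree of A
print(max(ranks, key=ranks.get))   # page with the highest rank
```

Repeated links are counted as given here, so a real run would need the links
deduplicated first, which is part of why the cleanup step matters.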

Maybe that helps.

Best
Magnus

[1] http://s16a.org/node/6

On 28.01.2015 at 17:54, Jörn Hees <j_h...@cs.uni-kl.de> wrote:

> Hi,
> 
> It seems it's not as easy as I had thought to get the top subjects,
> predicates and objects, as SPARQL queries such as this
> ```
> SELECT ?n (COUNT(*) AS ?c)
> WHERE {
>  ?n ?p ?o.
> }
> GROUP BY ?n
> ORDER BY DESC(?c)
> LIMIT 10
> ```
> just time out / return with partial results.
> 
> 
> So I compiled them from the NT dumps, as described here (the post also links
> to the full files):
> https://joernhees.de/blog/2015/01/28/dbpedia-2014-stats-top-subjects-predicates-and-objects/
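
A minimal sketch of what counting the top terms from an N-Triples dump could
look like, one line at a time. This is an illustration and not necessarily the
approach used in the blog post: the regular expression is a simplified
N-Triples pattern rather than a full parser, and the function name and the
example.org sample triples are made up.

```python
import re
from collections import Counter

# Simplified N-Triples line: subject, predicate, then the object up to the final " ."
TRIPLE = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def count_terms(lines):
    """Return Counters of subjects, predicates and objects over N-Triples lines."""
    subjects, predicates, objects = Counter(), Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip lines the simplified pattern cannot read
        s, p, o = match.groups()
        subjects[s] += 1
        predicates[p] += 1
        objects[o] += 1
    return subjects, predicates, objects


# Hypothetical sample data; a real run would iterate over an open dump file.
sample = [
    '<http://example.org/a> <http://example.org/p> <http://example.org/b> .',
    '<http://example.org/a> <http://example.org/q> "hello world"@en .',
    '<http://example.org/c> <http://example.org/p> <http://example.org/b> .',
]
subjects, predicates, objects = count_terms(sample)
print(subjects.most_common(10))
print(predicates.most_common(10))
print(objects.most_common(10))
```

Because a file object yields lines lazily, passing an open dump file to
`count_terms` avoids loading the whole file, although the three Counters
themselves still grow with the number of distinct terms.
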
> 
> Thought this might be of interest to some of you.
> 
> 
> Turns out there are actually quite a lot of duplicate triples in the dumps:
>   4891 <http://commons.wikimedia.org/wiki/Special:FilePath/Flag_of_Slovenia.svg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Flag_of_Slovenia.svg> .
>   4891 <http://commons.wikimedia.org/wiki/Special:FilePath/Flag_of_Slovenia.svg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/Flag_of_Slovenia.svg?width=300> .
>   4891 <http://commons.wikimedia.org/wiki/Special:FilePath/Flag_of_Slovenia.svg> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Flag_of_Slovenia.svg> .
>   1520 <http://commons.wikimedia.org/wiki/Special:FilePath/Naval_Ensign_of_the_United_Kingdom.svg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Naval_Ensign_of_the_United_Kingdom.svg> .
>   1520 <http://commons.wikimedia.org/wiki/Special:FilePath/Naval_Ensign_of_the_United_Kingdom.svg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/Naval_Ensign_of_the_United_Kingdom.svg?width=300> .
>   1520 <http://commons.wikimedia.org/wiki/Special:FilePath/Naval_Ensign_of_the_United_Kingdom.svg> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Naval_Ensign_of_the_United_Kingdom.svg> .
>   1195 <http://commons.wikimedia.org/wiki/Special:FilePath/Airplane_silhouette.svg?width=300> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Airplane_silhouette.svg> .
>   1195 <http://commons.wikimedia.org/wiki/Special:FilePath/Airplane_silhouette.svg> <http://xmlns.com/foaf/0.1/thumbnail> <http://commons.wikimedia.org/wiki/Special:FilePath/Airplane_silhouette.svg?width=300> .
>   1195 <http://commons.wikimedia.org/wiki/Special:FilePath/Airplane_silhouette.svg> <http://purl.org/dc/elements/1.1/rights> <http://en.wikipedia.org/wiki/File:Airplane_silhouette.svg> .
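
Duplicate counts like the ones above can be obtained by counting identical
lines. A rough sketch, assuming one triple per line; the function name and the
example.org sample are hypothetical, and for a full dump an external sort would
likely be kinder to memory than an in-memory Counter.

```python
from collections import Counter


def duplicate_triples(lines):
    """Return {triple line: count} for lines that occur more than once."""
    # Collapse runs of whitespace so formatting differences do not hide
    # duplicates. This also collapses whitespace inside literals, which is
    # acceptable for a rough check but not for an exact comparison.
    counts = Counter(" ".join(line.split()) for line in lines if line.strip())
    return {triple: n for triple, n in counts.items() if n > 1}


# Hypothetical sample: the first two lines differ only in spacing.
sample = [
    '<http://example.org/a> <http://example.org/p> <http://example.org/b> .',
    '<http://example.org/a>  <http://example.org/p> <http://example.org/b> .',
    '<http://example.org/a> <http://example.org/q> "x" .',
]
print(duplicate_triples(sample))
```
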
> 
> 
> 
> Top10 Subjects:
>   8118 <http://dbpedia.org/resource/Alphabetical_list_of_communes_of_Italy>
>   7110 <http://dbpedia.org/resource/List_of_places_in_Afghanistan>
>   6162 <http://dbpedia.org/resource/Index_of_Andhra_Pradesh-related_articles>
>   5857 <http://dbpedia.org/resource/List_of_populated_places_in_Bosnia_and_Herzegovina>
>   5712 <http://dbpedia.org/resource/2013_in_film>
>   5550 <http://dbpedia.org/resource/List_of_municipalities_of_Brazil>
>   5458 <http://dbpedia.org/resource/List_of_dialling_codes_in_Germany>
>   5405 <http://dbpedia.org/resource/IUCN_Red_List_vulnerable_species_(Plantae)>
>   5392 <http://dbpedia.org/resource/List_of_CJK_Unified_Ideographs,_part_3_of_4>
>   5392 <http://dbpedia.org/resource/List_of_CJK_Unified_Ideographs,_part_2_of_4>
>   5392 <http://dbpedia.org/resource/List_of_CJK_Unified_Ideographs,_part_1_of_4>
> 
> Top10 Predicates:
> 149707899 <http://dbpedia.org/ontology/wikiPageWikiLink>
> 86391520 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> 33958849 <http://www.w3.org/2002/07/owl#sameAs>
> 18731754 <http://purl.org/dc/terms/subject>
> 13926391 <http://www.w3.org/2000/01/rdf-schema#label>
> 13494896 <http://dbpedia.org/ontology/wikiPageRevisionID>
> 13494875 <http://www.w3.org/ns/prov#wasDerivedFrom>
> 13494819 <http://dbpedia.org/ontology/wikiPageID>
> 10948106 <http://dbpedia.org/ontology/wikiPageOutDegree>
> 10948106 <http://dbpedia.org/ontology/wikiPageLength>
> 
> Top10 Objects:
> 10948086 <http://xmlns.com/foaf/0.1/Document> .
> 10948086 "en"^^<http://www.w3.org/2001/XMLSchema#string> .
> 6239553 "1"^^<http://www.w3.org/2001/XMLSchema#nonNegativeInteger> .
> 2250659 <http://dbpedia.org/class/yago/PhysicalEntity100001930> .
> 2169386 <http://dbpedia.org/class/yago/Object100002684> .
> 2155200 <http://www.w3.org/2002/07/owl#Thing> .
> 1974654 <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent> .
> 1974654 <http://dbpedia.org/ontology/Agent> .
> 1816213 <http://dbpedia.org/class/yago/YagoLegalActorGeo> .
> 1650316 <http://xmlns.com/foaf/0.1/Person> .
> 1649647 <http://wikidata.dbpedia.org/resource/Q5> .
> 1649647 <http://wikidata.dbpedia.org/resource/Q215627> .
> 1649647 <http://schema.org/Person> .
> 
> 
> Cheers,
> Jörn
> 
> 
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

-- 
Magnus Knuth

Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
Prof.-Dr.-Helmert-Str. 2-3
14482 Potsdam

Potsdam District Court (Amtsgericht Potsdam), HRB 12184
Managing Director: Prof. Dr. Christoph Meinel

tel:     +49 331 5509 547
email:   magnus.kn...@hpi.de
web:     http://www.hpi.de/
webID:   http://magnus.13mm.de/

