Hi Federico,

As far as I know, no such dataset has been generated for DBpedia yet.

Regarding your specific problem:

With a bit of coding you could generate this information quite quickly. Load
the Wikipedia dump into Apache Spark and extract the information you need
about each page, then encode it in RDF if you need to query it with SPARQL.
Here is a tutorial that shows how to do the hard parts [1].
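
To make the per-page step concrete, here is a minimal sketch in plain Python
(no Spark needed to try it). It reads a tiny inline sample shaped like a
MediaWiki XML export, counts words and image links per page, and prints
N-Triples. The names page_stats, to_ntriples and iter_pages, the
http://example.org/pagestats# vocabulary and the sample page are all made up
for illustration. Real dumps also declare an XML namespace and are far too
large to parse in one go, and the word count here is crude because it runs
over raw wikitext. In Spark you would presumably apply the same per-page
function with a map over (title, text) pairs.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical vocabulary for illustration; not an existing DBpedia namespace.
STATS_NS = "http://example.org/pagestats#"
RESOURCE_NS = "http://dbpedia.org/resource/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"

# Matches the start of [[File:...]] and [[Image:...]] links in wikitext.
IMAGE_LINK = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)
WORD = re.compile(r"\w+")

# Made-up sample; a real export also carries an XML namespace on every tag.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example City</title>
    <revision>
      <text>'''Example City''' is a city. [[File:Skyline.jpg|thumb]] [[Image:Map.png]]</text>
    </revision>
  </page>
</mediawiki>"""


def iter_pages(xml_string):
    """Yield (title, wikitext) pairs from a small, namespace-free export."""
    root = ET.fromstring(xml_string)
    for page in root.iter("page"):
        yield page.findtext("title"), page.findtext("revision/text") or ""


def page_stats(wikitext):
    """Rough counts over raw wikitext (markup tokens are counted as words)."""
    return {
        "wordCount": len(WORD.findall(wikitext)),
        "imageCount": len(IMAGE_LINK.findall(wikitext)),
    }


def to_ntriples(title, stats):
    """Encode the counts as N-Triples lines, one triple per statistic."""
    # Simplification: only spaces are mapped; real DBpedia URIs need more care.
    subject = RESOURCE_NS + title.replace(" ", "_")
    return [
        '<%s> <%s%s> "%d"^^<%s> .' % (subject, STATS_NS, name, value, XSD_INT)
        for name, value in sorted(stats.items())
    ]


if __name__ == "__main__":
    for title, text in iter_pages(SAMPLE_DUMP):
        for triple in to_ntriples(title, page_stats(text)):
            print(triple)
```

Once the triples are loaded into a triple store, the counts can be queried
with SPARQL alongside the regular DBpedia data.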

Regarding the integration in DBpedia:

I guess it would be a good idea to submit a more elaborate version of this
metadata extraction as a project for GSoC 2016.
Word counts and unique-word statistics could be generated by a new extractor,
but things like page counts should be handled by a separate project, since
they require a separate data dump [2]. Part of the data should probably be
part of the DBpedia NLP Datasets [3]; other information, such as page counts,
makes more sense coupled with DBpedia Live.

Cheers,
Alexandru

[1] http://www.teachingmachines.io/blog/2015/12/20/wikipedia-data-in-spark
[2] https://dumps.wikimedia.org/other/pagecounts-all-sites/
[3] http://dbpedia.org/services-resources/datasets/nlp

On Sun, Jan 24, 2016 at 7:11 PM, Federico Piovesan <[email protected]>
wrote:

> Hi everyone,
>
> I am new to the community and I wanted to ask if there is a way to extract
> statistics about an entity's page from the DBpedia SPARQL endpoint
> <http://dbpedia.org/snorql/>. For example, I would like to choose a
> specific entity (e.g. city of a country) or a group of entities (e.g. all
> cities in a region of a country) and obtain:
>
> - number of words in that entity's Wikipedia page
> - number of pictures included in the page
> - stats on edits and revisions
> - stats on visits to that page
>
> I know stats.grok.se <http://stats.grok.se/> and X!'s Tools
> <https://tools.wmflabs.org/xtools-articleinfo/> for statistics on page
> visits and edits respectively, but I am not sure whether their information
> is in DBpedia's database (or perhaps Wikidata's?).
> Has anyone tried to do something similar? Any suggestions?
>
>
> Wish you all a great day,
> Federico
>
>
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>