Hi Richard,

On 10 Nov 2009, at 17:26, Richard Cyganiak wrote:
> Hi,
>
> I was wondering if the following data is available anywhere as part of
> DBpedia, or otherwise if there's any hope of getting it from DBpedia
> in the future. I think, but I'm not sure, that the raw data should be
> available in the Wikipedia database dumps.
>
> 1. View counts for Wikipedia pages.
>
> 2. Total number of edits for each Wikipedia page.
>
> 3. Inlink counts for Wikipedia pages.

The SIOC MediaWiki exporter [1] can be used for Wikipedia. It exposes
recursive links to all the edits of each wiki page, as well as links to
other pages (both internal and external) for each version, cf. [2].

Consequently, you can easily get 3 by retrieving the page with that
exporter and counting the links. However, getting the number of edits
currently requires recursive fetching and may be relatively slow. If
there is a way to get the number of edits directly via the MediaWiki
API, we could add that feature to the export, and likewise for
requirement 1 (cc-ing Fabrizio, who worked on it, as I don't know
whether these features are in the API).

In addition, I think that to link that information to DBpedia
efficiently, DBpedia would need to provide information about the version
of the Wikipedia pages that was used for the export (e.g. version
number / ID). That way, we could accurately link a DBpedia resource to
the SIOC description of the exact Wikipedia page version from which the
DBpedia information was extracted.

Is that something the DBpedia team could do? (e.g.
dbpedia:createdFrom -> Wikipedia page + seeAlso to the SIOC-exported
version to enable interlinking)

Best,

Alex.

[1] http://ws.sioc-project.org/mediawiki/
[2] http://ws.sioc-project.org/mediawiki/mediawiki.php?wiki=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FGalway&api=

>
> The first two are attention data. That's an interesting aspect of
> Wikipedia that isn't fully exploited yet.
> There are interesting
> applications where I could learn stuff about my own dataset by meshing
> it up with attention data from DBpedia. The third one is, in some way,
> also a measure of attention, and can be useful for ranking.
>
> (I'm thinking about stuff that can be done with the New York Times
> SKOS dataset, and using attention data from Wikipedia to gain insight
> into the NYT data might be quite interesting.)
>
> So, any hint about how to get the data above would be appreciated.
>
> Best,
> Richard
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

-- 
Dr. Alexandre Passant
Digital Enterprise Research Institute
National University of Ireland, Galway
:me owl:sameAs <http://apassant.net/alex> .
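
A minimal sketch of the "retrieve the page and count the links" step
mentioned above. It assumes the export is RDF/XML and that each link is
expressed as a sioc:links_to element carrying an rdf:resource attribute;
neither has been checked against the exporter's real output. The
function name count_links and the SAMPLE document are hypothetical and
hand-written for illustration. Note that this counts the links listed in
a single page's description.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for SIOC and RDF.
SIOC_NS = "http://rdfs.org/sioc/ns#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def count_links(rdf_xml: str) -> int:
    """Count distinct sioc:links_to targets in an RDF/XML document.

    Assumption: links appear as <sioc:links_to rdf:resource="..."/>.
    """
    root = ET.fromstring(rdf_xml)
    targets = set()
    for elem in root.iter(f"{{{SIOC_NS}}}links_to"):
        target = elem.get(f"{{{RDF_NS}}}resource")
        if target:
            targets.add(target)  # a set drops repeated links to one page
    return len(targets)


# Hand-written sample, NOT real exporter output.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sioc="http://rdfs.org/sioc/ns#">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Galway">
    <sioc:links_to rdf:resource="http://en.wikipedia.org/wiki/Ireland"/>
    <sioc:links_to rdf:resource="http://en.wikipedia.org/wiki/Connacht"/>
    <sioc:links_to rdf:resource="http://en.wikipedia.org/wiki/Ireland"/>
  </rdf:Description>
</rdf:RDF>
"""

if __name__ == "__main__":
    print(count_links(SAMPLE))
```

In practice the string passed to count_links would be the body fetched
from the exporter URL in [2]; a proper RDF parser would be more robust
than matching XML elements if the serialisation varies.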
