Hi Richard, I'm pretty sure that the first two are not available in the Wikipedia dumps. For example, [1] lists pages-meta-current.xml.bz2 as "All pages, current versions only." I don't think there is a dump of all pages. There once may have been one, but it probably became too big.
For the view count, see [2]. But hey, I also found the following at [3]: "Domas Mituzas put together a system to gather access statistics from wikipedia's squid cluster and publishes it here" [4].

The inlink count can of course be extracted, either from the Wikipedia dump [5] or from DBpedia [6]. I wrote a bit of Java code that does exactly that because I needed it for the faceted browser [7], but didn't publish the results. I can send you the code if you want, though it's not really "productized".

Cheers,
Christopher

[1] http://download.wikipedia.org/enwiki/20091026/
[2] http://lists.wikimedia.org/pipermail/wikitech-l/2007-September/033499.html
[3] http://stats.grok.se/about
[4] http://dammit.lt/wikistats/
[5] http://download.wikipedia.org/enwiki/latest/enwiki-latest-pagelinks.sql.gz
[6] http://downloads.dbpedia.org/3.4/en/pagelinks_en.nt.bz2
[7] http://dbpedia.neofonie.de

On Tue, Nov 10, 2009 at 23:26, Richard Cyganiak <[email protected]> wrote:
> Hi,
>
> I was wondering if the following data is available anywhere as part of
> DBpedia, or otherwise if there's any hope of getting it from DBpedia
> in the future. I think, but I'm not sure, that the raw data should be
> available in the Wikipedia database dumps.
>
> 1. View counts for Wikipedia pages.
>
> 2. Total number of edits for each Wikipedia page.
>
> 3. Inlink counts for Wikipedia pages.
>
> The first two are attention data. That's an interesting aspect of
> Wikipedia that isn't fully exploited yet. There are interesting
> applications where I could learn stuff about my own dataset by meshing
> it up with attention data from DBpedia. The third one is, in some way,
> also a measure of attention, and can be useful for ranking.
>
> (I'm thinking about stuff that can be done with the New York Times
> SKOS dataset, and using attention data from Wikipedia to gain insight
> into the NYT data might be quite interesting.)
>
> So, any hint about how to get the data above would be appreciated.
>
> Best,
> Richard

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
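
For the inlink counts discussed above, the counting step over the DBpedia page links dump [6] could be sketched roughly as follows. This is not the Java code mentioned in the mail, which was never published. It is a minimal illustration under a few assumptions: the class name InlinkCounter and the method countInlinks are made up for this example, the input is the already decompressed N-Triples file with one triple per line, every term is an IRI in angle brackets, and each line ends with a space followed by a dot. Lines that do not match that shape are skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not the code referred to in the mail. */
final class InlinkCounter {

    /**
     * Counts how often each resource appears as the object of a triple.
     * Assumes lines of the form: <subject> <predicate> <object> .
     */
    static Map<String, Long> countInlinks(Stream<String> lines) {
        Map<String, Long> counts = new HashMap<>();
        lines.forEach(line -> {
            String trimmed = line.trim();
            // Skip blank lines and N-Triples comments.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                return;
            }
            String[] terms = trimmed.split("\\s+");
            // Assumption: exactly three IRI terms plus the closing dot.
            if (terms.length != 4 || !terms[3].equals(".")) {
                return;
            }
            String target = terms[2];
            if (!target.startsWith("<") || !target.endsWith(">")) {
                return;
            }
            // Strip the angle brackets and add one inlink for the target page.
            String iri = target.substring(1, target.length() - 1);
            counts.merge(iri, 1L, Long::sum);
        });
        return counts;
    }

    /** Reads decompressed N-Triples from standard input and prints "count<TAB>IRI". */
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))) {
            countInlinks(in.lines())
                    .forEach((page, n) -> System.out.println(n + "\t" + page));
        }
    }
}
```

The sketch keeps all counts in memory and counts every triple, so a page that links to the same target twice contributes two inlinks if the dump lists the link twice. For the full English dump a larger heap, or sorting the object column externally and counting runs, may be more practical.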
