[repost with proper subscribed mail address] Alex wrote:
> The plain pageview stats are already available. Erik Zachte has been
> doing some work on other stats.
> <http://stats.wikimedia.org/EN/VisitorsSampledLogRequests.htm>
> If I were to compile a wishlist of stats things:
> 1. stats.grok.se data for non-Wikipedia projects
> 2. A better interface for stats.wikimedia.org - There's a lot of data
> there, but it can be hard to find and it's not very publicized. The
> only reason I knew about the link above is that someone pointed it out
> to me once and I bookmarked it.
> 3. Pageview stats at <http://dammit.lt/wikistats/> in files split by
> project. It would be a lot easier for people at the West Flemish
> Wikipedia to analyze statistics themselves if they didn't have to
> download tons of data they don't need.

Your enhancement requests:

1. IIRC this is already an (albeit undocumented) feature. One can manually alter the URL to find e.g. Wiktionary stats, but I forget precisely how, and I see nothing on User:Henrik's talk page.

2. Seconded wholeheartedly. In fact I started to reshape the main page (just eight links) this week :) I just uploaded it a bit earlier than planned: http://stats.wikimedia.org/

3. That could be a useful extension of the preservation script described below.

--------------------------------

General response:

I would say quite a lot has happened since early 2008. A recap:

As has already been said, Domas' (and Tim's) work was a major step forward.
http://dammit.lt/wikistats/

Two very useful aggregators of these data, on a page-by-page basis, are:

http://stats.grok.se/
http://wikistics.falsikon.de/

Based on the same data, at a higher aggregation level, there are visitor counts for all projects in an easily digestible fashion:

http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm

Also, for the past two months we have known much more about Wikimedia traffic, based on 8 reports with all kinds of cross sections:

http://infodisiac.com/blog/2009/04/wikimedia-traffic-analyzed/

With regard to the dammit.lt raw data, I helped to preserve these for posterity in a more compact and slightly filtered state, so that we can query them much longer (the dammit.lt server has space for only one or two months). Actually, Mathias Schindler started this important rescue effort. Each day all files are downloaded and processed, reduced from 40 GB per month to 3 GB (May 2009).

I also made a script to query these files, which is much more efficient than processing the original hourly files. But runtime is still considerable, so querying these files without restraints through a public interface is not advisable. The toolserver could get a copy of the files, of course.

http://infodisiac.com/blog/wp-content/uploads/2009/05/influenza1.png

Is this enough? Of course not; there is so much more to learn.

Considering geo data: for many months a patch for Domas' (and Tim's) code has been lying around, by Antonio José Reinoso Peinado, that would add country-level geolocation data from MaxMind's public database (IP->geo lookup). Although I promised to look at it, I haven't found the time yet.

Considering web bugs: comScore also proposed such a scheme to us. Apart from the question of how much it would bring us that we don't or can't figure out ourselves, an overriding concern is privacy.

Erik Zachte
Data Analyst
Wikimedia Foundation, Inc.
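(For illustration only: a minimal sketch of the kind of per-project filtering and querying described above, which would also address wishlist item 3. The field layout "project page_title count bytes" and gzip compression are assumptions based on the published dammit.lt hourly file format; all names here are hypothetical, not the actual preservation or query script.)

```python
import gzip
from collections import Counter

def project_page_views(paths, project):
    """Sum per-page view counts for one project (e.g. 'vls' for the
    West Flemish Wikipedia) across a set of pagecounts files.
    Assumed line layout: 'project page_title count bytes'."""
    totals = Counter()
    for path in paths:
        # Hourly dumps are gzip-compressed; compacted copies may not be.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as f:
            for line in f:
                fields = line.split(" ")
                if len(fields) == 4 and fields[0] == project:
                    totals[fields[1]] += int(fields[2])
    return totals
```

Splitting each file this way up front would let a small wiki's community download only its own slice instead of the full 40 GB per month.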
E-Mail: [email protected]

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
