@Dario: I do not have time to check the original files right now, but I believe the difference arises because we show aggregated data (i.e. data for the article plus all of its redirects). That said, I would also check the raw files.
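
As a rough illustration of the aggregation described above, here is a minimal Python sketch of how folding redirect views into the target article can yield a larger total than a per-title count. The function name, the redirect titles, and all numbers are hypothetical and chosen only for illustration; this is not the actual wikipediatrends.com code and the counts come from no real log.

```python
from collections import defaultdict


def aggregate_views(raw_counts, redirects):
    """Sum per-title page views into per-article totals.

    raw_counts: {title: views} for one day, counted per requested title.
    redirects:  {redirect_title: target_article} (hypothetical mapping).
    """
    totals = defaultdict(int)
    for title, views in raw_counts.items():
        # Views of a redirect are credited to the article it points to.
        target = redirects.get(title, title)
        totals[target] += views
    return dict(totals)


# Hypothetical numbers and redirect titles, for illustration only.
raw_counts = {"Eros": 100, "Eros_(god)": 40, "Eros_(mythology)": 10}
redirects = {"Eros_(god)": "Eros", "Eros_(mythology)": "Eros"}

per_title = raw_counts["Eros"]                               # title alone
aggregated = aggregate_views(raw_counts, redirects)["Eros"]  # title + redirects
print(per_title, aggregated)
```

If one tool reports the per-title figure and the other the aggregated one, the two will disagree for any article that receives traffic through redirects, which is one possible explanation to test against the raw files.
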

On Tue, Mar 25, 2014 at 3:51 PM, Dario Taraborelli <[email protected]> wrote:

> I haven’t checked the raw logs to compare them with the visualization, but I think we should QA the data: the raw (unsmoothed) series for Eros shows a spike on 2/14 (Valentine’s Day, predictably) with 6,920 pageviews, while stats.grok.se reports 3,209 page views for the same date. I don’t think any interpolation for missing data occurred around that date.
>
> [1] http://www.wikipediatrends.com/?query[]=Eros
> [2] http://stats.grok.se/en/latest90/Eros
>
> On Mar 25, 2014, at 7:09 AM, Dario Taraborelli <[email protected]> wrote:
>
> On Mar 25, 2014, at 7:01 AM, Alex Druk <[email protected]> wrote:
>
> @Dario: thanks. Yes, we update the site once a month, usually around the 10th, because we depend on the dumps.
> And yes, we plan to introduce JSON.
>
> awesome
>
> also, I noticed some inconsistency in the heading/titles that you may want to fix: “Wikipedia Articles Trends”, “Wikipedia trends”, “Wikipedia pageview statistics”, “Wiki Trends”.
>
> Dario
>
> On Tue, Mar 25, 2014 at 2:32 PM, Dario Taraborelli <[email protected]> wrote:
>
>> apologies, s/Burton/Alex :)
>>
>> one more question: is there any plan to add a JSON interface on top of the CSV download? Many people have relied on stats.grok.se JSON output for years and it would be fantastic to have wikipediatrends return data in the same format.
>>
>> Dario
>>
>> On Mar 25, 2014, at 6:27 AM, Dario Taraborelli <[email protected]> wrote:
>>
>> Hi Burton,
>>
>> nicely done (and yay for using dygraphs) - with what frequency do you expect wikipediatrends to ingest new data from the raw pageview dumps? I assume it’s once a month?
>>
>> Dario
>>
>> On Mar 25, 2014, at 2:09 AM, Alex Druk <[email protected]> wrote:
>>
>> Hi Burton,
>>
>> We just opened a new site, www.wikipediatrends.com, that shows Wikipedia page view data. Our site is very similar to the existing http://tools.wmflabs.org/wikiviewstats/ and http://stats.grok.se/, but uses a slightly different approach to calculating and presenting data, and allows comparison of different articles.
>>
>> I hope it will serve your purpose. I am ready to discuss integration off the list.
>>
>> Alex Druk
>>
>> On Mon, Mar 24, 2014 at 11:40 PM, Burton DeWilde <[email protected]> wrote:
>>
>>> Dear Toby,
>>>
>>> I recently saw your comment on a blog post <http://magnusmanske.de/wordpress/?p=173> by Magnus Manske regarding the lack of Wikipedia page view data besides the oft-overloaded http://stats.grok.se/. I was wondering if there's been any progress at WMF on building a more stable, central, and complete source for this data?
>>>
>>> I ask because I'm a data scientist at a small research non-profit called Harmony Institute <http://harmony-institute.org/>, where we study the social impact of media (primarily television and film). I'm currently building an interactive web app <http://harmony-institute.org/work/impactspace/> that visualizes social impact on a variety of issues by many documentary films. One indicator of interest is "information-seeking behavior," i.e. are audiences seeking out information about a film or issue. Besides Google search trends, an excellent proxy for this is Wikipedia page views for both film pages, e.g. Escape Fire <http://en.wikipedia.org/wiki/Escape_Fire:_The_Fight_to_Rescue_American_Healthcare>, and issue-related pages, e.g. Health care reform <http://en.wikipedia.org/wiki/Health_care_reform>.
>>>
>>> I'm currently trying to use stats.grok.se to grab raw data in JSON form; unfortunately, the site almost always responds with "Server overloaded, please throttle your requests," and no amount of throttling seems to suffice. I'm aware that there are many TBs of raw data for the downloading, but I don't have the resources to handle that much data, nor do I need more than the tiniest fraction of it.
>>>
>>> I would *love* to show Wikipedia page view statistics for film pages in our app. If you have any updates on progress or suggestions on how I might do this, I would be very appreciative.
>>>
>>> Thanks very much for your and all of WMF's hard work; I'm a proud donor to the cause. :)
>>>
>>> Best,
>>> Burton DeWilde
>>>
>>> --
>>> Burton DeWilde
>>>
>>> Data Scientist
>>> Harmony Institute
>>> harmony-institute.org
>>> blog <http://harmony-institute.org/therippleeffect/> | twitter <https://twitter.com/hinstitute> | facebook <https://www.facebook.com/harmonyinstitute>
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics

--
Thank you.

Alex Druk
[email protected]
(775) 237-8550 Google voice
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
