Tilman, to answer your question, the presentation of analytics at Monthly Metrics Meetings will change from month to month. Next month I am on vacation, so I have asked Jon to present something. I'm assuming it will cover pageviews and be readership-focused, but that is up to Jon.

On Mon, Aug 17, 2015 at 4:16 PM, Oliver Keyes <[email protected]> wrote:

> This seems perfect. Is it currently used?
>
> On 17 August 2015 at 18:03, Andrew Otto <[email protected]> wrote:
> > BTW, Christian foresaw this issue and wrote this:
> > https://github.com/wikimedia/analytics-refinery-source/tree/master/guard
> >
> > It should be useable for pageviews too, I think. For this issue, a
> > guard that made sure that outreach.wikimedia.org never appeared would
> > have been an error.
> >
> >> On Aug 17, 2015, at 14:45, Oliver Keyes <[email protected]> wrote:
> >>
> >> On 17 August 2015 at 13:48, Joseph Allemandou <[email protected]> wrote:
> >>> Hey Oliver,
> >>>
> >>> The analytics team is responsible for the pageview definition.
> >>> When finding issues, sending an email to the analytics mailing list is the
> >>> right thing to do :)
> >>>
> >>
> >> Indeed; my point is not about issues reported upstream. My point is
> >> that there appears to currently be absolutely no work done to take
> >> this (org-level, highest possible priority) KPI and evaluate it every
> >> month or every N days to make sure that, even with the gradual
> >> accretion of changes to the input data, it is still extracting what we
> >> want. It is down to user-reported issues. The problem with this
> >> approach is that after 90 days it is impossible to rerun the data; if
> >> there is a bug breaking the logs, and it takes more than 90 days to
> >> discover it, those logs are simply broken.
> >>
> >> In addition, discovering these issues requires a very granular
> >> understanding of what the pageviews logs are meant to be capturing
> >> that most customers simply will not have. It worked in this case
> >> primarily because the customer actually /wrote/ the definition ;p.
> >>
> >> For public transparency: Joseph and I talked on IRC and will be
> >> working on ways to validate data and detect these kinds of regressions
> >> in advance.
> >>
> >>> On our end, we could surely do a better job to communicate changes in the
> >>> pageview definition code for anybody interested to review/comment/ask for
> >>> documentation.
> >>> Emails have been sent regularly about updates on the analytics list, except
> >>> in the past few months.
> >>> We shall get back to that good habit and send notifications with
> >>> explanations of the changes.
> >>>
> >>> Joseph
> >>>
> >>> On Mon, Aug 17, 2015 at 5:15 PM, Oliver Keyes <[email protected]> wrote:
> >>>>
> >>>> You should also note that donate-wiki pageviews are making it into the
> >>>> counts (again, the definition was designed to exclude these).
> >>>>
> >>>> Whose job is it to review pageviews and update the definition when
> >>>> issues are found?
> >>>>
> >>>> On 17 August 2015 at 10:32, Oliver Keyes <[email protected]> wrote:
> >>>>> Just to clarify; there is no need to ask me before making changes
> >>>>> (obviously I find my approval for pageviews changes being sought
> >>>>> incredibly flattering, but I am not the only person involved in this
> >>>>> project ;p). What I'm more driving towards is directly informing
> >>>>> customers when the definition is adapted.
> >>>>>
> >>>>> On 17 August 2015 at 10:31, Oliver Keyes <[email protected]> wrote:
> >>>>>> Excellent; thank you.
> >>>>>>
> >>>>>> On 17 August 2015 at 04:42, Joseph Allemandou
> >>>>>> <[email protected]> wrote:
> >>>>>>> Oliver,
> >>>>>>>
> >>>>>>> It was a mistake from me to add the 'outreach' subdomain without
> >>>>>>> asking you.
> >>>>>>>
> >>>>>>> From a documentation perspective, the analytics team uses that place to
> >>>>>>> document changes:
> >>>>>>> https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest and I didn't
> >>>>>>> know about up-to-date documentation you sent.
> >>>>>>>
> >>>>>>> Tickets have been created to both correct the bug and update the
> >>>>>>> documentation pages.
> >>>>>>>
> >>>>>>> Joseph
> >>>>>>>
> >>>>>>> On Sun, Aug 16, 2015 at 8:47 PM, Oliver Keyes <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Ah, I see the problem; someone patched it and never documented it.
> >>>>>>>>
> >>>>>>>> We have documentation at
> >>>>>>>> https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters
> >>>>>>>> of the generalised filters. There is also a log, on
> >>>>>>>> https://meta.wikimedia.org/wiki/Research:Page_view, of changes to the
> >>>>>>>> pageview definition.
> >>>>>>>>
> >>>>>>>> The intent behind both the transparent definition and the log is to
> >>>>>>>> ensure that we know what is going /in/ the definition.
> >>>>>>>>
> >>>>>>>> In this case, somebody has patched the definition
> >>>>>>>> (https://github.com/wikimedia/analytics-refinery-source/commit/cc0b6ed7e4f403eaa82235ec6a0f27152b0c2710)
> >>>>>>>> to include traffic from outreach.wikimedia.org - a site that was very
> >>>>>>>> deliberately and very explicitly excluded from the definition as it
> >>>>>>>> was written.
> >>>>>>>>
> >>>>>>>> There is no explanation of why this change was made, there is no
> >>>>>>>> documentation of this change even existing outside the actual
> >>>>>>>> Java....
> >>>>>>>> can someone please explain what this is for, and update all the
> >>>>>>>> documentation to reflect that? And then could people be very, very
> >>>>>>>> clear in future that it is expected there be a log of alterations you
> >>>>>>>> make to high-level KPIs beyond the, you know, commit logs.
> >>>>>>>>
> >>>>>>>> On 16 August 2015 at 14:32, Madhumitha Viswanathan
> >>>>>>>> <[email protected]> wrote:
> >>>>>>>>> The new one.
> >>>>>>>>>
> >>>>>>>>> The code that generates it -
> >>>>>>>>>
> >>>>>>>>> - https://github.com/wikimedia/analytics-refinery/blob/master/hive/pageview/hourly/create_pageview_hourly_table.hql
> >>>>>>>>> - https://github.com/wikimedia/analytics-refinery/tree/master/oozie/pageview/hourly
> >>>>>>>>>
> >>>>>>>>> On Sun, Aug 16, 2015 at 11:01 AM, Oliver Keyes <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Is the pageviews_hourly table meant to contain pageviews according to
> >>>>>>>>>> the new or old definition? If old, where can I find aggregates for the
> >>>>>>>>>> new one?
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Oliver Keyes
> >>>>>>>>>> Count Logula
> >>>>>>>>>> Wikimedia Foundation
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Analytics mailing list
> >>>>>>>>>> [email protected]
> >>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> --Madhu :)
> >>>>>>>
> >>>>>>> --
> >>>>>>> Joseph Allemandou
> >>>>>>> Data Engineer @ Wikimedia Foundation
> >>>>>>> IRC: joal
