It's not that challenging; Aaron and I developed a fairly robust way of doing it that Mikhail and I are refining. It's just not easy to do without, say, a dedicated EventLogging (EL) schema that somebody (probably Readership?) would own and surface data from.

On 18 September 2015 at 13:14, Gabriel Wicke <[email protected]> wrote:
> This discussion also reminds me of the idea of tracking time spent on site.
> Arguably, that's a more relevant measurement for how much of our content
> people actually consume, and it also neatly side-steps issues like the
> categorization of link previews. I realize that measuring that accurately
> can be challenging, but I think it'll become more and more important as we
> venture into more dynamic content experiences.
>
> On Thu, Sep 17, 2015 at 8:17 AM, Oliver Keyes <[email protected]> wrote:
>>
>> Danke!
>>
>> On 17 September 2015 at 11:15, Nuria Ruiz <[email protected]> wrote:
>> > Right! Thanks for pointing that out.
>> >
>> > I think I have updated all docs now:
>> > https://meta.wikimedia.org/wiki/Research:Page_view#Change_log
>> >
>> > https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters
>> >
>> > On Thu, Sep 17, 2015 at 7:36 AM, Oliver Keyes <[email protected]> wrote:
>> >>
>> >> Have those changes been noted on the main pageview definition page and
>> >> associated changelog?
>> >>
>> >> On 17 September 2015 at 09:58, Nuria Ruiz <[email protected]> wrote:
>> >> >> With more ways of viewing content, it is going to get harder and
>> >> >> harder to maintain a pattern based definition.
>> >> > Indeed, we want to move away from pattern based definition as much as
>> >> > possible.
>> >> >
>> >> > This is an FYI to everyone that with our latest changes (that we are
>> >> > in the process of deploying today) if a request comes "tagged" with
>> >> > "preview" in the x-analytics header it will not be counted towards
>> >> > pageviews. The Android App should make corresponding changes to add
>> >> > the tag "preview" to its preview requests.
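
The tagging rule described above (a request tagged "preview" in the X-Analytics header is not counted towards pageviews) could be sketched roughly as follows. This is only an illustration, not the actual pageview definition code: it assumes the header is a semicolon-separated list of key=value pairs, and the function names parse_x_analytics and is_preview are hypothetical.

```python
def parse_x_analytics(header: str) -> dict[str, str]:
    """Split a header value into a dict of tags.

    Assumption: the header looks like 'key=value;key=value'. A bare key
    with no '=' is stored with an empty string as its value.
    """
    tags: dict[str, str] = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue  # skip empty segments such as a trailing ';'
        key, _, value = part.partition("=")
        tags[key.strip()] = value.strip()
    return tags


def is_preview(header: str) -> bool:
    """Return True if the request carries a 'preview' tag (hypothetical check)."""
    return "preview" in parse_x_analytics(header)


if __name__ == "__main__":
    # Made-up header values, for illustration only.
    requests = ["page_id=123;ns=0", "page_id=123;preview=1", ""]
    not_previews = [h for h in requests if not is_preview(h)]
    print(len(not_previews), "of", len(requests), "requests are not tagged as previews")
```

The point of checking an explicit tag rather than the URL is the one made in the thread: the client declares what the request is, so nothing has to be pattern matched out of the request stream.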
>> >> >
>> >> > X-analytics header is documented here:
>> >> > https://wikitech.wikimedia.org/wiki/X-Analytics
>> >> >
>> >> > On Wed, Aug 19, 2015 at 7:19 AM, Andrew Otto <[email protected]> wrote:
>> >> >>
>> >> >> > If we /do/ include RESTBase requests we will not only have to
>> >> >> > rewrite the pageview definition for the apps to recognise the new
>> >> >> > URL scheme
>> >> >>
>> >> >> I really think that apps and APIs should do something proactive to
>> >> >> tag or log a pageview. With more ways of viewing content, it is
>> >> >> going to get harder and harder to maintain a pattern based
>> >> >> definition. A pageview should be an event that is logged, not
>> >> >> something that is pattern matched out of a very noisy stream of
>> >> >> data.
>> >> >>
>> >> >> Most mediawiki requests do this now, via the page_id field in the
>> >> >> X-Analytics header, but we can’t use this for all pageviews because
>> >> >> APIs are more complicated (e.g. more than one page can be served in
>> >> >> a single request, etc.). In the long term, there should be a
>> >> >> pageview event stream just like rcstream! :)
>> >> >>
>> >> >> -Ao
>> >> >>
>> >> >> > On Aug 18, 2015, at 19:58, Oliver Keyes <[email protected]> wrote:
>> >> >> >
>> >> >> > On 18 August 2015 at 19:11, Bernd Sitzmann <[email protected]> wrote:
>> >> >> >> This discussion is about needed updates of the definition and
>> >> >> >> Analytics implementation for mobile apps page view metrics. There
>> >> >> >> is also an associated Phab task[4]. Please add the proper
>> >> >> >> Analytics project there.
>> >> >> >>
>> >> >> >> Background / Changes
>> >> >> >>
>> >> >> >> As you probably remember, the Android app splits a page view into
>> >> >> >> two requests: one for the lead section and metadata, plus another
>> >> >> >> one for the remainder.
>> >> >> >>
>> >> >> >> The mobile apps are going to change the way they load pages in
>> >> >> >> two different ways:
>> >> >> >>
>> >> >> >> We'll add a link preview when someone clicks on a link from a
>> >> >> >> page.
>> >> >> >> We're planning on switching over to using RESTBase for loading
>> >> >> >> pages and also the link preview (initially just the Android beta,
>> >> >> >> later more)
>> >> >> >>
>> >> >> >
>> >> >> > Woah woah woah woah woah. By RESTBase do you mean Gabriel's
>> >> >> > RESTful service API?
>> >> >> >
>> >> >> > Last time I checked that wasn't even consumed by HDFS. Is it now
>> >> >> > being consumed by HDFS?
>> >> >> >
>> >> >> > More importantly the actual URLs are going to look /totally/
>> >> >> > different. If we do not include RESTBase requests, we will miss
>> >> >> > the apps. If we /do/ include RESTBase requests we will not only
>> >> >> > have to rewrite the pageview definition for the apps to recognise
>> >> >> > the new URL scheme, we will also potentially have to rewrite every
>> >> >> > /other/ bit of the definition to /not/ incorporate those requests.
>> >> >> >
>> >> >> > (I use "we" in a collective sense. This isn't my baby any more,
>> >> >> > although if Joseph et al want help with the refactor here I'm
>> >> >> > happy to spend my volunteer time on it).
>> >> >> >
>> >> >> > But basically every other bit of your email is important but now
>> >> >> > secondary: this is a potentially massive change, all on its own,
>> >> >> > even without the link preview, even if the substance of the
>> >> >> > requests going to RESTBase were identical.
>> >> >> >
>> >> >> >> This will have implications for the pageviews definition and how
>> >> >> >> we count user engagement.
>> >> >> >>
>> >> >> >> The big question is
>> >> >> >>
>> >> >> >> Should we count link previews as a page view since it's an
>> >> >> >> indication of user engagement? Or should there be a separate
>> >> >> >> metric for link previews?
>> >> >> >>
>> >> >> >> Counting page views
>> >> >> >>
>> >> >> >> IIRC we currently count action=mobileview&sections=0 query
>> >> >> >> parameters of api.php as a page view. When we publish link
>> >> >> >> previews for all Android app users then we would either want to
>> >> >> >> count also the calls to action=query&prop=extracts as a page view
>> >> >> >> or add them to another metric.
>> >> >> >>
>> >> >> >> Once the apps use RESTBase the HTTPS requests will be very
>> >> >> >> different:
>> >> >> >>
>> >> >> >> Page view: Instead of action=mobileview&sections=0 the app would
>> >> >> >> call the RESTBase endpoint for lead request[1] instead of the PHP
>> >> >> >> API mentioned above. Then it would call [2].
>> >> >> >> Link preview: Instead of action=query&prop=extracts it would call
>> >> >> >> the lead request[1], too, since there is a lot of overlap. At
>> >> >> >> least that's our current plan.
>> >> >> >> The advantage of that is that the client doesn't need to execute
>> >> >> >> the lead request a second time if the user clicks on the link
>> >> >> >> preview (-- either through caching or app logic.)
>> >> >> >>
>> >> >> >> So, in the RESTBase case we either want to count the
>> >> >> >> mobile-html-sections-lead requests or the
>> >> >> >> mobile-html-sections-remaining requests depending on what our
>> >> >> >> definition for page views actually is. We could also add a query
>> >> >> >> parameter or extra HTTP header to one of the
>> >> >> >> mobile-html-sections-lead requests if we need to distinguish
>> >> >> >> between previews and page views.
>> >> >> >>
>> >> >> >> Both the current PHP API and the RESTBase based metrics would
>> >> >> >> need to be compatible and be collected in parallel since we
>> >> >> >> cannot control when users update their apps.
>> >> >> >>
>> >> >> >> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>> >> >> >> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>> >> >> >> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>> >> >> >> [4] https://phabricator.wikimedia.org/T109383
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Bernd
>> >> >> >>
>> >> >> >> _______________________________________________
>> >> >> >> Analytics mailing list
>> >> >> >> [email protected]
>> >> >> >> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >> >> >
>> >> >> > --
>> >> >> > Oliver Keyes
>> >> >> > Count Logula
>> >> >> > Wikimedia Foundation
>
> --
> Gabriel Wicke
> Principal Engineer, Wikimedia Foundation

--
Oliver Keyes
Count Logula
Wikimedia Foundation
