On 19 August 2015 at 12:29, Bernd Sitzmann <[email protected]> wrote:
> Andrew,
>
> Are you saying the apps have the option to skip providing one of page_title
> or page_id?
> I hope this is the case since I just came up with a scheme where we could
> avoid the second request when a page has only a single section, which we
> already get through the first (lead) request.
>
Yep; you'd need to provide one or the other, not both. We're actually
already looking for sections=0 due precisely to this (that there are
two requests for one page) so only including the page_title there
should not mess with the continuity of data.

> Yes to what Oliver said: The apps don't always know the page_id ahead of
> time (only sometimes). The best example where we don't know the page_id
> ahead of time is when someone searches for a term on Google search on an
> Android device, and gets directed to our Android app. The app only gets the
> URL of the page, which we then take to derive the wiki and page_title from.
>
> Bernd
>
> On Wed, Aug 19, 2015 at 10:24 AM, Oliver Keyes <[email protected]> wrote:
>>
>> It'll need to be, some requests don't know pageID in advance, which I
>> think was the reason Apps initially didn't implement this.
>>
>> On 19 August 2015 at 12:19, Andrew Otto <[email protected]> wrote:
>> > If your app/site/etc. is creating a request that it wants to count as a
>> > pageview, add an X-Analytics header with pageview_id=<page_id> or
>> > pageview_title=<page_title>
>> >
>> >
>> > page_id is the current key, so let’s keep that. page_title would be
>> > good to have too. Let’s make it an and/or.
>> >
>> >
>> > On Aug 19, 2015, at 12:17, Bernd Sitzmann <[email protected]> wrote:
>> >
>> >> If your app/site/etc. is creating a request that it wants to count as a
>> >> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> >> pageview_title=<page_title>
>> >
>> >
>> > Ideally the page id would be the way to go. From a client's perspective I
>> > prefer the page title since clients don't always know the page id ahead of
>> > time. (We could put that header into the second request of loading the page
>> > but I cannot guarantee that we will always have a second request in the
>> > future.)
>> >
>> > --Cheers,
>> > Bernd
>> >
>> > On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu <[email protected]>
>> > wrote:
>> >>
>> >> This (making pageviews proactive) is a great idea, and we should follow
>> >> through. Here's a simple start:
>> >>
>> >> If your app/site/etc. is creating a request that it wants to count as a
>> >> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> >> pageview_title=<page_title>
>> >>
>> >> If we can make this change uniformly, I think we'd be in a very good
>> >> place.
>> >>
>> >> On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <[email protected]>
>> >> wrote:
>> >>>
>> >>> On 19 August 2015 at 10:19, Andrew Otto <[email protected]> wrote:
>> >>> >> If we /do/ include RESTBase requests we will not only have to
>> >>> >> rewrite the pageview definition for the apps to recognise the new
>> >>> >> URL scheme
>> >>> >
>> >>> > I really think that apps and APIs should do something proactive to
>> >>> > tag or log a pageview. With more ways of viewing content, it is going
>> >>> > to get harder and harder to maintain a pattern-based definition. A
>> >>> > pageview should be an event that is logged, not something that is
>> >>> > pattern matched out of a very noisy stream of data.
>> >>> >
>> >>> > Most mediawiki requests do this now, via the page_id field in the
>> >>> > X-Analytics header, but we can’t use this for all pageviews because
>> >>> > APIs are more complicated (e.g. more than one page can be served in a
>> >>> > single request, etc.). In the long term, there should be a pageview
>> >>> > event stream just like rcstream! :)
>> >>>
>> >>> This is an excellent point. IIRC we'd been asking Apps to do this for
>> >>> kind of a while, so...
>> >>>
>> >>> >
>> >>> > -Ao
>> >>> >
>> >>> >
>> >>> >
>> >>> >> On Aug 18, 2015, at 19:58, Oliver Keyes <[email protected]>
>> >>> >> wrote:
>> >>> >>
>> >>> >> On 18 August 2015 at 19:11, Bernd Sitzmann <[email protected]>
>> >>> >> wrote:
>> >>> >>> This discussion is about needed updates of the definition and
>> >>> >>> Analytics implementation for mobile apps page view metrics. There is
>> >>> >>> also an associated Phab task[4]. Please add the proper Analytics
>> >>> >>> project there.
>> >>> >>>
>> >>> >>> Background / Changes
>> >>> >>>
>> >>> >>> As you probably remember, the Android app splits a page view into two
>> >>> >>> requests: one for the lead section and metadata, plus another one for
>> >>> >>> the remainder.
>> >>> >>>
>> >>> >>> The mobile apps are going to change the way they load pages in two
>> >>> >>> different ways:
>> >>> >>>
>> >>> >>> * We'll add a link preview when someone clicks on a link from a page.
>> >>> >>> * We're planning on switching over to using RESTBase for loading
>> >>> >>>   pages and also the link preview (initially just the Android beta,
>> >>> >>>   later more)
>> >>> >>>
>> >>> >>
>> >>> >> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
>> >>> >> service API?
>> >>> >>
>> >>> >> Last time I checked that wasn't even consumed by HDFS. Is it now
>> >>> >> being consumed by HDFS?
>> >>> >>
>> >>> >> More importantly the actual URLs are going to look /totally/
>> >>> >> different. If we do not include RESTBase requests, we will miss the
>> >>> >> apps. If we /do/ include RESTBase requests we will not only have to
>> >>> >> rewrite the pageview definition for the apps to recognise the new
>> >>> >> URL scheme, we will also potentially have to rewrite every /other/ bit
>> >>> >> of the definition to /not/ incorporate those requests.
>> >>> >>
>> >>> >> (I use "we" in a collective sense.
This isn't my baby any more,
>> >>> >> although if Joseph et al want help with the refactor here I'm happy
>> >>> >> to spend my volunteer time on it).
>> >>> >>
>> >>> >> But basically every other bit of your email is important but now
>> >>> >> secondary: this is a potentially massive change, all on its own,
>> >>> >> even without the link preview, even if the substance of the requests
>> >>> >> going to RESTBase were identical.
>> >>> >>
>> >>> >>> This will have implications for the pageviews definition and how we
>> >>> >>> count user engagement.
>> >>> >>>
>> >>> >>> The big question is
>> >>> >>>
>> >>> >>> Should we count link previews as a page view since it's an indication
>> >>> >>> of user engagement? Or should there be a separate metric for link
>> >>> >>> previews?
>> >>> >>>
>> >>> >>> Counting page views
>> >>> >>>
>> >>> >>> IIRC we currently count action=mobileview&sections=0 query parameters
>> >>> >>> of api.php as a page view. When we publish link previews for all
>> >>> >>> Android app users then we would either want to count also the calls to
>> >>> >>> action=query&prop=extracts as a page view or add them to another
>> >>> >>> metric.
>> >>> >>>
>> >>> >>> Once the apps use RESTBase the HTTPS requests will be very different:
>> >>> >>>
>> >>> >>> Page view: Instead of action=mobileview&sections=0 the app would call
>> >>> >>> the RESTBase endpoint for lead request[1] instead of the PHP API
>> >>> >>> mentioned above. Then it would call [2].
>> >>> >>> Link preview: Instead of action=query&prop=extracts it would call the
>> >>> >>> lead request[1], too, since there is a lot of overlap. At least that's
>> >>> >>> our current plan.
The advantage of that is that the client doesn't need to
>> >>> >>> execute the lead request a second time if the user clicks on the link
>> >>> >>> preview (either through caching or app logic).
>> >>> >>>
>> >>> >>> So, in the RESTBase case we either want to count the
>> >>> >>> mobile-html-sections-lead requests or the
>> >>> >>> mobile-html-sections-remaining requests depending on what our
>> >>> >>> definition for page views actually is. We could also add a query
>> >>> >>> parameter or extra HTTP header to one of the
>> >>> >>> mobile-html-sections-lead requests if we need to distinguish between
>> >>> >>> previews and page views.
>> >>> >>>
>> >>> >>> Both the current PHP API and the RESTBase-based metrics would need to
>> >>> >>> be compatible and be collected in parallel since we cannot control
>> >>> >>> when users update their apps.
>> >>> >>>
>> >>> >>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>> >>> >>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>> >>> >>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>> >>> >>> [4] https://phabricator.wikimedia.org/T109383
>> >>> >>>
>> >>> >>>
>> >>> >>> Cheers,
>> >>> >>>
>> >>> >>> Bernd
>> >>> >>>
>> >>> >>>
>> >>> >>> _______________________________________________
>> >>> >>> Analytics mailing list
>> >>> >>> [email protected]
>> >>> >>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> Oliver Keyes
>> >>> >> Count Logula
>> >>> >> Wikimedia Foundation
>> >>> >>
>> >>> >> _______________________________________________
>> >>> >> Analytics mailing list
>> >>> >> [email protected]
>> >>> >>
https://lists.wikimedia.org/mailman/listinfo/analytics

--
Oliver Keyes
Count Logula
Wikimedia Foundation
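
For illustration only, here is a minimal sketch of the client side of the proposal discussed in this thread: tag a request as a pageview by sending page_id and/or page_title in the X-Analytics header, deriving the title from the URL when the id is not known. Several things here are assumptions rather than anything settled above: the function names are hypothetical, the key names follow Andrew's "keep page_id, add page_title" suggestion, the value is assumed to be a semicolon-separated list of key=value pairs, and percent-encoding the title is a guess at how to keep ';' and '=' from breaking that list.

```python
from urllib.parse import quote, unquote, urlsplit


def x_analytics_value(page_id=None, page_title=None):
    """Build a value for the X-Analytics request header (hypothetical helper).

    At least one of page_id / page_title must be given; both are allowed.
    """
    if page_id is None and page_title is None:
        raise ValueError("provide page_id, page_title, or both")
    pairs = []
    if page_id is not None:
        pairs.append(f"page_id={int(page_id)}")
    if page_title is not None:
        # Assumption: percent-encode the title so that ';' and '=' inside it
        # cannot be mistaken for key=value separators.
        pairs.append(f"page_title={quote(page_title, safe='')}")
    return ";".join(pairs)


def title_from_url(url):
    """Derive (wiki host, page_title) from a /wiki/<title> URL.

    Mirrors the case where the app is handed only a URL (for example from a
    search result) and does not know the page_id ahead of time.
    """
    parts = urlsplit(url)
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        raise ValueError(f"not a /wiki/ page URL: {url}")
    return parts.netloc, unquote(parts.path[len(prefix):])


if __name__ == "__main__":
    wiki, title = title_from_url("https://en.wikipedia.org/wiki/Dilbert")
    headers = {"X-Analytics": x_analytics_value(page_title=title)}
    print(wiki, headers)
```

The header would go on whichever request is meant to count as the pageview (for example the lead-section request), so a link preview that reuses the same endpoint could simply omit it.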
