Ya, we can probably tweak the pageview definition to use page_id / page_title if they exist, and only use the rest of the logic if they don’t.
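Roughly what I have in mind, as a minimal sketch rather than the real definition. It assumes X-Analytics is a semicolon-separated list of key=value pairs; `legacy_is_pageview` is a hypothetical stand-in for the existing pattern-based logic, and the function names are made up for illustration.

```python
from typing import Dict, Optional


def parse_x_analytics(header: Optional[str]) -> Dict[str, str]:
    """Parse an X-Analytics value, assumed to look like 'k1=v1;k2=v2'."""
    fields: Dict[str, str] = {}
    for part in (header or "").split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key:
            fields[key] = value
    return fields


def legacy_is_pageview(uri_path: str, uri_query: str) -> bool:
    """Hypothetical stand-in for the current pattern-based definition."""
    if uri_path.startswith("/wiki/"):
        return True
    return "action=mobileview" in uri_query and "sections=0" in uri_query


def is_pageview(x_analytics: Optional[str], uri_path: str, uri_query: str = "") -> bool:
    """Trust an explicit page_id / page_title tag; otherwise fall back."""
    fields = parse_x_analytics(x_analytics)
    if fields.get("page_id") or fields.get("page_title"):
        return True
    return legacy_is_pageview(uri_path, uri_query)
```

With this shape a RESTBase request such as /api/rest_v1/page/mobile-html-sections-lead/Dilbert counts only if the client tags it, while untagged requests still go through the old pattern matching.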
> On Aug 19, 2015, at 12:24, Oliver Keyes <[email protected]> wrote:
>
> It'll need to be, some requests don't know pageID in advance, which I
> think was the reason Apps initially didn't implement this.
>
> On 19 August 2015 at 12:19, Andrew Otto <[email protected]> wrote:
>> If your app/site/etc. is creating a request that it wants to count as a
>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>> pageview_title=<page_title>
>>
>>
>> page_id is the current key, so let’s keep that. page_title would be good to
>> have too. Let’s make it an and/or.
>>
>>
>> On Aug 19, 2015, at 12:17, Bernd Sitzmann <[email protected]> wrote:
>>
>>> If your app/site/etc. is creating a request that it wants to count as a
>>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>>> pageview_title=<page_title>
>>
>>
>> Ideally the page id would be the way to go. From a client's perspective I
>> prefer the page title since clients don't always know the page id ahead of
>> time. (We could put that header into the second request of loading the page
>> but I cannot guarantee that we will always have a second request in the
>> future.)
>>
>> --Cheers,
>> Bernd
>>
>> On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu <[email protected]>
>> wrote:
>>>
>>> This (making pageviews proactive) is a great idea, and we should follow
>>> through. Here's a simple start:
>>>
>>> If your app/site/etc. is creating a request that it wants to count as a
>>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>>> pageview_title=<page_title>
>>>
>>> If we can make this change uniformly, I think we'd be in a very good
>>> place.
>>>
>>> On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <[email protected]>
>>> wrote:
>>>>
>>>> On 19 August 2015 at 10:19, Andrew Otto <[email protected]> wrote:
>>>>>> If we /do/ include RESTBase requests we will not only have to
>>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>>> scheme
>>>>>
>>>>> I really think that apps and APIs should do something proactive to tag
>>>>> or log a pageview. With more ways of viewing content, it is going to get
>>>>> harder and harder to maintain a pattern based definition. A pageview
>>>>> should be an event that is logged, not something that is pattern matched
>>>>> out of a very noisy stream of data.
>>>>>
>>>>> Most MediaWiki requests do this now, via the page_id field in the
>>>>> X-Analytics header, but we can’t use this for all pageviews because APIs
>>>>> are more complicated (e.g. more than one page can be served in a single
>>>>> request, etc.). In the long term, there should be a pageview event stream
>>>>> just like rcstream! :)
>>>>
>>>> This is an excellent point. IIRC we'd been asking Apps to do this for
>>>> kind of a while, so...
>>>>
>>>>>
>>>>> -Ao
>>>>>
>>>>>
>>>>>
>>>>>> On Aug 18, 2015, at 19:58, Oliver Keyes <[email protected]> wrote:
>>>>>>
>>>>>> On 18 August 2015 at 19:11, Bernd Sitzmann <[email protected]>
>>>>>> wrote:
>>>>>>> This discussion is about needed updates of the definition and
>>>>>>> Analytics implementation for mobile apps page view metrics. There is
>>>>>>> also an associated Phab task[4]. Please add the proper Analytics
>>>>>>> project there.
>>>>>>>
>>>>>>> Background / Changes
>>>>>>>
>>>>>>> As you probably remember, the Android app splits a page view into two
>>>>>>> requests: one for the lead section and metadata, plus another one for
>>>>>>> the remainder.
>>>>>>>
>>>>>>> The mobile apps are going to change the way they load pages in two
>>>>>>> different ways:
>>>>>>>
>>>>>>> - We'll add a link preview when someone clicks on a link from a page.
>>>>>>> - We're planning on switching over to using RESTBase for loading pages
>>>>>>>   and also the link preview (initially just the Android beta, later
>>>>>>>   more)
>>>>>>>
>>>>>>
>>>>>> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
>>>>>> service API?
>>>>>>
>>>>>> Last time I checked that wasn't even consumed by HDFS. Is it now being
>>>>>> consumed by HDFS?
>>>>>>
>>>>>> More importantly the actual URLs are going to look /totally/
>>>>>> different. If we do not include RESTBase requests, we will miss the
>>>>>> apps. If we /do/ include RESTBase requests we will not only have to
>>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>>> scheme, we will also potentially have to rewrite every /other/ bit of
>>>>>> the definition to /not/ incorporate those requests.
>>>>>>
>>>>>> (I use "we" in a collective sense. This isn't my baby any more,
>>>>>> although if Joseph et al want help with the refactor here I'm happy to
>>>>>> spend my volunteer time on it).
>>>>>>
>>>>>> But basically every other bit of your email is important but now
>>>>>> secondary: this is a potentially massive change, all on its own, even
>>>>>> without the link preview, even if the substance of the requests going
>>>>>> to RESTBase were identical.
>>>>>>
>>>>>>> This will have implications for the pageviews definition and how we
>>>>>>> count user engagement.
>>>>>>>
>>>>>>> The big question is
>>>>>>>
>>>>>>> Should we count link previews as a page view since it's an indication
>>>>>>> of user engagement? Or should there be a separate metric for link
>>>>>>> previews?
>>>>>>>
>>>>>>> Counting page views
>>>>>>>
>>>>>>> IIRC we currently count action=mobileview&sections=0 query parameters
>>>>>>> of api.php as a page view.
>>>>>>> When we publish link previews for all Android app users then we would
>>>>>>> either want to count also the calls to action=query&prop=extracts as a
>>>>>>> page view or add them to another metric.
>>>>>>>
>>>>>>> Once the apps use RESTBase the HTTPS requests will be very different:
>>>>>>>
>>>>>>> - Page view: Instead of action=mobileview&sections=0 on the PHP API
>>>>>>>   mentioned above, the app would call the RESTBase endpoint for the
>>>>>>>   lead request[1]. Then it would call [2].
>>>>>>> - Link preview: Instead of action=query&prop=extracts it would call the
>>>>>>>   lead request[1], too, since there is a lot of overlap. At least
>>>>>>>   that's our current plan. The advantage of that is that the client
>>>>>>>   doesn't need to execute the lead request a second time if the user
>>>>>>>   clicks on the link preview (either through caching or app logic).
>>>>>>>
>>>>>>> So, in the RESTBase case we either want to count the
>>>>>>> mobile-html-sections-lead requests or the
>>>>>>> mobile-html-sections-remaining requests depending on what our
>>>>>>> definition for page views actually is. We could also add a query
>>>>>>> parameter or extra HTTP header to one of the mobile-html-sections-lead
>>>>>>> requests if we need to distinguish between previews and page views.
>>>>>>>
>>>>>>> Both the current PHP API and the RESTBase based metrics would need to
>>>>>>> be compatible and be collected in parallel since we cannot control
>>>>>>> when users update their apps.
>>>>>>>
>>>>>>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>>>>>>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>>>>>>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>>>>>>> [4] https://phabricator.wikimedia.org/T109383
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Bernd
>>>>>>
>>>>>> --
>>>>>> Oliver Keyes
>>>>>> Count Logula
>>>>>> Wikimedia Foundation
>>>>
>>>> --
>>>> Oliver Keyes
>>>> Count Logula
>>>> Wikimedia Foundation
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
