Sounds sensible enough, as long as we're sure they're only being passed
through for "real" pageviews right now. (I can't imagine a situation
in which they wouldn't be, but someone should run that query.)
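
To make the fallback idea quoted below concrete, here is a minimal sketch
of "use the tag if it exists, otherwise use the rest of the logic". It is
not the real pageview definition. The function names, the `;`-separated
key=value layout of X-Analytics, and the `legacy_check` callable standing
in for the existing pattern-based rules are all assumptions for
illustration.

```python
from typing import Callable, Dict


def parse_x_analytics(value: str) -> Dict[str, str]:
    """Parse 'k1=v1;k2=v2' into a dict (the ';' separator is an assumption)."""
    fields: Dict[str, str] = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep and key:
            fields[key] = val
    return fields


def is_pageview(headers: Dict[str, str], legacy_check: Callable[[], bool]) -> bool:
    """Trust an explicit page_id / page_title tag; otherwise fall back.

    `headers` is a plain dict here, so the lookup is case-sensitive;
    `legacy_check` is a hypothetical hook for the pattern-based rules.
    """
    tags = parse_x_analytics(headers.get("X-Analytics", ""))
    if tags.get("page_id") or tags.get("page_title"):
        return True
    return legacy_check()
```

Whether the tag can be trusted on its own is exactly the query mentioned
above: it only works if nothing other than "real" pageviews sets it today.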

On 19 August 2015 at 12:27, Andrew Otto <[email protected]> wrote:
> Ya, we can probably tweak pageview definition to use page_id / page_title if 
> they exist, and only use the rest of the logic if they don’t.
>
>
>> On Aug 19, 2015, at 12:24, Oliver Keyes <[email protected]> wrote:
>>
>> It'll need to be; some requests don't know pageID in advance, which I
>> think was the reason Apps initially didn't implement this.
>>
>> On 19 August 2015 at 12:19, Andrew Otto <[email protected]> wrote:
>>> If your app/site/etc. is creating a request that it wants to count as a
>>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>>> pageview_title=<page_title>
>>>
>>>
>>> page_id is the current key, so let’s keep that.  page_title would be good to
>>> have too.  Let’s make it an and/or.
>>>
>>>
>>> On Aug 19, 2015, at 12:17, Bernd Sitzmann <[email protected]> wrote:
>>>
>>>> If your app/site/etc. is creating a request that it wants to count as a
>>>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>>>> pageview_title=<page_title>
>>>
>>>
>>> Ideally the page id would be the way to go. From a client's perspective I
>>> prefer the page title since clients don't always know the page id ahead of
>>> time. (We could put that header into the second request of loading the page
>>> but I cannot guarantee that we will always have a second request in the
>>> future.)
>>>
>>> --Cheers,
>>> Bernd
>>>
>>> On Wed, Aug 19, 2015 at 8:53 AM, Dan Andreescu <[email protected]>
>>> wrote:
>>>>
>>>> This (making pageviews proactive) is a great idea, and we should follow
>>>> through.  Here's a simple start:
>>>>
>>>> If your app/site/etc. is creating a request that it wants to count as a
>>>> pageview, add an X-Analytics header with pageview_id=<page_id> or
>>>> pageview_title=<page_title>
>>>>
>>>> If we can make this change uniformly, I think we'd be in a very good
>>>> place.
>>>>
>>>> On Wed, Aug 19, 2015 at 10:23 AM, Oliver Keyes <[email protected]>
>>>> wrote:
>>>>>
>>>>> On 19 August 2015 at 10:19, Andrew Otto <[email protected]> wrote:
>>>>>>> If we /do/ include RESTBase requests we will not only have to
>>>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>>>> scheme
>>>>>>
>>>>>> I really think that apps and APIs should do something proactive to tag
>>>>>> or log a pageview.  With more ways of viewing content, it is going to get
>>>>>> harder and harder to maintain a pattern based definition.  A pageview
>>>>>> should be an event that is logged, not something that is pattern matched
>>>>>> out of a very noisy stream of data.
>>>>>>
>>>>>> Most MediaWiki requests do this now, via the page_id field in the
>>>>>> X-Analytics header, but we can’t use this for all pageviews because APIs
>>>>>> are more complicated (e.g. more than one page can be served in a single
>>>>>> request, etc.).  In the long term, there should be a pageview event stream
>>>>>> just like rcstream! :)
>>>>>
>>>>> This is an excellent point. IIRC we'd been asking Apps to do this for
>>>>> kind of a while, so...
>>>>>
>>>>>>
>>>>>> -Ao
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Aug 18, 2015, at 19:58, Oliver Keyes <[email protected]> wrote:
>>>>>>>
>>>>>>> On 18 August 2015 at 19:11, Bernd Sitzmann <[email protected]>
>>>>>>> wrote:
>>>>>>>> This discussion is about needed updates of the definition and Analytics
>>>>>>>> implementation for mobile apps page view metrics. There is also an
>>>>>>>> associated Phab task[4]. Please add the proper Analytics project there.
>>>>>>>>
>>>>>>>> Background / Changes
>>>>>>>>
>>>>>>>> As you probably remember, the Android app splits a page view into two
>>>>>>>> requests: one for the lead section and metadata, plus another one for
>>>>>>>> the remainder.
>>>>>>>>
>>>>>>>> The mobile apps are going to change the way they load pages in two
>>>>>>>> different ways:
>>>>>>>>
>>>>>>>> - We'll add a link preview when someone clicks on a link from a page.
>>>>>>>> - We're planning on switching over to using RESTBase for loading pages
>>>>>>>>   and also the link preview (initially just the Android beta, later
>>>>>>>>   more).
>>>>>>>>
>>>>>>>
>>>>>>> Woah woah woah woah woah. By RESTBase do you mean Gabriel's RESTful
>>>>>>> service API?
>>>>>>>
>>>>>>> Last time I checked that wasn't even consumed by HDFS. Is it now being
>>>>>>> consumed by HDFS?
>>>>>>>
>>>>>>> More importantly the actual URLs are going to look /totally/
>>>>>>> different. If we do not include RESTBase requests, we will miss the
>>>>>>> apps. If we /do/ include RESTBase requests we will not only have to
>>>>>>> rewrite the pageview definition for the apps to recognise the new URL
>>>>>>> scheme, we will also potentially have to rewrite every /other/ bit of
>>>>>>> the definition to /not/ incorporate those requests.
>>>>>>>
>>>>>>> (I use "we" in a collective sense. This isn't my baby any more,
>>>>>>> although if Joseph et al want help with the refactor here I'm happy to
>>>>>>> spend my volunteer time on it).
>>>>>>>
>>>>>>> But basically every other bit of your email is important but now
>>>>>>> secondary: this is a potentially massive change, all on its own, even
>>>>>>> without the link preview, even if the substance of the requests going
>>>>>>> to RESTBase were identical.
>>>>>>>
>>>>>>>> This will have implications for the pageviews definition and how we
>>>>>>>> count user engagement.
>>>>>>>>
>>>>>>>> The big question is
>>>>>>>>
>>>>>>>> Should we count link previews as a page view since it's an indication
>>>>>>>> of user engagement? Or should there be a separate metric for link
>>>>>>>> previews?
>>>>>>>>
>>>>>>>> Counting page views
>>>>>>>>
>>>>>>>> IIRC we currently count action=mobileview&sections=0 query parameters
>>>>>>>> of api.php as a page view. When we publish link previews for all Android
>>>>>>>> app users then we would either want to count also the calls to
>>>>>>>> action=query&prop=extracts as a page view or add them to another
>>>>>>>> metric.
>>>>>>>>
>>>>>>>> Once the apps use RESTBase the HTTPS requests will be very different:
>>>>>>>>
>>>>>>>> - Page view: Instead of action=mobileview&sections=0 on the PHP API
>>>>>>>>   mentioned above, the app would call the RESTBase endpoint for the
>>>>>>>>   lead request[1]. Then it would call [2].
>>>>>>>> - Link preview: Instead of action=query&prop=extracts it would call the
>>>>>>>>   lead request[1], too, since there is a lot of overlap. At least that's
>>>>>>>>   our current plan. The advantage of that is that the client doesn't
>>>>>>>>   need to execute the lead request a second time if the user clicks on
>>>>>>>>   the link preview (either through caching or app logic).
>>>>>>>>
>>>>>>>> So, in the RESTBase case we either want to count the
>>>>>>>> mobile-html-sections-lead requests or the mobile-html-sections-remaining
>>>>>>>> requests depending on what our definition for page views actually is. We
>>>>>>>> could also add a query parameter or extra HTTP header to one of the
>>>>>>>> mobile-html-sections-lead requests if we need to distinguish between
>>>>>>>> previews and page views.
>>>>>>>>
>>>>>>>> Both the current PHP API and the RESTBase based metrics would need to
>>>>>>>> be compatible and be collected in parallel since we cannot control when
>>>>>>>> users update their apps.
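
For illustration only, a rough sketch of what matching both URL schemes in
parallel could look like. It assumes the lead request is the one that gets
counted, which the thread leaves open. The function name and the api.php
path check are made up for this example; the endpoint path and query
parameters are the ones quoted above.

```python
from urllib.parse import parse_qs, urlsplit

# Path prefix taken from reference [1] in the quoted message.
REST_LEAD_PREFIX = "/api/rest_v1/page/mobile-html-sections-lead/"


def is_app_pageview_url(url: str) -> bool:
    """Pattern-match an app request under either the old or the new scheme."""
    parts = urlsplit(url)

    # New scheme: count the RESTBase lead request (assumption, see lead-in).
    if parts.path.startswith(REST_LEAD_PREFIX) and len(parts.path) > len(REST_LEAD_PREFIX):
        return True

    # Old scheme: api.php with action=mobileview&sections=0.
    if parts.path.endswith("/api.php"):
        query = parse_qs(parts.query)
        return query.get("action") == ["mobileview"] and query.get("sections") == ["0"]

    return False
```

Note that this cannot tell a link preview from a page view once both hit
the lead endpoint, which is why an extra header or query parameter comes up
in the first place.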
>>>>>>>>
>>>>>>>> [1] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-lead/Dilbert
>>>>>>>> [2] https://en.wikipedia.org/api/rest_v1/page/mobile-html-sections-remaining/Dilbert
>>>>>>>> [3] https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/RESTBase_services_for_apps
>>>>>>>> [4] https://phabricator.wikimedia.org/T109383
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Bernd
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Analytics mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Oliver Keyes
>>>>>>> Count Logula
>>>>>>> Wikimedia Foundation
>>>>>>>



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation
