Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-15 Thread Diederik van Liere
Thanks Asher for tying this up! I was about to write a similar email :)
One final question, just to make sure we are all on the same page: is the
X-CS field becoming a generic key/value pair for tracking purposes?

D


On Fri, Feb 15, 2013 at 11:16 AM, Asher Feldman wrote:

> Just to tie this thread up - the issue of how to count ajax driven
> pageviews loaded from the api and of how to differentiate those requests
> from secondary api page requests has been resolved without the need for
> code or logging changes.
>
> Tagging of the mobile beta site will be accomplished via a new generic
> mediawiki http response header dedicated to logging containing key value
> pairs.
>
> -Asher
>
> On Tue, Feb 12, 2013 at 9:56 AM, Asher Feldman wrote:
>
> > On Tuesday, February 12, 2013, Diederik van Liere wrote:
> >
> >> > It does still seem to me that the data to determine secondary api
> >> requests
> >> > should already be present in the existing log line. If the value of
> the
> >> > page param in an action=mobileview api request matches the page in the
> >> > referrer (perhaps with normalization), it's a secondary request as per
> >> case
> >> > 1 below.  Otherwise, it's a pageview as per case 2.  Difficult or
> >> expensive
> >> > to reconcile?  Not when you're doing distributed log analysis via
> >> hadoop.
> >> >
> >> So I did look into this prior to writing the RFC and the issue is that a
> >> lot of API referrers don't contain the querystring. I don't know what
> >> triggers this so if we can fix this then we can definitely derive the
> >> secondary pageview request from the referrer field.
> >> D
> >
> >
> > If you can point me to some examples, I'll see if I can find any insights
> > into the behavior.
> >
> >
> >>
> >> > On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <
> >> aricha...@wikimedia.org
> >> > >wrote:
> >> >
> >> > > Thanks, Jon. To try and clarify a bit more about the API requests...
> >> they
> >> > > are not made on a per-section basis. As I mentioned earlier, there
> are
> >> > two
> >> > > cases in which article content gets loaded by the API:
> >> > >
> >> > > 1) Going directly to a page (eg clicking a link from a Google
> search)
> >> > will
> >> > > result in the backend serving a page with ONLY summary section
> content
> >> > and
> >> > > section headers. The rest of the page is lazily loaded via API
> request
> >> > once
> >> > > the JS for the page gets loaded. The idea is to increase
> >> responsiveness
> >> > by
> >> > > reducing the delay for an article to load (further details in the
> >> article
> >> > > Jon previously linked to). The API request looks like:
> >> > >
> >> > >
> >> >
> >>
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >> > >
> >> > > 2) Loading an article entirely via Javascript - like when a link is
> >> > clicked
> >> > > in an article to another article, or an article is loaded via
> search.
> >> > This
> >> > > will make ONE call to the API to load article content. API request
> >> looks
> >> > > like:
> >> > >
> >> > >
> >> >
> >>
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >> > >
> >> > > These API requests are identical, but only #2 should be counted as a
> >> > > 'pageview' - #1 is a secondary API request and should not be counted
> >> as a
> >> > > 'pageview'. You could make the argument that we just count all of
> >> these
> >> > API
> >> > > requests as pageviews, but there are cases when we can't load
> article
> >> > > content from the API (like devices that do not support JS), so we
> >> need to
> >> > > be able to count the traditional page request as a pageview - thus
> we
> >> > need
> >> > > a way to differentiate the types of API requests being made when
> they
> >> > > otherwise share the same URL.
> >> > >
> >> > >
> >> > >
> >> > > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson 
> >> wrote:
> >> > >
> >> > > > I'm a bit worried that now we are asking why pages are lazy loaded
> >> > > > rather than focusing on the fact that they currently __are doing
> >> > > > this___ and how we can log these (if we want to discuss this
> further
> >> > > > let's start another thread as I'm getting extremely confused doing
> >> so
> >> > > > on this one).
> >> > > >
> >> > > > Lazy loading sections
> >> > > > 
> >> > > > For motivation behind moving MobileFrontend into the direction of
> >> lazy
> >> > > > loading section content and subsequent pages can be found here
> [1],
> >> I
> >> > > > just gave it a refresh as it was a little out of date.
> >> > > >
> >> > > > In summary the reason is to
> >> > > > 1) make the app feel more responsive by simply loading content
> >> rather
> >> > > > than reloading the entire

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-15 Thread Asher Feldman
Just to tie this thread up - the issue of how to count ajax driven
pageviews loaded from the api and of how to differentiate those requests
from secondary api page requests has been resolved without the need for
code or logging changes.

Tagging of the mobile beta site will be accomplished via a new generic
mediawiki http response header dedicated to logging containing key value
pairs.
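
For anyone consuming this on the log side, here is a minimal sketch of
parsing such a key/value header value. The thread does not specify the
header name or the wire format, so the `;` and `=` delimiters and the
example keys below are assumptions made only for illustration.

```python
def parse_logging_header(value):
    """Parse a 'k1=v1;k2=v2' string into a dict.

    The ';' pair separator and '=' key/value separator are assumed;
    the thread does not define the actual format.
    """
    pairs = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # ignore empty segments such as a trailing ';'
        key, sep, val = item.partition("=")
        if not sep:
            continue  # skip malformed items that have no '='
        pairs[key.strip()] = val.strip()
    return pairs


# Hypothetical header value tagging a request from the beta site.
print(parse_logging_header("site=beta; foo=bar"))
```

Keeping the parser tolerant of malformed items means one bad pair does
not discard the rest of the log line.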

-Asher

On Tue, Feb 12, 2013 at 9:56 AM, Asher Feldman wrote:

> On Tuesday, February 12, 2013, Diederik van Liere wrote:
>
>> > It does still seem to me that the data to determine secondary api
>> requests
>> > should already be present in the existing log line. If the value of the
>> > page param in an action=mobileview api request matches the page in the
>> > referrer (perhaps with normalization), it's a secondary request as per
>> case
>> > 1 below.  Otherwise, it's a pageview as per case 2.  Difficult or
>> expensive
>> > to reconcile?  Not when you're doing distributed log analysis via
>> hadoop.
>> >
>> So I did look into this prior to writing the RFC and the issue is that a
>> lot of API referrers don't contain the querystring. I don't know what
>> triggers this so if we can fix this then we can definitely derive the
>> secondary pageview request from the referrer field.
>> D
>
>
> If you can point me to some examples, I'll see if I can find any insights
> into the behavior.
>
>
>>
>> > On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <
>> aricha...@wikimedia.org
>> > >wrote:
>> >
>> > > Thanks, Jon. To try and clarify a bit more about the API requests...
>> they
>> > > are not made on a per-section basis. As I mentioned earlier, there are
>> > two
>> > > cases in which article content gets loaded by the API:
>> > >
>> > > 1) Going directly to a page (eg clicking a link from a Google search)
>> > will
>> > > result in the backend serving a page with ONLY summary section content
>> > and
>> > > section headers. The rest of the page is lazily loaded via API request
>> > once
>> > > the JS for the page gets loaded. The idea is to increase
>> responsiveness
>> > by
>> > > reducing the delay for an article to load (further details in the
>> article
>> > > Jon previously linked to). The API request looks like:
>> > >
>> > >
>> >
>> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
>> > >
>> > > 2) Loading an article entirely via Javascript - like when a link is
>> > clicked
>> > > in an article to another article, or an article is loaded via search.
>> > This
>> > > will make ONE call to the API to load article content. API request
>> looks
>> > > like:
>> > >
>> > >
>> >
>> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
>> > >
>> > > These API requests are identical, but only #2 should be counted as a
>> > > 'pageview' - #1 is a secondary API request and should not be counted
>> as a
>> > > 'pageview'. You could make the argument that we just count all of
>> these
>> > API
>> > > requests as pageviews, but there are cases when we can't load article
>> > > content from the API (like devices that do not support JS), so we
>> need to
>> > > be able to count the traditional page request as a pageview - thus we
>> > need
>> > > a way to differentiate the types of API requests being made when they
>> > > otherwise share the same URL.
>> > >
>> > >
>> > >
>> > > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson 
>> wrote:
>> > >
>> > > > I'm a bit worried that now we are asking why pages are lazy loaded
>> > > > rather than focusing on the fact that they currently __are doing
>> > > > this___ and how we can log these (if we want to discuss this further
>> > > > let's start another thread as I'm getting extremely confused doing
>> so
>> > > > on this one).
>> > > >
>> > > > Lazy loading sections
>> > > > 
>> > > > For motivation behind moving MobileFrontend into the direction of
>> lazy
>> > > > loading section content and subsequent pages can be found here [1],
>> I
>> > > > just gave it a refresh as it was a little out of date.
>> > > >
>> > > > In summary the reason is to
>> > > > 1) make the app feel more responsive by simply loading content
>> rather
>> > > > than reloading the entire interface
>> > > > 2) reducing the payload sent to a device.
>> > > >
>> > > > Session Tracking
>> > > > 
>> > > >
>> > > > Going back to the discussion of tracking mobile page views, it
>> sounds
>> > > > like a header stating whether a page is being viewed in alpha, beta
>> or
>> > > > stable works fine for standard page views.
>> > > >
>> > > > As for the situations where an entire page is loaded via the api it
>> > > > makes no dif
>
>

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-12 Thread Asher Feldman
On Tuesday, February 12, 2013, Diederik van Liere wrote:

> > It does still seem to me that the data to determine secondary api
> requests
> > should already be present in the existing log line. If the value of the
> > page param in an action=mobileview api request matches the page in the
> > referrer (perhaps with normalization), it's a secondary request as per
> case
> > 1 below.  Otherwise, it's a pageview as per case 2.  Difficult or
> expensive
> > to reconcile?  Not when you're doing distributed log analysis via hadoop.
> >
> So I did look into this prior to writing the RFC and the issue is that a
> lot of API referrers don't contain the querystring. I don't know what
> triggers this so if we can fix this then we can definitely derive the
> secondary pageview request from the referrer field.
> D


If you can point me to some examples, I'll see if I can find any insights
into the behavior.


>
> > On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <
> aricha...@wikimedia.org
> > >wrote:
> >
> > > Thanks, Jon. To try and clarify a bit more about the API requests...
> they
> > > are not made on a per-section basis. As I mentioned earlier, there are
> > two
> > > cases in which article content gets loaded by the API:
> > >
> > > 1) Going directly to a page (eg clicking a link from a Google search)
> > will
> > > result in the backend serving a page with ONLY summary section content
> > and
> > > section headers. The rest of the page is lazily loaded via API request
> > once
> > > the JS for the page gets loaded. The idea is to increase responsiveness
> > by
> > > reducing the delay for an article to load (further details in the
> article
> > > Jon previously linked to). The API request looks like:
> > >
> > >
> >
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> > >
> > > 2) Loading an article entirely via Javascript - like when a link is
> > clicked
> > > in an article to another article, or an article is loaded via search.
> > This
> > > will make ONE call to the API to load article content. API request
> looks
> > > like:
> > >
> > >
> >
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> > >
> > > These API requests are identical, but only #2 should be counted as a
> > > 'pageview' - #1 is a secondary API request and should not be counted
> as a
> > > 'pageview'. You could make the argument that we just count all of these
> > API
> > > requests as pageviews, but there are cases when we can't load article
> > > content from the API (like devices that do not support JS), so we need
> to
> > > be able to count the traditional page request as a pageview - thus we
> > need
> > > a way to differentiate the types of API requests being made when they
> > > otherwise share the same URL.
> > >
> > >
> > >
> > > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson 
> wrote:
> > >
> > > > I'm a bit worried that now we are asking why pages are lazy loaded
> > > > rather than focusing on the fact that they currently __are doing
> > > > this___ and how we can log these (if we want to discuss this further
> > > > let's start another thread as I'm getting extremely confused doing so
> > > > on this one).
> > > >
> > > > Lazy loading sections
> > > > 
> > > > For motivation behind moving MobileFrontend into the direction of
> lazy
> > > > loading section content and subsequent pages can be found here [1], I
> > > > just gave it a refresh as it was a little out of date.
> > > >
> > > > In summary the reason is to
> > > > 1) make the app feel more responsive by simply loading content rather
> > > > than reloading the entire interface
> > > > 2) reducing the payload sent to a device.
> > > >
> > > > Session Tracking
> > > > 
> > > >
> > > > Going back to the discussion of tracking mobile page views, it sounds
> > > > like a header stating whether a page is being viewed in alpha, beta
> or
> > > > stable works fine for standard page views.
> > > >
> > > > As for the situations where an entire page is loaded via the api it
> > > > makes no dif
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-12 Thread Diederik van Liere
> It does still seem to me that the data to determine secondary api requests
> should already be present in the existing log line. If the value of the
> page param in an action=mobileview api request matches the page in the
> referrer (perhaps with normalization), it's a secondary request as per case
> 1 below.  Otherwise, it's a pageview as per case 2.  Difficult or expensive
> to reconcile?  Not when you're doing distributed log analysis via hadoop.
>
So I did look into this prior to writing the RFC and the issue is that a
lot of API referrers don't contain the querystring. I don't know what
triggers this so if we can fix this then we can definitely derive the
secondary pageview request from the referrer field.
D



> On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards wrote:
>
> > Thanks, Jon. To try and clarify a bit more about the API requests... they
> > are not made on a per-section basis. As I mentioned earlier, there are
> two
> > cases in which article content gets loaded by the API:
> >
> > 1) Going directly to a page (eg clicking a link from a Google search)
> will
> > result in the backend serving a page with ONLY summary section content
> and
> > section headers. The rest of the page is lazily loaded via API request
> once
> > the JS for the page gets loaded. The idea is to increase responsiveness
> by
> > reducing the delay for an article to load (further details in the article
> > Jon previously linked to). The API request looks like:
> >
> >
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > 2) Loading an article entirely via Javascript - like when a link is
> clicked
> > in an article to another article, or an article is loaded via search.
> This
> > will make ONE call to the API to load article content. API request looks
> > like:
> >
> >
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > These API requests are identical, but only #2 should be counted as a
> > 'pageview' - #1 is a secondary API request and should not be counted as a
> > 'pageview'. You could make the argument that we just count all of these
> API
> > requests as pageviews, but there are cases when we can't load article
> > content from the API (like devices that do not support JS), so we need to
> > be able to count the traditional page request as a pageview - thus we
> need
> > a way to differentiate the types of API requests being made when they
> > otherwise share the same URL.
> >
> >
> >
> > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson  wrote:
> >
> > > I'm a bit worried that now we are asking why pages are lazy loaded
> > > rather than focusing on the fact that they currently __are doing
> > > this___ and how we can log these (if we want to discuss this further
> > > let's start another thread as I'm getting extremely confused doing so
> > > on this one).
> > >
> > > Lazy loading sections
> > > 
> > > For motivation behind moving MobileFrontend into the direction of lazy
> > > loading section content and subsequent pages can be found here [1], I
> > > just gave it a refresh as it was a little out of date.
> > >
> > > In summary the reason is to
> > > 1) make the app feel more responsive by simply loading content rather
> > > than reloading the entire interface
> > > 2) reducing the payload sent to a device.
> > >
> > > Session Tracking
> > > 
> > >
> > > Going back to the discussion of tracking mobile page views, it sounds
> > > like a header stating whether a page is being viewed in alpha, beta or
> > > stable works fine for standard page views.
> > >
> > > As for the situations where an entire page is loaded via the api it
> > > makes no difference to us to whether we
> > > 1) send the same header (set via javascript) or
> > > 2) add a query string parameter.
> > >
> > > The only advantage I can see of using a header is that an initial page
> > > load of the article San Francisco currently uses the same api url as a
> > > page load of the article San Francisco via javascript (e.g. I click a
> > > link to 'San Francisco' on the California article).
> > >
> > > In this new method they would use different urls (as the data sent is
> > > different). I'm not sure how that would effect caching.
> > >
> > > Let us know which method is preferred. From my perspective
> > > implementation of either is easy.
> > >
> > > [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
> > >
> > > On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman <
> afeld...@wikimedia.org>
> > > wrote:
> > > > Max - good answers re: caching concerns.  That leaves studying if the
> > > bytes
> > > > transferred on average mobile article view increases or decreases
> with
> > > lazy
> > > 

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Asher Feldman
Thanks for the clarification Arthur, that clears up some misconceptions I
had.  I saw a demo around the allstaff where individual sections were lazy
loaded, so I think I had that in my head.

It does still seem to me that the data to determine secondary api requests
should already be present in the existing log line. If the value of the
page param in an action=mobileview api request matches the page in the
referrer (perhaps with normalization), it's a secondary request as per case
1 below.  Otherwise, it's a pageview as per case 2.  Difficult or expensive
to reconcile?  Not when you're doing distributed log analysis via hadoop.
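
To make the idea concrete, here is a rough sketch of the comparison for
a single log line. The function names, the assumption that the referrer
is either a /wiki/Title path or carries a title= parameter, and the
very simple normalization (underscores to spaces) are all mine and are
not taken from any existing log-processing code.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def normalize_title(title):
    # Assumed normalization: underscores and spaces are equivalent.
    return title.replace("_", " ").strip()


def referrer_title(referrer):
    """Return the page title named by a referrer URL, or None."""
    parts = urlsplit(referrer)
    query = parse_qs(parts.query)
    if "title" in query:
        return normalize_title(query["title"][0]) or None
    if parts.path.startswith("/wiki/"):
        return normalize_title(unquote(parts.path[len("/wiki/"):])) or None
    return None


def classify_mobileview(request_url, referrer):
    """Classify an API request as 'secondary', 'pageview', 'unknown' or 'other'."""
    query = parse_qs(urlsplit(request_url).query)
    if query.get("action", [""])[0] != "mobileview":
        return "other"
    page = query.get("page", [None])[0]
    if page is None:
        return "other"
    ref = referrer_title(referrer) if referrer else None
    if ref is None:
        return "unknown"  # no title recoverable from the referrer
    # Same page in referrer and request: lazy load of the rest (case 1).
    return "secondary" if normalize_title(page) == ref else "pageview"
```

The 'unknown' bucket matters: this only works to the extent that the
referrer actually carries a recoverable title.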

On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards wrote:

> Thanks, Jon. To try and clarify a bit more about the API requests... they
> are not made on a per-section basis. As I mentioned earlier, there are two
> cases in which article content gets loaded by the API:
>
> 1) Going directly to a page (eg clicking a link from a Google search) will
> result in the backend serving a page with ONLY summary section content and
> section headers. The rest of the page is lazily loaded via API request once
> the JS for the page gets loaded. The idea is to increase responsiveness by
> reducing the delay for an article to load (further details in the article
> Jon previously linked to). The API request looks like:
>
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
>
> 2) Loading an article entirely via Javascript - like when a link is clicked
> in an article to another article, or an article is loaded via search. This
> will make ONE call to the API to load article content. API request looks
> like:
>
> http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
>
> These API requests are identical, but only #2 should be counted as a
> 'pageview' - #1 is a secondary API request and should not be counted as a
> 'pageview'. You could make the argument that we just count all of these API
> requests as pageviews, but there are cases when we can't load article
> content from the API (like devices that do not support JS), so we need to
> be able to count the traditional page request as a pageview - thus we need
> a way to differentiate the types of API requests being made when they
> otherwise share the same URL.
>
>
>
> On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson  wrote:
>
> > I'm a bit worried that now we are asking why pages are lazy loaded
> > rather than focusing on the fact that they currently __are doing
> > this___ and how we can log these (if we want to discuss this further
> > let's start another thread as I'm getting extremely confused doing so
> > on this one).
> >
> > Lazy loading sections
> > 
> > For motivation behind moving MobileFrontend into the direction of lazy
> > loading section content and subsequent pages can be found here [1], I
> > just gave it a refresh as it was a little out of date.
> >
> > In summary the reason is to
> > 1) make the app feel more responsive by simply loading content rather
> > than reloading the entire interface
> > 2) reducing the payload sent to a device.
> >
> > Session Tracking
> > 
> >
> > Going back to the discussion of tracking mobile page views, it sounds
> > like a header stating whether a page is being viewed in alpha, beta or
> > stable works fine for standard page views.
> >
> > As for the situations where an entire page is loaded via the api it
> > makes no difference to us to whether we
> > 1) send the same header (set via javascript) or
> > 2) add a query string parameter.
> >
> > The only advantage I can see of using a header is that an initial page
> > load of the article San Francisco currently uses the same api url as a
> > page load of the article San Francisco via javascript (e.g. I click a
> > link to 'San Francisco' on the California article).
> >
> > In this new method they would use different urls (as the data sent is
> > different). I'm not sure how that would effect caching.
> >
> > Let us know which method is preferred. From my perspective
> > implementation of either is easy.
> >
> > [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
> >
> > On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman 
> > wrote:
> > > Max - good answers re: caching concerns.  That leaves studying if the
> > bytes
> > > transferred on average mobile article view increases or decreases with
> > lazy
> > > section loading.  If it increases, I'd say this isn't a positive
> > direction
> > > to go in and stop there.  If it decreases, then we should look at the
> > > effect on total latency, number of requests required per pageview, and
> > the
> > > impact on backend apache utilization which I'd expect to be > 0

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Arthur Richards
Thanks, Jon. To try and clarify a bit more about the API requests... they
are not made on a per-section basis. As I mentioned earlier, there are two
cases in which article content gets loaded by the API:

1) Going directly to a page (eg clicking a link from a Google search) will
result in the backend serving a page with ONLY summary section content and
section headers. The rest of the page is lazily loaded via API request once
the JS for the page gets loaded. The idea is to increase responsiveness by
reducing the delay for an article to load (further details in the article
Jon previously linked to). The API request looks like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all

2) Loading an article entirely via Javascript - like when a link is clicked
in an article to another article, or an article is loaded via search. This
will make ONE call to the API to load article content. API request looks
like:
http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all

These API requests are identical, but only #2 should be counted as a
'pageview' - #1 is a secondary API request and should not be counted as a
'pageview'. You could make the argument that we just count all of these API
requests as pageviews, but there are cases when we can't load article
content from the API (like devices that do not support JS), so we need to
be able to count the traditional page request as a pageview - thus we need
a way to differentiate the types of API requests being made when they
otherwise share the same URL.



On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson  wrote:

> I'm a bit worried that now we are asking why pages are lazy loaded
> rather than focusing on the fact that they currently __are doing
> this___ and how we can log these (if we want to discuss this further
> let's start another thread as I'm getting extremely confused doing so
> on this one).
>
> Lazy loading sections
> 
> For motivation behind moving MobileFrontend into the direction of lazy
> loading section content and subsequent pages can be found here [1], I
> just gave it a refresh as it was a little out of date.
>
> In summary the reason is to
> 1) make the app feel more responsive by simply loading content rather
> than reloading the entire interface
> 2) reducing the payload sent to a device.
>
> Session Tracking
> 
>
> Going back to the discussion of tracking mobile page views, it sounds
> like a header stating whether a page is being viewed in alpha, beta or
> stable works fine for standard page views.
>
> As for the situations where an entire page is loaded via the api it
> makes no difference to us to whether we
> 1) send the same header (set via javascript) or
> 2) add a query string parameter.
>
> The only advantage I can see of using a header is that an initial page
> load of the article San Francisco currently uses the same api url as a
> page load of the article San Francisco via javascript (e.g. I click a
> link to 'San Francisco' on the California article).
>
> In this new method they would use different urls (as the data sent is
> different). I'm not sure how that would effect caching.
>
> Let us know which method is preferred. From my perspective
> implementation of either is easy.
>
> [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
>
> On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman 
> wrote:
> > Max - good answers re: caching concerns.  That leaves studying if the
> bytes
> > transferred on average mobile article view increases or decreases with
> lazy
> > section loading.  If it increases, I'd say this isn't a positive
> direction
> > to go in and stop there.  If it decreases, then we should look at the
> > effect on total latency, number of requests required per pageview, and
> the
> > impact on backend apache utilization which I'd expect to be > 0.
> >
> > Does the mobile team have specific goals that this project aims to
> > accomplish?  If so, we can use those as the measure against which to
> > compare an impact analysis.
> >
> > On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik 
> wrote:
> >
> >> On 11.02.2013, 22:11 Asher wrote:
> >>
> >> > And then I'd wonder about the server side implementation. How will
> >> frontend
> >> > cache invalidation work? Are we going to need to purge every
> individual
> >> > article section relative to /w/api.php on edit?
> >>
> >> Since the API doesn't require pretty URLs, we could simply append the
> >> current revision ID to the mobileview URLs.
> >>
> >> > Article HTML in memcached
> >> > (parser cache), mobile processed HTML in memcached.. Now individual
> >> > sections in memcached? If so, should we calculate memcached space
> needs
> >> for
> >> > arti

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Jon Robson
I'm a bit worried that now we are asking why pages are lazy loaded
rather than focusing on the fact that they currently __are doing
this__ and how we can log these (if we want to discuss this further
let's start another thread as I'm getting extremely confused doing so
on this one).

Lazy loading sections

The motivation behind moving MobileFrontend in the direction of lazy
loading section content and subsequent pages can be found here [1]; I
just gave it a refresh as it was a little out of date.

In summary the reason is to
1) make the app feel more responsive by simply loading content rather
than reloading the entire interface
2) reduce the payload sent to a device.

Session Tracking


Going back to the discussion of tracking mobile page views, it sounds
like a header stating whether a page is being viewed in alpha, beta or
stable works fine for standard page views.

As for the situations where an entire page is loaded via the api, it
makes no difference to us whether we
1) send the same header (set via javascript) or
2) add a query string parameter.

The only advantage I can see of using a header is that an initial page
load of the article San Francisco currently uses the same api url as a
page load of the article San Francisco via javascript (e.g. I click a
link to 'San Francisco' on the California article).

In this new method they would use different urls (as the data sent is
different). I'm not sure how that would affect caching.
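
As a small illustration of the url difference, here is a sketch that
appends a marker parameter to a mobileview url. The parameter name
`pageview` is purely hypothetical, and the remark about caching assumes
a cache that keys on the full url including the query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url, key, value):
    """Return url with key=value appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


base = "http://en.m.wikipedia.org/w/api.php?action=mobileview&page=San+Francisco"
tagged = with_param(base, "pageview", "1")  # 'pageview' is a made-up name

# Under a url-keyed cache these two would be stored as separate objects.
print(tagged != base)
```

With a header instead, the url stays identical, so whether the two
requests share a cache entry would depend on how the cache treats that
header.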

Let us know which method is preferred. From my perspective
implementation of either is easy.

[1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections

On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman  wrote:
> Max - good answers re: caching concerns.  That leaves studying if the bytes
> transferred on average mobile article view increases or decreases with lazy
> section loading.  If it increases, I'd say this isn't a positive direction
> to go in and stop there.  If it decreases, then we should look at the
> effect on total latency, number of requests required per pageview, and the
> impact on backend apache utilization which I'd expect to be > 0.
>
> Does the mobile team have specific goals that this project aims to
> accomplish?  If so, we can use those as the measure against which to
> compare an impact analysis.
>
> On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik  wrote:
>
>> On 11.02.2013, 22:11 Asher wrote:
>>
>> > And then I'd wonder about the server side implementation. How will frontend
>> > cache invalidation work? Are we going to need to purge every individual
>> > article section relative to /w/api.php on edit?
>>
>> Since the API doesn't require pretty URLs, we could simply append the
>> current revision ID to the mobileview URLs.
>>
>> > Article HTML in memcached
>> > (parser cache), mobile processed HTML in memcached.. Now individual
>> > sections in memcached? If so, should we calculate memcached space needs for
>> > article text as 3x the current parser cache utilization? More memcached
>> > usage is great, not asking to dissuade its use but because it's better to
>> > capacity plan than to react.
>>
>> action=mobileview caches pages only in full and serves
>> only sections requested, so no changes in request patterns will result
>> in increased memcached usage.
>>
>> --
>> Best regards,
>>   Max Semenik ([[User:MaxSem]])
>>
>>



-- 
Jon Robson
http://jonrobson.me.uk
@rakugojon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Asher Feldman
Max - good answers re: caching concerns.  That leaves studying if the bytes
transferred on average mobile article view increases or decreases with lazy
section loading.  If it increases, I'd say this isn't a positive direction
to go in and stop there.  If it decreases, then we should look at the
effect on total latency, number of requests required per pageview, and the
impact on backend apache utilization which I'd expect to be > 0.

Does the mobile team have specific goals that this project aims to
accomplish?  If so, we can use those as the measure against which to
compare an impact analysis.

On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik  wrote:

> On 11.02.2013, 22:11 Asher wrote:
>
> > And then I'd wonder about the server side implementation. How will frontend
> > cache invalidation work? Are we going to need to purge every individual
> > article section relative to /w/api.php on edit?
>
> Since the API doesn't require pretty URLs, we could simply append the
> current revision ID to the mobileview URLs.
>
> > Article HTML in memcached
> > (parser cache), mobile processed HTML in memcached.. Now individual
> > sections in memcached? If so, should we calculate memcached space needs for
> > article text as 3x the current parser cache utilization? More memcached
> > usage is great, not asking to dissuade its use but because it's better to
> > capacity plan than to react.
>
> action=mobileview caches pages only in full and serves
> only sections requested, so no changes in request patterns will result
> in increased memcached usage.
>
> --
> Best regards,
>   Max Semenik ([[User:MaxSem]])
>
>

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Max Semenik
On 11.02.2013, 22:11 Asher wrote:

> And then I'd wonder about the server side implementation. How will frontend
> cache invalidation work? Are we going to need to purge every individual
> article section relative to /w/api.php on edit?

Since the API doesn't require pretty URLs, we could simply append the
current revision ID to the mobileview URLs.
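
A minimal sketch of that idea, assuming the revision ID is simply carried as an extra query parameter. The parameter name "revid" and the revision numbers are hypothetical and chosen only for illustration; the real mobileview API may name it differently or not accept it at all. The point is only that an edit yields a new URL, so the old cached object never needs an explicit purge.

```python
from urllib.parse import urlencode


def mobileview_url(title, sections, revision_id):
    """Build an API URL that changes whenever the page's revision changes."""
    params = {
        "action": "mobileview",
        "format": "json",
        "page": title,
        "sections": sections,
        # Hypothetical parameter: it exists here only to make the URL
        # unique per revision, so a frontend cache keys on it.
        "revid": revision_id,
    }
    return "/w/api.php?" + urlencode(params)


# Made-up revision IDs: the second one stands for the state after an edit.
before_edit = mobileview_url("San Francisco", "3", 537000001)
after_edit = mobileview_url("San Francisco", "3", 537000002)
print(before_edit)
print(after_edit)
```

The stale object for the old revision is never requested again and simply ages out of the cache.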

> Article HTML in memcached
> (parser cache), mobile processed HTML in memcached.. Now individual
> sections in memcached? If so, should we calculate memcached space needs for
> article text as 3x the current parser cache utilization? More memcached
> usage is great, not asking to dissuade its use but because it's better to
> capacity plan than to react.

action=mobileview caches pages only in full and serves
only sections requested, so no changes in request patterns will result
in increased memcached usage.
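
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the MobileFrontend code: the class, the render callback and the in-process dict standing in for memcached are all assumptions.

```python
class FullPageCache:
    """Cache whole pages once; slice out sections per request."""

    def __init__(self, render_page):
        self._render_page = render_page  # callable: title -> list of section HTML
        self._store = {}                 # stand-in for memcached

    def get_sections(self, title, wanted):
        # One cache entry per page, regardless of which sections are asked for.
        if title not in self._store:
            self._store[title] = self._render_page(title)
        page = self._store[title]
        return {i: page[i] for i in wanted if 0 <= i < len(page)}


def fake_render(title):
    # Hypothetical renderer producing five sections per page.
    return [f"<p>{title} section {i}</p>" for i in range(5)]


cache = FullPageCache(fake_render)
lead = cache.get_sections("Otter", [0])
third = cache.get_sections("Otter", [3])
entries = len(cache._store)
print(entries, third)
```

Two different section requests for the same page leave a single cache entry, which is why per-section requests would not multiply cache usage under this scheme.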

-- 
Best regards,
  Max Semenik ([[User:MaxSem]])



Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Asher Feldman
On Monday, February 11, 2013, Mark Bergsma wrote:

>
> On Feb 9, 2013, at 11:21 PM, Asher Feldman wrote:
>
> > Whether or not it makes sense for mobile to move in the direction of
> > splitting up article views into many api requests is something I'd love to
> > see backed up by data.  I'm skeptical for multiple reasons.
>
> What is the main motivation used here? Reducing article sizes/transfers at
> the expense of more latency?
>

In cases where most sections (probably not even all) are loaded, I'd expect
it to increase the amount of data transferred beyond just the overhead of
the additional requests. gzip might take a 30k article down to 4k but
will be less efficient on individual sections. Text compresses really well,
and roundtrip latency is high on many cell networks.
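
The compression point can be checked with a rough experiment along these lines. The text is synthetic (random words from a small made-up vocabulary), so the absolute sizes mean nothing for real articles; the sketch only compares one gzip stream against one stream per section.

```python
import gzip
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
vocabulary = [
    "article", "section", "mobile", "wiki", "content", "request", "cache",
    "page", "the", "of", "and", "otter", "river", "history", "city",
]


def fake_section(words=150):
    """Generate one synthetic 'section' of text as bytes."""
    return " ".join(rng.choice(vocabulary) for _ in range(words)).encode("utf-8")


sections = [fake_section() for _ in range(20)]
whole_article = b"\n".join(sections)

# One response carrying everything vs. one compressed response per section.
whole_size = len(gzip.compress(whole_article))
split_size = sum(len(gzip.compress(s)) for s in sections)
print("whole:", whole_size, "bytes; per-section total:", split_size, "bytes")
```

Each per-section stream pays its own gzip framing and starts with an empty history window, so the per-section total comes out larger. Measuring this on real article HTML would be the meaningful version of the test.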

And then I'd wonder about the server side implementation. How will frontend
cache invalidation work? Are we going to need to purge every individual
article section relative to /w/api.php on edit? Article HTML in memcached
(parser cache), mobile processed HTML in memcached.. Now individual
sections in memcached? If so, should we calculate memcached space needs for
article text as 3x the current parser cache utilization? More memcached
usage is great, not asking to dissuade its use but because it's better to
capacity plan than to react.

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-11 Thread Mark Bergsma

On Feb 9, 2013, at 11:21 PM, Asher Feldman  wrote:
> For this particular case, the API requests are for getting specific
> sections of an article, as opposed to either the whole thing or the first
> section as part of an initial pageview.  I might not have grokked the
> original RFC email well, but I don't understand why this was being
> discussed as a logging challenge or necessitating a request header.  A
> mobile api request to just get section 3 of the article on otters should
> already utilize a query param denoting that section 3 is being fetched, and
> is already clearly not a "primary" request.

Yes, that part remains a bit unclear to me as well - some more details would be 
welcome.

> Whether or not it makes sense for mobile to move in the direction of
> splitting up article views into many api requests is something I'd love to
> see backed up by data.  I'm skeptical for multiple reasons.

What is the main motivation used here? Reducing article sizes/transfers at the 
expense of more latency?

-- 
Mark Bergsma 
Lead Operations Architect
Wikimedia Foundation






Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-09 Thread Asher Feldman
On Thu, Feb 7, 2013 at 4:32 AM, Mark Bergsma  wrote:

> > - Since we're repurposing X-CS, should we perhaps rename it to something
> > more apt to address concerns about cryptic non-standard headers flying
> > about?
>
> I'd like to propose to define *one* request header to be used for all
> analytics purposes. It can be key/value pairs, and be set client side where
> applicable.


There's been some confusion in this thread between headers used by
mediawiki in determining content generation or for cache variance, and
those intended only for logging.  The zero carrier header is used by the
zero extension to return specific content banners and set different default
behaviors (i.e. hide all images) as negotiated with individual mobile
carriers.  A reader familiar with this might note that there are separate
X-CS and X-Carrier headers but X-Carrier is supposed to go away now.

Agreed that there should be a single header for content that's strictly for
analytics purposes.  All changes to the udplog format in the last year or
so could likely be reverted except for the delimiter change, with a
multipurpose analytics key/value field added for all else.


> I think the question of using a URL param vs a request header should
> mainly take into account whether the response varies on the value of the
> parameter. If the responses are otherwise identical, and the value is only
> used for analytics purposes, I would prefer to put that into the above
> header instead, as it will impair cacheability / cache size otherwise (even
> if those requests are currently not cacheable for other reasons). If the
> responses are actually different based on this parameter, I would prefer to
> have it in the URL where possible.
>

For this particular case, the API requests are for getting specific
sections of an article, as opposed to either the whole thing or the first
section as part of an initial pageview.  I might not have grokked the
original RFC email well, but I don't understand why this was being
discussed as a logging challenge or necessitating a request header.  A
mobile api request to just get section 3 of the article on otters should
already utilize a query param denoting that section 3 is being fetched, and
is already clearly not a "primary" request.

Whether or not it makes sense for mobile to move in the direction of
splitting up article views into many api requests is something I'd love to
see backed up by data.  I'm skeptical for multiple reasons.

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-07 Thread David Schoonover
>
> I'd like to propose to define *one* request header to be used for all
> analytics purposes. It can be key/value pairs, and be set client side where
> applicable. Varnish can append to it where needed, later keys overriding
> earlier ones. Then we can log that one header across all HTTP/caching
> clusters without having to change the log stream all the time, and without
> wasting much space, and caching edge configuration changes are kept to a
> minimum as well.


Agreed. Instrumentation should ideally never get in the way of production
performance, so if we can cut or optimize header use for logging without
being too onerous, we'll happily do so. afaik, the reasons that custom HTTP
headers are used at all are:
- They're accessible from varnishncsa without code modifications;
- Varnish and/or other parties in the request chain can munge the values
prior to logging to save bytes (examples being X-CS, which replaces the
semantic carrier name with a [vastly shorter] numeric code, and the
proposed X-MF-Mode header, which prevents the need to log the whole cookies
header for post-processing).

Ideally, none of this should need to make a trip to the client. I don't
recall seeing anything in the Varnish docs providing a way to send values
exclusively to the loggers, but if there is, that's an easy win, and it
wouldn't require any changes to our parsing pipeline.

If that's not possible, it makes sense to collapse various headers into a
KV field; that would require changes on our side, including all downstream
consumers of the log stream (which is surprisingly large), so it's not a
trivial move.

--
David Schoonover
d...@wikimedia.org

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-07 Thread Mark Bergsma

On Feb 6, 2013, at 9:32 PM, David Schoonover  wrote:

> Just want to summarize and make sure I've got the right conclusions, as
> this thread has wandered a bit.
> 
> *1. X-MF-Mode: Alpha/Beta Site Usage*
> We'll roll this into the X-CS header, which will now be KV-pairs (using
> normal URL encoding), and set by Varnish. This will avoid an explosion of
> cryptic headers for analytic purposes.
> 
> Questions:
> - It seems there's some confusion around "bypassing Varnish". If I
> understand correctly, it's not that Varnish is ever bypassed, just that the
> upstream response is not cached if cookies are present. Is that right?

Yes

> - Since we're repurposing X-CS, should we perhaps rename it to something
> more apt to address concerns about cryptic non-standard headers flying
> about?

I'd like to propose to define *one* request header to be used for all analytics 
purposes. It can be key/value pairs, and be set client side where applicable. 
Varnish can append to it where needed, later keys overriding earlier ones. Then 
we can log that one header across all HTTP/caching clusters without having to 
change the log stream all the time, and without wasting much space, and caching 
edge configuration changes are kept to a minimum as well.
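
A minimal sketch of the "append, later keys override earlier ones" rule, assuming a "key=value;key=value" serialization. The separator characters and the key names (mf-m, req, zero) are hypothetical; the thread has not settled on a wire format.

```python
def parse_pairs(header_value):
    """Parse 'k=v;k=v' into a dict; a repeated key keeps its last value."""
    pairs = {}
    for item in header_value.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()  # later keys override earlier ones
    return pairs


def append_pairs(header_value, extra):
    """Merge pairs added at the edge into the value sent by the client."""
    merged = parse_pairs(header_value)
    merged.update(extra)
    return ";".join(f"{k}={v}" for k, v in merged.items())


client_value = "mf-m=b;req=primary"
logged = append_pairs(client_value, {"zero": "orn", "mf-m": "a"})
print(logged)
```

Because the merged value is still a single string, the log format keeps one field no matter how many keys are added later.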

And we might as well be transparent in its naming: header name
"Log-Parameters:"?

> *2. X-MF-Req: Primary vs Secondary API Requests*
> 
> This header will be replaced with a query parameter set by the client-side
> JS code making the request. Analytics will parse it out at processing time
> and Do The Right Thing.


I think the question of using a URL param vs a request header should mainly 
take into account whether the response varies on the value of the parameter. If 
the responses are otherwise identical, and the value is only used for analytics 
purposes, I would prefer to put that into the above header instead, as it will 
impair cacheability / cache size otherwise (even if those requests are 
currently not cacheable for other reasons). If the responses are actually 
different based on this parameter, I would prefer to have it in the URL where 
possible.
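
To make the cacheability argument concrete, here is a toy cache keyed on the full URL. It is not how Varnish is configured; the parameter name "reqtype" is hypothetical, and the sketch assumes the cache does not vary on the analytics header.

```python
class UrlKeyedCache:
    """Toy frontend cache: the key is the full URL, headers are ignored."""

    def __init__(self):
        self.entries = {}
        self.backend_hits = 0

    def fetch(self, url, headers=None):
        # headers are accepted but deliberately not part of the cache key
        if url not in self.entries:
            self.backend_hits += 1
            self.entries[url] = f"response for {url}"
        return self.entries[url]


base = "/w/api.php?action=mobileview&page=Otter&sections=3"

# Analytics value in the URL: identical responses are stored twice.
param_cache = UrlKeyedCache()
param_cache.fetch(base + "&reqtype=primary")
param_cache.fetch(base + "&reqtype=secondary")

# Analytics value in a header: one object serves both requests.
header_cache = UrlKeyedCache()
header_cache.fetch(base, headers={"Log-Parameters": "reqtype=primary"})
header_cache.fetch(base, headers={"Log-Parameters": "reqtype=secondary"})

print(param_cache.backend_hits, header_cache.backend_hits)
```

With the value in the URL the backend is hit once per distinct value; with the value in a header the second request is a cache hit.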

-- 
Mark Bergsma 
Lead Operations Architect
Wikimedia Foundation






Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-06 Thread Asher Feldman
On Wednesday, February 6, 2013, David Schoonover wrote:

> That all sounds fine to me so long as we're all agreed.


Lol. RFC closed.


> --
> David Schoonover
> d...@wikimedia.org 
>
>
> On Wed, Feb 6, 2013 at 12:59 PM, Asher Feldman wrote:
>
> > On Wednesday, February 6, 2013, David Schoonover wrote:
> >
> > > Just want to summarize and make sure I've got the right conclusions, as
> > > this thread has wandered a bit.
> > >
> > > *1. X-MF-Mode: Alpha/Beta Site Usage*
> > > *
> > > *
> > > We'll roll this into the X-CS header, which will now be KV-pairs (using
> > > normal URL encoding), and set by Varnish.
> >
> >
> > Nope. There will be a header denoting non-standard MobileFrontend views if
> > the mobile team wants to leave the caching situation as is. It will be a
> > response header set by mediawiki, not varnish. The header will have a
> > unique name, it will not share the name of the zero carrier header. The
> > udplog field that currently only ever contains carrier information on zero
> > requests will become a key value field. Udplog fields are not named, they
> > are positional.
> >
> >
> > >  This will avoid an explosion of
> > > cryptic headers for analytic purposes.
> > >
> > > Questions:
> > > - It seems there's some confusion around "bypassing Varnish". If I
> > > understand correctly, it's not that Varnish is ever bypassed, just that the
> > > upstream response is not cached if cookies are present. Is that right?
> >
> >
> > "Bypasses varnish caching" != "bypassing varnish."  I don't see any use
> > of the latter in this thread, but if there has been confusion, know that all
> > m.wikipedia.org requests are served via varnish.
> >
> >
> > > - Since we're repurposing X-CS, should we perhaps rename it to something
> > > more apt to address concerns about cryptic non-standard headers flying
> > > about?
> >
> >
> > Nope.. We're repurposing the fixed position udplog field, not the zero
> > carrier code header.
> >
> >
> > >
> > >
> > > *2. X-MF-Req: Primary vs Secondary API Requests*
> > >
> > > This header will be replaced with a query parameter set by the client-side
> > > JS code making the request. Analytics will parse it out at processing time
> > > and Do The Right Thing.
> > >
> > >
> > > Kindly correct me if I've gotten anything wrong.
> > >
> > >
> > > --
> > > David Schoonover
> > > d...@wikimedia.org 
> > >
> > >
> > > On Tue, Feb 5, 2013 at 2:36 PM, Diederik van Liere wrote:
> > >
> > > > > Analytics folks, is this workable from your perspective?
> > > > >
> > > > Yes, this works fine for us and it's also no problem to set multiple
> > > > key/value pairs in the http header that we are now using for the X-CS
> > > > header.
> > > > Diederik

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-06 Thread David Schoonover
That all sounds fine to me so long as we're all agreed.

--
David Schoonover
d...@wikimedia.org


On Wed, Feb 6, 2013 at 12:59 PM, Asher Feldman wrote:

> On Wednesday, February 6, 2013, David Schoonover wrote:
>
> > Just want to summarize and make sure I've got the right conclusions, as
> > this thread has wandered a bit.
> >
> > *1. X-MF-Mode: Alpha/Beta Site Usage*
> > We'll roll this into the X-CS header, which will now be KV-pairs (using
> > normal URL encoding), and set by Varnish.
>
>
> Nope. There will be a header denoting non-standard MobileFrontend views if
> the mobile team wants to leave the caching situation as is. It will be a
> response header set by mediawiki, not varnish. The header will have a
> unique name, it will not share the name of the zero carrier header. The
> udplog field that currently only ever contains carrier information on zero
> requests will become a key value field. Udplog fields are not named, they
> are positional.
>
>
> >  This will avoid an explosion of
> > cryptic headers for analytic purposes.
> >
> > Questions:
> > - It seems there's some confusion around "bypassing Varnish". If I
> > understand correctly, it's not that Varnish is ever bypassed, just that the
> > upstream response is not cached if cookies are present. Is that right?
>
>
> "Bypasses varnish caching" != "bypassing varnish."  I don't see any use of
> the latter in this thread, but if there has been confusion, know that all
> m.wikipedia.org requests are served via varnish.
>
>
> > - Since we're repurposing X-CS, should we perhaps rename it to something
> > more apt to address concerns about cryptic non-standard headers flying
> > about?
>
>
> Nope.. We're repurposing the fixed position udplog field, not the zero
> carrier code header.
>
>
> >
> >
> > *2. X-MF-Req: Primary vs Secondary API Requests*
> >
> > This header will be replaced with a query parameter set by the client-side
> > JS code making the request. Analytics will parse it out at processing time
> > and Do The Right Thing.
> >
> >
> > Kindly correct me if I've gotten anything wrong.
> >
> >
> > --
> > David Schoonover
> > d...@wikimedia.org
> >
> >
> > On Tue, Feb 5, 2013 at 2:36 PM, Diederik van Liere
> > wrote:
> >
> > > > Analytics folks, is this workable from your perspective?
> > > >
> > > Yes, this works fine for us and it's also no problem to set multiple
> > > key/value pairs in the http header that we are now using for the X-CS
> > > header.
> > > Diederik

[Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-06 Thread Asher Feldman
On Wednesday, February 6, 2013, David Schoonover wrote:

> Just want to summarize and make sure I've got the right conclusions, as
> this thread has wandered a bit.
>
> *1. X-MF-Mode: Alpha/Beta Site Usage*
> We'll roll this into the X-CS header, which will now be KV-pairs (using
> normal URL encoding), and set by Varnish.


Nope. There will be a header denoting non-standard MobileFrontend views if
the mobile team wants to leave the caching situation as is. It will be a
response header set by mediawiki, not varnish. The header will have a
unique name, it will not share the name of the zero carrier header. The
udplog field that currently only ever contains carrier information on zero
requests will become a key value field. Udplog fields are not named, they
are positional.
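
Since the fields are positional, a consumer finds the tag field by index rather than by name. The sketch below assumes a tab delimiter, a "-" placeholder for an empty field, and field index 4; all three are hypothetical, and the sample lines are invented and do not reflect the real udplog layout.

```python
TAG_FIELD_INDEX = 4  # hypothetical position of the carrier / key-value field


def tag_field(line, delimiter="\t"):
    """Return the raw tag field, or None when the field is empty."""
    fields = line.rstrip("\n").split(delimiter)
    value = fields[TAG_FIELD_INDEX]
    return None if value == "-" else value


# Invented sample lines; only the field count and positions matter here.
standard = "host1\t2013-02-06T12:59:00\t200\t/wiki/Otter\t-\n"
zero = "host1\t2013-02-06T12:59:01\t200\t/wiki/Otter\tzc:orn\n"
zero_beta = "host1\t2013-02-06T12:59:02\t200\t/wiki/Otter\tzc:orn;v:b1\n"

field_counts = {len(line.split("\t")) for line in (standard, zero, zero_beta)}
print(field_counts, tag_field(standard), tag_field(zero_beta))
```

Adding another key inside the field leaves the number of fields unchanged, so existing consumers that index by position keep working.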


>  This will avoid an explosion of
> cryptic headers for analytic purposes.
>
> Questions:
> - It seems there's some confusion around "bypassing Varnish". If I
> understand correctly, it's not that Varnish is ever bypassed, just that the
> upstream response is not cached if cookies are present. Is that right?


"Bypasses varnish caching" != "bypassing varnish."  I don't see any use of
the latter in this thread, but if there has been confusion, know that all
m.wikipedia.org requests are served via varnish.


> - Since we're repurposing X-CS, should we perhaps rename it to something
> more apt to address concerns about cryptic non-standard headers flying
> about?


Nope.. We're repurposing the fixed position udplog field, not the zero
carrier code header.


>
>
> *2. X-MF-Req: Primary vs Secondary API Requests*
>
> This header will be replaced with a query parameter set by the client-side
> JS code making the request. Analytics will parse it out at processing time
> and Do The Right Thing.
>
>
> Kindly correct me if I've gotten anything wrong.
>
>
> --
> David Schoonover
> d...@wikimedia.org
>
>
> On Tue, Feb 5, 2013 at 2:36 PM, Diederik van Liere
> wrote:
>
> > > Analytics folks, is this workable from your perspective?
> > >
> > Yes, this works fine for us and it's also no problem to set multiple
> > key/value pairs in the http header that we are now using for the X-CS
> > header.
> > Diederik

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-06 Thread David Schoonover
Just want to summarize and make sure I've got the right conclusions, as
this thread has wandered a bit.

*1. X-MF-Mode: Alpha/Beta Site Usage*
We'll roll this into the X-CS header, which will now be KV-pairs (using
normal URL encoding), and set by Varnish. This will avoid an explosion of
cryptic headers for analytic purposes.

Questions:
- It seems there's some confusion around "bypassing Varnish". If I
understand correctly, it's not that Varnish is ever bypassed, just that the
upstream response is not cached if cookies are present. Is that right?
- Since we're repurposing X-CS, should we perhaps rename it to something
more apt to address concerns about cryptic non-standard headers flying
about?


*2. X-MF-Req: Primary vs Secondary API Requests*

This header will be replaced with a query parameter set by the client-side
JS code making the request. Analytics will parse it out at processing time
and Do The Right Thing.


Kindly correct me if I've gotten anything wrong.


--
David Schoonover
d...@wikimedia.org


On Tue, Feb 5, 2013 at 2:36 PM, Diederik van Liere
wrote:

> > Analytics folks, is this workable from your perspective?
> >
> Yes, this works fine for us and it's also no problem to set multiple
> key/value pairs in the http header that we are now using for the X-CS
> header.
> Diederik

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-05 Thread Diederik van Liere
> Analytics folks, is this workable from your perspective?

Yes, this works fine for us and it's also no problem to set multiple
key/value pairs in the http header that we are now using for the X-CS
header.
Diederik

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-05 Thread Arthur Richards
On Mon, Feb 4, 2013 at 7:12 PM, Asher Feldman wrote:

> On Mon, Feb 4, 2013 at 5:21 PM, Asher Feldman wrote:
>
> > On Mon, Feb 4, 2013 at 4:59 PM, Arthur Richards wrote:
> >
> >> In the case of the cookie, the header would actually get set by the backend
> >> response (from Apache) and I believe Dave cooked up or was planning on
> >> cooking some magic to somehow make that information discernable when
> >> results are cached.
> >>
> >
> > Opting into the mobile beta as it is currently implemented bypasses
> > varnish caching for all future mobile pageviews for the life of the
> > cookie.  So this probably isn't quite right (at least the "when results are
> > cached" part.)
> >
>
> Thinking about this further.. So long as all beta optins bypass all caching
> and always have to hit an apache, it would be fine for mf to set a response
> header reflecting the version of the site the optin cookie triggers (but
> only if there's an optin, avoid setting on standard.)  I'd just prefer this
> to be logged without adding a field to the entire udplog stream that will
> generally just be wasted space.  Mobile already has one dedicated udplog
> field currently intended for zero carriers, wasted log space for nearly
> every request.  Make it a key/value field that can contain multiple keys,
> i.e. "zc:orn;v:b1" (zero carrier = orange whatever, version = beta1)
>
> If by some chance mobile beta gets implemented in a way that doesn't kill
> frontend caching for its users (maybe solely via different js behavior
> based on the presence of the optin cookie?) the above won't be applicable
> anymore, so using the event log facility / pixel service to note beta usage
> becomes more appropriate.  If beta usage is going to be driven upwards, I
> hope this approach is seriously considered.  Mobile currently only has
> around a 58% edge cache hitrate as it is and it sounds like upcoming
> features will place significant new demands on the apaches and for
> memcached space.  If a non cache busting beta site is doable, go for the
> logging method now that will later be compatible with it to avoid having to
> change processing methods.

OK - this is all making a lot more sense to me now, thanks for your
clarifications and suggestions, Asher.

So, from the mobile team's perspective a straightforward implementation to
get us to our goal might be to:
1) add a query parameter to identify 'secondary' API hits (eg an API
request for page content made after an initial request for that page was
made, all other requests stay the same)
2) use the header solution to identify beta/alpha cookies (HTTP header set
by backend response when user is opted in).
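
Point 1 could look roughly like the sketch below, written in Python for brevity even though the real client would be the JS code. The parameter name "secondary" is hypothetical, and the classification rule is an assumption about how analytics might consume it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def api_url(title, secondary):
    """Build a mobileview request, flagging follow-up hits for the same page."""
    params = {
        "action": "mobileview",
        "format": "json",
        "page": title,
        "sections": "all",
    }
    if secondary:
        params["secondary"] = "1"  # hypothetical flag; other requests are unchanged
    return "/w/api.php?" + urlencode(params)


def is_pageview(url):
    """Count a mobileview API hit as a pageview unless it is flagged secondary."""
    query = parse_qs(urlsplit(url).query)
    return query.get("action") == ["mobileview"] and "secondary" not in query


first_load = api_url("San Francisco", secondary=False)
follow_up = api_url("San Francisco", secondary=True)
print(is_pageview(first_load), is_pageview(follow_up))
```

Since the flag lives in the request URL, it appears in the existing log line and needs no new log field.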

One thing I'd like to double check though is that 'Opting into the mobile
beta as it is currently implemented bypasses varnish caching for all future
mobile pageviews for the life of the cookie' - I thought the Varnish cache
was just varied by the optin cookies, not totally bypassed. I've looked at
headers from some sample requests I've made with the beta opt-in and I'm
not seeing any cache hits, so I gather you are correct. Can you please
confirm this?

Analytics folks, is this workable from your perspective?

-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Asher Feldman
On Mon, Feb 4, 2013 at 5:21 PM, Asher Feldman wrote:

> On Mon, Feb 4, 2013 at 4:59 PM, Arthur Richards 
> wrote:
>
>> In the case of the cookie, the header would actually get set by the backend
>> response (from Apache) and I believe Dave cooked up or was planning on
>> cooking some magic to somehow make that information discernable when
>> results are cached.
>>
>
> Opting into the mobile beta as it is currently implemented bypasses
> varnish caching for all future mobile pageviews for the life of the
> cookie.  So this probably isn't quite right (at least the "when results are
> cached" part.)
>

Thinking about this further.. So long as all beta optins bypass all caching
and always have to hit an apache, it would be fine for mf to set a response
header reflecting the version of the site the optin cookie triggers (but
only if there's an optin, avoid setting on standard.)  I'd just prefer this
to be logged without adding a field to the entire udplog stream that will
generally just be wasted space.  Mobile already has one dedicated udplog
field currently intended for zero carriers, wasted log space for nearly
every request.  Make it a key/value field that can contain multiple keys,
e.g. "zc:orn;v:b1" (zero carrier = orange whatever, version = beta1)
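
For whoever ends up consuming that field, a minimal parsing sketch, assuming ";" separates pairs and ":" separates key from value exactly as in the example string. No escaping rules are handled, since none have been defined; the function names are made up.

```python
def parse_tags(field):
    """Parse a 'key:value;key:value' log field into a dict."""
    tags = {}
    for pair in field.split(";"):
        if not pair:
            continue  # tolerate an empty field or a trailing separator
        key, _, value = pair.partition(":")
        tags[key] = value
    return tags


def format_tags(tags):
    """Serialize a dict back into the 'key:value;key:value' form."""
    return ";".join(f"{key}:{value}" for key, value in tags.items())


tags = parse_tags("zc:orn;v:b1")
print(tags)
print(format_tags(tags))
```

A request with no tags yields an empty dict, so standard pageviews need no special handling downstream.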

If by some chance mobile beta gets implemented in a way that doesn't kill
frontend caching for its users (maybe solely via different js behavior
based on the presence of the optin cookie?) the above won't be applicable
anymore, so using the event log facility / pixel service to note beta usage
becomes more appropriate.  If beta usage is going to be driven upwards, I
hope this approach is seriously considered.  Mobile currently only has
around a 58% edge cache hitrate as it is, and it sounds like upcoming
features will place significant new demands on the apaches and on
memcached space.  If a non-cache-busting beta site is doable, pick the
logging method now that will remain compatible with it, to avoid having to
change processing methods later.

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Asher Feldman
On Mon, Feb 4, 2013 at 4:59 PM, Arthur Richards wrote:

> In the case of the cookie, the header would actually get set by the backend
> response (from Apache) and I believe Dave cooked up or was planning on
> cooking some magic to somehow make that information discernable when
> results are cached.
>

Opting into the mobile beta as it is currently implemented bypasses varnish
caching for all future mobile pageviews for the life of the cookie.  So
this probably isn't quite right (at least the "when results are cached"
part.)

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Asher Feldman
On Mon, Feb 4, 2013 at 4:24 PM, Arthur Richards wrote:

>
> Asher, I understand your hesitation about using HTTP header fields, but
> there are a couple problems I'm seeing with using query string parameters.
> Perhaps you or others have some ideas how to get around these:
> * We should keep user-facing URLs canonical as much as possible (primarily
> for link sharing)
> ** If we keep user-facing URLs canonical, we could potentially add query
> string params via javascript, but that would only work on devices that
> support javascript/have javascript enabled (this might not be a huge deal
> as we are planning changes such that users that do not support jQuery will
> get a simplified version of the stable site)
>

I was thinking of this as a solution for the X-MF-Req header, based on your
explanation of it earlier in the thread: "Almost correct - I realize I
didn't actually explain it correctly. This would be a request HTTP header
set by the client in API requests made by Javascript provided by
MobileFrontend."

I only meant to apply the query string idea to API requests, which can also
be marked to indicate non-standard versions of the site.  I completely
missed the case of non-api requests about which beta/alpha usage data needs
to be collected.  What about doing so via the eventlog service?  Only for
users actually opted into one of these programs, no need to log anything
special for the majority of users getting the standard site.

> * How could this work for the first pageview request (eg a user clicking a
> link from Google or even just browsing to http://en.wikipedia.org)?


I think this is covered by the above, in that the data intended to go into
x-mf-req doesn't apply to this sort of page view, and first views from
users opted into a trial can eventlog the trial usage.

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Arthur Richards
On Mon, Feb 4, 2013 at 5:49 PM, Brion Vibber  wrote:

> On Mon, Feb 4, 2013 at 4:38 PM, Arthur Richards wrote:
>
> > On Mon, Feb 4, 2013 at 5:30 PM, Brion Vibber wrote:
> >
> > > On Mon, Feb 4, 2013 at 4:24 PM, Arthur Richards <aricha...@wikimedia.org> wrote:
> > >
> > > > * How could this work for the first pageview request (eg a user
> > > > clicking a link from Google or even just browsing to
> > > > http://en.wikipedia.org)?
> > > >
> > >
> > > I think mainly we need the tracking on the API requests... that's all
> > > JavaScript-initiated, and all hidden from the user. The main problem
> > > with adding parameters would be for caching, but none of the API hits
> > > are currently cacheable so that's not an immediate issue perhaps.
> > >
> >
> > We also need to be able to differentiate between alpha/beta/stable
> > versions of the mobile site, without having to parse the cookie header
> > (I believe as a result of performance constraints around this? I think
> > the analytics team had looked into this previously).
> >
>
> Yeah that's probably not possible if you want to track that for initial
> page views. Cookie's the only thing guaranteed to have the data available,
> and we have no way to inject a header into mobile web browsers except for
> the XHR hits to the API.
>
> -- brion

In the case of the cookie, the header would actually get set by the backend
response (from Apache) and I believe Dave cooked up or was planning on
cooking some magic to somehow make that information discernable when
results are cached.


-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Brion Vibber
On Mon, Feb 4, 2013 at 4:38 PM, Arthur Richards wrote:

> On Mon, Feb 4, 2013 at 5:30 PM, Brion Vibber wrote:
>
> > On Mon, Feb 4, 2013 at 4:24 PM, Arthur Richards wrote:
> >
> > > * How could this work for the first pageview request (eg a user
> > > clicking a link from Google or even just browsing to
> > > http://en.wikipedia.org)?
> > >
> >
> > I think mainly we need the tracking on the API requests... that's all
> > JavaScript-initiated, and all hidden from the user. The main problem with
> > adding parameters would be for caching, but none of the API hits are
> > currently cacheable so that's not an immediate issue perhaps.
> >
>
> We also need to be able to differentiate between alpha/beta/stable versions
> of the mobile site, without having to parse the cookie header (I believe as
> a result of performance constraints around this? I think the analytics team
> had looked into this previously).
>

Yeah that's probably not possible if you want to track that for initial
page views. Cookie's the only thing guaranteed to have the data available,
and we have no way to inject a header into mobile web browsers except for
the XHR hits to the API.

-- brion

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Arthur Richards
On Mon, Feb 4, 2013 at 5:30 PM, Brion Vibber  wrote:

> On Mon, Feb 4, 2013 at 4:24 PM, Arthur Richards wrote:
>
> > Asher, I understand your hesitation about using HTTP header fields, but
> > there are a couple problems I'm seeing with using query string
> > parameters. Perhaps you or others have some ideas how to get around
> > these:
> > * We should keep user-facing URLs canonical as much as possible
> > (primarily for link sharing)
> > ** If we keep user-facing URLs canonical, we could potentially add query
> > string params via javascript, but that would only work on devices that
> > support javascript/have javascript enabled (this might not be a huge deal
> > as we are planning changes such that users that do not support jQuery
> > will get a simplified version of the stable site)
> > * How could this work for the first pageview request (eg a user clicking a
> > link from Google or even just browsing to http://en.wikipedia.org)?
> >
>
> I think mainly we need the tracking on the API requests... that's all
> JavaScript-initiated, and all hidden from the user. The main problem with
> adding parameters would be for caching, but none of the API hits are
> currently cacheable so that's not an immediate issue perhaps.
>

We also need to be able to differentiate between alpha/beta/stable versions
of the mobile site, without having to parse the cookie header (I believe as
a result of performance constraints around this? I think the analytics team
had looked into this previously).

-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Brion Vibber
On Mon, Feb 4, 2013 at 4:24 PM, Arthur Richards wrote:

> Asher, I understand your hesitation about using HTTP header fields, but
> there are a couple problems I'm seeing with using query string parameters.
> Perhaps you or others have some ideas how to get around these:
> * We should keep user-facing URLs canonical as much as possible (primarily
> for link sharing)
> ** If we keep user-facing URLs canonical, we could potentially add query
> string params via javascript, but that would only work on devices that
> support javascript/have javascript enabled (this might not be a huge deal
> as we are planning changes such that users that do not support jQuery will
> get a simplified version of the stable site)
> * How could this work for the first pageview request (eg a user clicking a
> link from Google or even just browsing to http://en.wikipedia.org)?
>

I think mainly we need the tracking on the API requests... that's all
JavaScript-initiated, and all hidden from the user. The main problem with
adding parameters would be for caching, but none of the API hits are
currently cacheable so that's not an immediate issue perhaps.

-- brion

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-04 Thread Arthur Richards
On Sun, Feb 3, 2013 at 2:35 AM, Asher Feldman wrote:

> Regarding varnish cacheability of mobile API requests with a logging query
> param - it would probably be worth making frontend varnishes strip out all
> occurrences of that query param and its value from their backend requests
> so they're all the same to the caching instances. A generic param name that
> can take any value would allow for adding as many extra log values as
> needed, limited only by the uri log field length.
>
> &l=mft2&l=mfstable etc.
>
> So still an edge cache change but the result is more flexible
> while avoiding changing the fixed field length log format across unrelated
> systems like text squids or image caches.
>
> On Sunday, February 3, 2013, Asher Feldman wrote:
>
> > If you want to differentiate categories of API requests in logs, add
> > descriptive noop query params to the requests. I.e &mfmode=2. Doing this
> > in request headers and altering edge config is unnecessary and a bad
> > design pattern. On the analytics side, if parsing query params seems
> > challenging vs. having a fixed field to parse, deal.
> >
>

Asher, I understand your hesitation about using HTTP header fields, but
there are a couple problems I'm seeing with using query string parameters.
Perhaps you or others have some ideas how to get around these:
* We should keep user-facing URLs canonical as much as possible (primarily
for link sharing)
** If we keep user-facing URLs canonical, we could potentially add query
string params via javascript, but that would only work on devices that
support javascript/have javascript enabled (this might not be a huge deal
as we are planning changes such that users that do not support jQuery will
get a simplified version of the stable site)
* How could this work for the first pageview request (eg a user clicking a
link from Google or even just browsing to http://en.wikipedia.org)?

I may be missing other potential problems - it would be great if others
from the mobile team could chime in.

-- 
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Asher Feldman
On Sunday, February 3, 2013, Tyler Romeo wrote:

> Remind me again why a production setup is logging every header of every
> request?


That's ludicrous. Please reread our udplog format documentation and this
entire thread carefully, especially the first message before commenting any
further.


>  Also, if you are logging every header, then the amount of data
> added by a single extra header would be insignificant compared to the rest
> of the request.
>
> *--*
> *Tyler Romeo*
> Stevens Institute of Technology, Class of 2015
> Major in Computer Science
> www.whizkidztech.com | tylerro...@gmail.com 
>
>
> On Sun, Feb 3, 2013 at 5:12 AM, Asher Feldman wrote:
>
> > That's not at all true in the real world. Look at the actual requests for
> > google analytics on a high percentage of sites, etc.
> >
> > Setting new request headers for mobile that map to new inflexible fields
> > in the log stream that must be set on all non mobile requests ("\t-\t-")
> > equals gigabytes of unnecessary log data every day (that we want
> > to save 100% of) for no good reason. Wanting to keep query params "pure"
> > isn't a good reason.
> >
> > On Sunday, February 3, 2013, Tyler Romeo wrote:
> >
> > > Considering that the query component of a URI is meant to identify the
> > > resource whereas HTTP headers are meant to tell the server additional
> > > information about the request, I think a header approach is much more
> > > appropriate than a no-op query parameter.
> > >
> > > If the X- is removed, I'd have no problem with the addition of these
> > > headers, but what is the advantage of having two over one. Wouldn't a
> > > header like:
> > > MobileFrontend: 1/2 a/b/s
> > > work just as fine?
> > >
> > > *--*
> > > *Tyler Romeo*
> > > Stevens Institute of Technology, Class of 2015
> > > Major in Computer Science
> > > www.whizkidztech.com | tylerro...@gmail.com 
> > >
> > >
> > > On Sun, Feb 3, 2013 at 4:35 AM, Asher Feldman wrote:
> > >
> > > > Regarding varnish cacheability of mobile API requests with a logging
> > > query
> > > > param - it would probably be worth making frontend varnishes strip
> out
> > > all
> > > > occurrences of that query param and its value from their backend
> > requests
> > > > so they're all the same to the caching instances. A generic param
> name
> > > that
> > > > can take any value would allow for adding as many extra log values as
> > > > needed, limited only by the uri log field length.
> > > >
> > > > &l=mft2&l=mfstable etc.
> > > >
> > > > So still an edge cache change but the result is more flexible
> > > > while avoiding changing the fixed field length log format across
> > > unrelated
> > > > systems like text squids or image caches.
> > > >
> > > > On Sunday, February 3, 2013, Asher Feldman wrote:
> > > >
> > > > > If you want to differentiate categories of API requests in logs,
> add
> > > > > descriptive noop query params to the requests. I.e &mfmode=2. Doing
> > > this
> > > > in
> > > > > request headers and altering edge config is unnecessary and a bad
> > > design
> > > > > pattern. On the analytics side, if parsing query params seems
> > > challenging
> > > > > vs. having a fixed field to parse, deal.
> > > > >
> > > > > On Sunday, February 3, 2013, David Schoonover wrote:
> > > > >
> > > > >> Huh! News to me as well. I definitely agree with that decision.
> > > Thanks,
> > > > >> Ori!
> > > > >>
> > > > >> I've already written the Varnish code for setting X-MF-Mode so it
> > can
> > > be
> > > > >> captured by varnishncsa. Is there agreement to switch to
> > Mobile-Mode,
> > > or
> > > > >> at
> > > > >> least, MF-Mode?
> > > > >>
> > > > >> Looking especially to hear from Arthur and Matt.
> > > > >>
> > > > >> --
> > > > >> David Schoonover
> > > > >> d...@wikimedia.org
> > > > >>
> > > > >>
> > > > >> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere
> > > > >> wrote:
> > > > >>
> > > > >> > Thanks Ori, I was not aware of this
> > > > >> > D
> > > > >> >
> > > > >> > Sent from my iPhone
> > > > >> >
> > > > >> > On 2013-02-02, at 16:55, Ori Livneh  wrote:
> > > > >> >
> > > > >> > >
> > > > >> > >
> > > > >> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
> > > > >> > >
> > > > >> > >> I don't like its cryptic nature.
> > > > >> > >>
> > > > >> > >> Someone looking at the headers sent to his browser would be
> > very
> > > > >> > >> confused about what's the point of «X-MF-Mode: b».
> > > > >> > >>
> > > > >> > >> Instead something like this would be much more descriptive:
> > > > >> > >> X-Mobile-Mode: stable
> > > > >> > >> X-Mobile-Request: secondary
> > > > >> > >>
> > > > >> > >> But that also means sending more bytes through the wire :S
> > > > >> > > Well, you can (and should) drop the 'X-' :-)
> > > > >> > >
> > > > >> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-"
> > > Prefix
> > > > >> and
> > > > >> > Similar Constructs in Application Protocols
> > > > >> > >
> > > > >> > >
> > > > >> > > --
> > > 

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Tyler Romeo
Remind me again why a production setup is logging every header of every
request? Also, if you are logging every header, then the amount of data
added by a single extra header would be insignificant compared to the rest
of the request.

*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com


On Sun, Feb 3, 2013 at 5:12 AM, Asher Feldman wrote:

> That's not at all true in the real world. Look at the actual requests for
> google analytics on a high percentage of sites, etc.
>
> Setting new request headers for mobile that map to new inflexible fields in
> the log stream that must be set on all non mobile requests ("\t-\t-")
> equals gigabytes of unnecessary log data every day (that we want
> to save 100% of) for no good reason. Wanting to keep query params "pure"
> isn't a good reason.
>
> On Sunday, February 3, 2013, Tyler Romeo wrote:
>
> > Considering that the query component of a URI is meant to identify the
> > resource whereas HTTP headers are meant to tell the server additional
> > information about the request, I think a header approach is much more
> > appropriate than a no-op query parameter.
> >
> > If the X- is removed, I'd have no problem with the addition of these
> > headers, but what is the advantage of having two over one. Wouldn't a
> > header like:
> > MobileFrontend: 1/2 a/b/s
> > work just as fine?
> >
> > *--*
> > *Tyler Romeo*
> > Stevens Institute of Technology, Class of 2015
> > Major in Computer Science
> > www.whizkidztech.com | tylerro...@gmail.com 
> >
> >
> > On Sun, Feb 3, 2013 at 4:35 AM, Asher Feldman wrote:
> >
> > > Regarding varnish cacheability of mobile API requests with a logging
> > query
> > > param - it would probably be worth making frontend varnishes strip out
> > all
> > > occurrences of that query param and its value from their backend
> requests
> > > so they're all the same to the caching instances. A generic param name
> > that
> > > can take any value would allow for adding as many extra log values as
> > > needed, limited only by the uri log field length.
> > >
> > > &l=mft2&l=mfstable etc.
> > >
> > > So still an edge cache change but the result is more flexible
> > > while avoiding changing the fixed field length log format across
> > unrelated
> > > systems like text squids or image caches.
> > >
> > > On Sunday, February 3, 2013, Asher Feldman wrote:
> > >
> > > > If you want to differentiate categories of API requests in logs, add
> > > > descriptive noop query params to the requests. I.e &mfmode=2. Doing
> > this
> > > in
> > > > request headers and altering edge config is unnecessary and a bad
> > design
> > > > pattern. On the analytics side, if parsing query params seems
> > challenging
> > > > vs. having a fixed field to parse, deal.
> > > >
> > > > On Sunday, February 3, 2013, David Schoonover wrote:
> > > >
> > > >> Huh! News to me as well. I definitely agree with that decision.
> > Thanks,
> > > >> Ori!
> > > >>
> > > >> I've already written the Varnish code for setting X-MF-Mode so it
> can
> > be
> > > >> captured by varnishncsa. Is there agreement to switch to
> Mobile-Mode,
> > or
> > > >> at
> > > >> least, MF-Mode?
> > > >>
> > > >> Looking especially to hear from Arthur and Matt.
> > > >>
> > > >> --
> > > >> David Schoonover
> > > >> d...@wikimedia.org
> > > >>
> > > >>
> > > >> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere
> > > >> wrote:
> > > >>
> > > >> > Thanks Ori, I was not aware of this
> > > >> > D
> > > >> >
> > > >> > Sent from my iPhone
> > > >> >
> > > >> > On 2013-02-02, at 16:55, Ori Livneh  wrote:
> > > >> >
> > > >> > >
> > > >> > >
> > > >> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
> > > >> > >
> > > >> > >> I don't like its cryptic nature.
> > > >> > >>
> > > >> > >> Someone looking at the headers sent to his browser would be
> very
> > > >> > >> confused about what's the point of «X-MF-Mode: b».
> > > >> > >>
> > > >> > >> Instead something like this would be much more descriptive:
> > > >> > >> X-Mobile-Mode: stable
> > > >> > >> X-Mobile-Request: secondary
> > > >> > >>
> > > >> > >> But that also means sending more bytes through the wire :S
> > > >> > > Well, you can (and should) drop the 'X-' :-)
> > > >> > >
> > > >> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-"
> > Prefix
> > > >> and
> > > >> > Similar Constructs in Application Protocols
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > > Ori Livneh
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Asher Feldman
That's not at all true in the real world. Look at the actual requests for
google analytics on a high percentage of sites, etc.

Setting new request headers for mobile that map to new inflexible fields in
the log stream that must be set on all non-mobile requests ("\t-\t-")
equals gigabytes of unnecessary log data every day (that we want
to save 100% of) for no good reason. Wanting to keep query params "pure"
isn't a good reason.
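
To make the arithmetic concrete, a rough sketch under stated assumptions: each
of the two new fields costs a tab plus a "-" placeholder on every request that
doesn't use it, as in the "\t-\t-" example. The request volume below is a
made-up round number for illustration only, not a measured figure.

```python
EMPTY_FIELD = "\t-"  # tab separator plus '-' placeholder, per the "\t-\t-" example


def padding_bytes_per_day(requests_per_day, extra_fields=2):
    """Bytes spent per day on empty placeholder fields."""
    return requests_per_day * extra_fields * len(EMPTY_FIELD)


# Hypothetical volume, chosen only to make the arithmetic concrete.
hypothetical_requests = 1_000_000_000
print(padding_bytes_per_day(hypothetical_requests) / 1e9, "GB/day")  # 4.0 GB/day
```

The cost scales linearly with total request volume, not with mobile traffic,
which is why fixed fields are expensive for a mobile-only signal.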

On Sunday, February 3, 2013, Tyler Romeo wrote:

> Considering that the query component of a URI is meant to identify the
> resource whereas HTTP headers are meant to tell the server additional
> information about the request, I think a header approach is much more
> appropriate than a no-op query parameter.
>
> If the X- is removed, I'd have no problem with the addition of these
> headers, but what is the advantage of having two over one. Wouldn't a
> header like:
> MobileFrontend: 1/2 a/b/s
> work just as fine?
>
> *--*
> *Tyler Romeo*
> Stevens Institute of Technology, Class of 2015
> Major in Computer Science
> www.whizkidztech.com | tylerro...@gmail.com 
>
>
> On Sun, Feb 3, 2013 at 4:35 AM, Asher Feldman wrote:
>
> > Regarding varnish cacheability of mobile API requests with a logging
> query
> > param - it would probably be worth making frontend varnishes strip out
> all
> > occurrences of that query param and its value from their backend requests
> > so they're all the same to the caching instances. A generic param name
> that
> > can take any value would allow for adding as many extra log values as
> > needed, limited only by the uri log field length.
> >
> > &l=mft2&l=mfstable etc.
> >
> > So still an edge cache change but the result is more flexible
> > while avoiding changing the fixed field length log format across
> unrelated
> > systems like text squids or image caches.
> >
> > On Sunday, February 3, 2013, Asher Feldman wrote:
> >
> > > If you want to differentiate categories of API requests in logs, add
> > > descriptive noop query params to the requests. I.e &mfmode=2. Doing
> this
> > in
> > > request headers and altering edge config is unnecessary and a bad
> design
> > > pattern. On the analytics side, if parsing query params seems
> challenging
> > > vs. having a fixed field to parse, deal.
> > >
> > > On Sunday, February 3, 2013, David Schoonover wrote:
> > >
> > >> Huh! News to me as well. I definitely agree with that decision.
> Thanks,
> > >> Ori!
> > >>
> > >> I've already written the Varnish code for setting X-MF-Mode so it can
> be
> > >> captured by varnishncsa. Is there agreement to switch to Mobile-Mode,
> or
> > >> at
> > >> least, MF-Mode?
> > >>
> > >> Looking especially to hear from Arthur and Matt.
> > >>
> > >> --
> > >> David Schoonover
> > >> d...@wikimedia.org
> > >>
> > >>
> > >> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere
> > >> wrote:
> > >>
> > >> > Thanks Ori, I was not aware of this
> > >> > D
> > >> >
> > >> > Sent from my iPhone
> > >> >
> > >> > On 2013-02-02, at 16:55, Ori Livneh  wrote:
> > >> >
> > >> > >
> > >> > >
> > >> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
> > >> > >
> > >> > >> I don't like its cryptic nature.
> > >> > >>
> > >> > >> Someone looking at the headers sent to his browser would be very
> > >> > >> confused about what's the point of «X-MF-Mode: b».
> > >> > >>
> > >> > >> Instead something like this would be much more descriptive:
> > >> > >> X-Mobile-Mode: stable
> > >> > >> X-Mobile-Request: secondary
> > >> > >>
> > >> > >> But that also means sending more bytes through the wire :S
> > >> > > Well, you can (and should) drop the 'X-' :-)
> > >> > >
> > >> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-"
> Prefix
> > >> and
> > >> > Similar Constructs in Application Protocols
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Ori Livneh
> > >> > >
> > >> > >
> > >> > >
> > >> > >

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Tyler Romeo
Considering that the query component of a URI is meant to identify the
resource whereas HTTP headers are meant to tell the server additional
information about the request, I think a header approach is much more
appropriate than a no-op query parameter.

If the X- is removed, I'd have no problem with the addition of these
headers, but what is the advantage of having two over one? Wouldn't a
header like:
MobileFrontend: 1/2 a/b/s
work just as well?
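
For illustration, a minimal sketch of how a combined value of that shape could
be split apart by a log consumer. The token meanings are assumptions inferred
from elsewhere in the thread (1/2 as primary/secondary request, a/b/s as
alpha/beta/stable), and the function name parse_mobilefrontend_header is
hypothetical.

```python
REQUEST_TYPES = {"1": "primary", "2": "secondary"}  # assumed meaning
SITE_MODES = {"a": "alpha", "b": "beta", "s": "stable"}  # assumed meaning


def parse_mobilefrontend_header(value):
    """Split a combined 'REQUEST MODE' header value, e.g. '2 b'."""
    tokens = value.split()
    if len(tokens) != 2:
        raise ValueError(f"expected two tokens, got {value!r}")
    request, mode = tokens
    if request not in REQUEST_TYPES or mode not in SITE_MODES:
        raise ValueError(f"unknown token in {value!r}")
    return REQUEST_TYPES[request], SITE_MODES[mode]


print(parse_mobilefrontend_header("2 b"))  # ('secondary', 'beta')
```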

*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com


On Sun, Feb 3, 2013 at 4:35 AM, Asher Feldman wrote:

> Regarding varnish cacheability of mobile API requests with a logging query
> param - it would probably be worth making frontend varnishes strip out all
> occurrences of that query param and its value from their backend requests
> so they're all the same to the caching instances. A generic param name that
> can take any value would allow for adding as many extra log values as
> needed, limited only by the uri log field length.
>
> &l=mft2&l=mfstable etc.
>
> So still an edge cache change but the result is more flexible
> while avoiding changing the fixed field length log format across unrelated
> systems like text squids or image caches.
>
> On Sunday, February 3, 2013, Asher Feldman wrote:
>
> > If you want to differentiate categories of API requests in logs, add
> > descriptive noop query params to the requests. I.e &mfmode=2. Doing this
> in
> > request headers and altering edge config is unnecessary and a bad design
> > pattern. On the analytics side, if parsing query params seems challenging
> > vs. having a fixed field to parse, deal.
> >
> > On Sunday, February 3, 2013, David Schoonover wrote:
> >
> >> Huh! News to me as well. I definitely agree with that decision. Thanks,
> >> Ori!
> >>
> >> I've already written the Varnish code for setting X-MF-Mode so it can be
> >> captured by varnishncsa. Is there agreement to switch to Mobile-Mode, or
> >> at
> >> least, MF-Mode?
> >>
> >> Looking especially to hear from Arthur and Matt.
> >>
> >> --
> >> David Schoonover
> >> d...@wikimedia.org
> >>
> >>
> >> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere
> >> wrote:
> >>
> >> > Thanks Ori, I was not aware of this
> >> > D
> >> >
> >> > Sent from my iPhone
> >> >
> >> > On 2013-02-02, at 16:55, Ori Livneh  wrote:
> >> >
> >> > >
> >> > >
> >> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
> >> > >
> >> > >> I don't like its cryptic nature.
> >> > >>
> >> > >> Someone looking at the headers sent to his browser would be very
> >> > >> confused about what's the point of «X-MF-Mode: b».
> >> > >>
> >> > >> Instead something like this would be much more descriptive:
> >> > >> X-Mobile-Mode: stable
> >> > >> X-Mobile-Request: secondary
> >> > >>
> >> > >> But that also means sending more bytes through the wire :S
> >> > > Well, you can (and should) drop the 'X-' :-)
> >> > >
> >> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-" Prefix
> >> and
> >> > Similar Constructs in Application Protocols
> >> > >
> >> > >
> >> > > --
> >> > > Ori Livneh
> >> > >
> >> > >
> >> > >
> >> > >

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Asher Feldman
Regarding varnish cacheability of mobile API requests with a logging query
param - it would probably be worth making frontend varnishes strip out all
occurrences of that query param and its value from their backend requests
so they're all the same to the caching instances. A generic param name that
can take any value would allow for adding as many extra log values as
needed, limited only by the uri log field length.

&l=mft2&l=mfstable etc.

So still an edge cache change but the result is more flexible
while avoiding changing the fixed field length log format across unrelated
systems like text squids or image caches.
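
For illustration, a minimal sketch of the transformation described above: pull
every occurrence of the logging param out of the URL so the caching layer sees
one canonical URL, while the stripped values stay available for the log line.
In production this would live in the frontend varnish config rather than
Python. The function name split_logging_params and the sample URL are
assumptions made for this sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

LOG_PARAM = "l"  # generic logging parameter name from the example above


def split_logging_params(url, log_param=LOG_PARAM):
    """Return (url_without_log_param, list_of_log_values)."""
    parts = urlsplit(url)
    kept, logged = [], []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == log_param:
            logged.append(value)  # repeated params are all collected
        else:
            kept.append((key, value))
    cache_url = urlunsplit(parts._replace(query=urlencode(kept)))
    return cache_url, logged


url = "/w/api.php?action=mobileview&page=Foo&l=mft2&l=mfstable"
print(split_logging_params(url))
# ('/w/api.php?action=mobileview&page=Foo', ['mft2', 'mfstable'])
```

Note that urlencode re-encodes the remaining params, so the byte form of the
query string can change for values containing escaped characters; a real edge
implementation would want to avoid that.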

On Sunday, February 3, 2013, Asher Feldman wrote:

> If you want to differentiate categories of API requests in logs, add
> descriptive noop query params to the requests. I.e &mfmode=2. Doing this in
> request headers and altering edge config is unnecessary and a bad design
> pattern. On the analytics side, if parsing query params seems challenging
> vs. having a fixed field to parse, deal.
>
> On Sunday, February 3, 2013, David Schoonover wrote:
>
>> Huh! News to me as well. I definitely agree with that decision. Thanks,
>> Ori!
>>
>> I've already written the Varnish code for setting X-MF-Mode so it can be
>> captured by varnishncsa. Is there agreement to switch to Mobile-Mode, or at
>> least, MF-Mode?
>>
>> Looking especially to hear from Arthur and Matt.
>>
>> --
>> David Schoonover
>> d...@wikimedia.org
>>
>>
>> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere wrote:
>>
>> > Thanks Ori, I was not aware of this
>> > D
>> >
>> > Sent from my iPhone
>> >
>> > On 2013-02-02, at 16:55, Ori Livneh wrote:
>> >
>> > >
>> > >
>> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
>> > >
>> > >> I don't like its cryptic nature.
>> > >>
>> > >> Someone looking at the headers sent to his browser would be very
>> > >> confused about what's the point of «X-MF-Mode: b».
>> > >>
>> > >> Instead something like this would be much more descriptive:
>> > >> X-Mobile-Mode: stable
>> > >> X-Mobile-Request: secondary
>> > >>
>> > >> But that also means sending more bytes through the wire :S
>> > > Well, you can (and should) drop the 'X-' :-)
>> > >
>> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-" Prefix and
>> > > Similar Constructs in Application Protocols
>> > >
>> > >
>> > > --
>> > > Ori Livneh
>> > >
>> > >
>> > >
>> > >

Re: [Wikitech-l] [Engineering] RFC: Introducing two new HTTP headers to track mobile pageviews

2013-02-03 Thread Asher Feldman
If you want to differentiate categories of API requests in logs, add
descriptive no-op query params to the requests, e.g. &mfmode=2. Doing this in
request headers and altering edge config is unnecessary and a bad design
pattern. On the analytics side, if parsing query params seems challenging
vs. having a fixed field to parse, deal.
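
For the analytics side, a minimal sketch of pulling such a param out of
the logged URI, assuming the URI field has already been extracted from
the log line. The function name and sample URIs are hypothetical;
"mfmode" is the example param from above.

```python
from urllib.parse import urlsplit, parse_qs


def request_category(uri, param="mfmode"):
    """Return the value of `param` in the URI's query string, or None.

    If the param is repeated, the last occurrence wins (an arbitrary
    choice made for this sketch).
    """
    query = parse_qs(urlsplit(uri).query)
    values = query.get(param)
    return values[-1] if values else None


tagged = request_category("/w/api.php?action=mobileview&page=Foo&mfmode=2")
untagged = request_category("/w/api.php?action=mobileview&page=Foo")
```

Requests without the param come back as None, so untagged traffic can
be counted separately from tagged traffic.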

On Sunday, February 3, 2013, David Schoonover wrote:

> Huh! News to me as well. I definitely agree with that decision. Thanks,
> Ori!
>
> I've already written the Varnish code for setting X-MF-Mode so it can be
> captured by varnishncsa. Is there agreement to switch to Mobile-Mode, or at
> least, MF-Mode?
>
> Looking especially to hear from Arthur and Matt.
>
> --
> David Schoonover
> d...@wikimedia.org 
>
>
> On Sat, Feb 2, 2013 at 2:16 PM, Diederik van Liere wrote:
>
> > Thanks Ori, I was not aware of this
> > D
> >
> > Sent from my iPhone
> >
> > On 2013-02-02, at 16:55, Ori Livneh wrote:
> >
> > >
> > >
> > > On Saturday, February 2, 2013 at 1:36 PM, Platonides wrote:
> > >
> > >> I don't like its cryptic nature.
> > >>
> > >> Someone looking at the headers sent to his browser would be very
> > >> confused about what's the point of «X-MF-Mode: b».
> > >>
> > >> Instead something like this would be much more descriptive:
> > >> X-Mobile-Mode: stable
> > >> X-Mobile-Request: secondary
> > >>
> > >> But that also means sending more bytes through the wire :S
> > > Well, you can (and should) drop the 'X-' :-)
> > >
> > > See http://tools.ietf.org/html/rfc6648: Deprecating the "X-" Prefix and
> > > Similar Constructs in Application Protocols
> > >
> > >
> > > --
> > > Ori Livneh
> > >
> > >
> > >
> > >