Denny:

The best list for these kinds of questions is analytics@ (cc'ed).

> A minor question - could you also count the number of unique recurring
> user agents per month? I.e. the number of visits that return and have a
> still valid cookie (e.g. by marking the cookie after the count).
Hmm... I am not sure what you mean by "recurring", as thousands of people
can share the same user agent, right? Think "everyone in Seattle with an
iPhone and the latest OS using Safari". You can add other pieces of
information such as IP, but on mobile, and because of NAT-ing [1], a single
IP can also stand for a group of thousands of people. So using "recurring
user agents" as the basis for the main calculation will always heavily
under-report the number of unique devices.
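
To make the collision concrete, here is a small Python sketch. All the data is made up (the device names, IPs and user agent strings are hypothetical), and real logs have no "true device" column at all, which is exactly the problem. It only shows how counting by user agent, or by IP plus user agent, collapses several devices into one.

```python
# Hypothetical requests: (true_device, ip, user_agent).
# true_device is never available in real logs; it is here only to show the gap.
requests = [
    ("phone-a", "203.0.113.7", "Safari/iPhone"),
    ("phone-b", "203.0.113.7", "Safari/iPhone"),  # same carrier NAT, same UA
    ("phone-c", "203.0.113.7", "Safari/iPhone"),
    ("laptop-d", "198.51.100.4", "Firefox/Linux"),
]


def count_distinct(rows, key):
    """Count distinct values of key(row) across rows."""
    return len({key(row) for row in rows})


true_devices = count_distinct(requests, lambda r: r[0])
by_user_agent = count_distinct(requests, lambda r: r[2])
by_ip_and_user_agent = count_distinct(requests, lambda r: (r[1], r[2]))

print(true_devices, by_user_agent, by_ip_and_user_agent)
```

In this toy data the three phones behind one NAT-ed IP count as a single "user agent", with or without the IP added.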

Now, I might be missing something, as your question is brief. Maybe you can
elaborate a bit more?


> I am worried that the current number, due to the freshness offset, might
> be overreporting
The offset calculation takes IP into account when looking for freshness,
and it only keeps devices that have 1 event without cookies and 0 events
with cookies. Because of that, it is likely to under-report on mobile, due
to, again, NAT-ing and user agents being shared among many devices. We see
this in our data as smaller offset numbers for mobile projects than for
desktop projects. The methodology might over-report for a user who uses
many distinct IPs with the same browser, makes 1 request and clears cookies
after every session, but that is a far less common scenario.
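
As a rough illustration of the rule described above, here is a Python sketch. It is only my simplified reading of "1 event without cookies and 0 with cookies per IP and user agent", not the actual job, and the function name, tuple layout and sample data are all hypothetical.

```python
from collections import defaultdict


def offset_count(requests):
    """Count (ip, user_agent) fingerprints that have exactly one request
    without a cookie and no requests with a cookie.

    requests: iterable of (ip, user_agent, has_cookie) tuples (assumed layout).
    """
    without_cookie = defaultdict(int)
    with_cookie = defaultdict(int)
    for ip, user_agent, has_cookie in requests:
        fingerprint = (ip, user_agent)
        if has_cookie:
            with_cookie[fingerprint] += 1
        else:
            without_cookie[fingerprint] += 1
    return sum(
        1
        for fingerprint, n in without_cookie.items()
        if n == 1 and with_cookie.get(fingerprint, 0) == 0
    )


# Three distinct phones behind one NAT-ed IP, one cookieless request each:
# the fingerprint has 3 cookieless events, so none of them is counted.
mobile_nat = [("203.0.113.7", "Safari/iPhone", False)] * 3

# One user who changes IP and clears cookies after every session:
# each session looks like a separate fresh device.
cookie_clearer = [
    ("198.51.100.1", "Firefox/Linux", False),
    ("198.51.100.2", "Firefox/Linux", False),
    ("198.51.100.3", "Firefox/Linux", False),
]

print(offset_count(mobile_nat), offset_count(cookie_clearer))
```

The first case is the under-reporting we expect to be common on mobile; the second is the rarer over-reporting case.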

Hopefully this makes sense.


> Again, congratulations on the work! I am really happy to see the WMF not
> being dependent on a commercial traffic numbers provider anymore!
Many thanks for reading!




 [1] https://en.wikipedia.org/wiki/Network_address_translation

On Fri, Apr 8, 2016 at 10:30 AM, Denny Vrandečić <[email protected]>
wrote:

> Hi Nuria, Aaron,
>
> first congratulations on the Unique devices work! I am really impressed by
> the solution and the dataset. I am looking forward to the visualizations
> that will come out from this.
>
> A minor question - could you also count the number of unique recurring
> user agents per month? I.e. the number of visits that return and have a
> still valid cookie (e.g. by marking the cookie after the count).
>
> My reasoning is the following: knowing well that it would possibly further
> underreport the number of unique user agents, it would get rid of all user
> agents that clean their cookies out or that use some form of incognito
> mode. It would only count people who have been there, got a cookie,
> returned, and then we mark the cookie, and don't count them further until
> it expires.
>
> I am worried that the current number, due to the freshness offset [1],
> might be overreporting, and I do not agree fully with your reasoning in
> that page that this is OK. Counting only the recurring ones would clean
> that up, give a more reliable number, although it would potentially
> underreport the people who indeed only come once a month (a number I don't
> expect to be too large).
>
> It would be interesting to see these two numbers side by side.
>
> Again, congratulations on the work! I am really happy to see the WMF not
> being dependent on a commercial traffic numbers provider anymore!
>
> Cheers,
> Denny
>
>
> [1]
> https://wikitech.wikimedia.org/wiki/Analytics/Unique_Devices/Last_access_solution#How_big_of_a_percentage_does_the_offset_represent_from_the_total.3F
>
>
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
