Re: [Analytics] [Wiki-research-l] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

Dan Andreescu Tue, 13 Jan 2015 12:31:10 -0800

+1 Aaron

On Tue, Jan 13, 2015 at 3:24 PM, Aaron Halfaker <[email protected]>
wrote:


> Andrew,
>
> I think it is reasonable to assume that the "Do not track" header isn't
> referring to this.
>
> From http://donottrack.us/ with emphasis added.
>
>> Do Not Track is a technology and policy proposal that enables users to
>> opt out of *tracking by websites they do not visit*, [...]
>
>
> Do Not Track is explicitly about third-party tracking.  We are merely
> proposing to count those people who do access our sites.  Note that, in
> this case, we are not interested in obtaining identifiers at all, so the
> word "track" does not seem to apply.
>
> It seems like we're looking for something like a "Do Not Log Anything At
> All" header.  I don't believe that such a thing exists -- but if it did I
> think it would be good if we supported it.
>
> -Aaron
>
> On Tue, Jan 13, 2015 at 2:03 PM, Andrew Gray <[email protected]>
> wrote:
>
>> Hi Dario, Reid,
>>
>> This seems sensible enough and proposal #3 is clearly the better
>> approach. An explicit opt-in/opt-out mechanism would not be worth the
>> effort to build and would become yet another ignored preferences
>> setting after a few weeks...
>>
>> A couple of thoughts:
>>
>> * I understand the reasoning for not using do-not-track headers (#4);
>> however, it feels a bit odd to say "they probably don't mean us" and
>> skip them... I can almost guarantee you'll have at least one person
>> making a vocal fuss about not being able to opt-out without an
>> account. If we were to honour these headers, would it make a
>> significant change to the amount of data available? Would it likely
>> skew it any more than leaving off logged-in users?
>>
>> * Option 3 does release one further piece of information over and
>> above those listed - an approximate ratio of logged in versus
>> non-logged-in pageviews for a page. I cannot see any particular
>> problem with doing this (and I can think of a couple of fun things to
>> use it for) but it's probably worth being aware.
>>
>> Andrew.
>>
>> On 13 January 2015 at 07:26, Dario Taraborelli
>> <[email protected]> wrote:
>> > I’m sharing a proposal that Reid Priedhorsky and his collaborators at
>> > Los Alamos National Laboratory recently submitted to the Wikimedia
>> > Analytics Team aimed at producing privacy-preserving geo-aggregates of
>> > Wikipedia pageview data dumps and making them available to the public and
>> > the research community. [1]
>> >
>> > Reid and his team spearheaded the use of the public Wikipedia pageview
>> > dumps to monitor and forecast the spread of influenza and other diseases,
>> > using language as a proxy for location. This proposal describes an
>> > aggregation strategy adding a geographical dimension to the existing dumps.
>> >
>> > Feedback on the proposal is welcome on the lists or the project talk
>> > page on Meta [3]
>> >
>> > Dario
>> >
>> > [1] https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>> > [2] http://dx.doi.org/10.1371/journal.pcbi.1003892
>> > [3] https://meta.wikimedia.org/wiki/Research_talk:Geo-aggregation_of_Wikipedia_pageviews
>> > _______________________________________________
>> > Analytics mailing list
>> > [email protected]
>> > https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>>
>> --
>> - Andrew Gray
>>   [email protected]
>>
>> _______________________________________________
>> Wiki-research-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>

