On 11/07/2017 01:44 AM, Mikhail Popov wrote:
> By the way, the referer header would only have the search query if the user
> was using Google/Bing/etc. over HTTP, not HTTPS. For Google searchers using
> HTTPS, we'd only see they came from "https://www.google.com/", due to
> Google's "origin" meta referer setting (
> https://w3c.github.io/webappsec-referrer-policy/#referrer-policy-origin)
> 
> Since Google & Bing force you into HTTPS, we actually only end up with
> search queries from a few people who use very out of date browsers that
> don't support meta referer or HTTPS, since the latest versions of major
> browsers now do (
> https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer#Browser_compatibility)
> So keep in mind that any retrieved data would be unrepresentative of
> overall population, but it doesn't look like Lars is planning to do any
> statistical analysis.

Correct. I don't plan to do any statistical analysis myself, though I
would be open to help from someone with the skill and interest.

About the Referer header: from what I read, the header is omitted only
if "an unsecured HTTP request is used and the referring page was
received with a secure protocol (HTTPS)" [1].  That case should be rare,
since the search engines now redirect to HTTPS right away and people
would be entering their search terms in a form submitted over HTTPS.
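
As a rough illustration of the rule quoted from [1], and of how search
terms could be pulled out of a Referer value when one is present, here
is a minimal Python sketch. The function names, the "q" parameter name,
and the example.org target are assumptions made for illustration only;
real browsers also apply any referrer policy set by the source site,
which this sketch ignores.

```python
from urllib.parse import urlsplit, parse_qs


def referer_is_sent(referring_url, target_url):
    """Apply only the downgrade rule quoted from [1].

    Returns False when the referring page was received over HTTPS and
    the request goes to a plain-HTTP target, True otherwise.
    """
    source_scheme = urlsplit(referring_url).scheme.lower()
    target_scheme = urlsplit(target_url).scheme.lower()
    return not (source_scheme == "https" and target_scheme == "http")


def search_terms(referer, param="q"):
    """Split the search query found in a Referer URL into terms.

    'param' is an assumed query-string key; engines differ. Returns an
    empty list when the Referer carries no query, for example when it
    has been trimmed to the origin.
    """
    values = parse_qs(urlsplit(referer).query).get(param, [])
    return values[0].split() if values else []
```

With these helpers, a Referer that has been reduced to
"https://www.google.com/" simply yields no terms, which matches the
situation Mikhail describes.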

A spot check of a few browsers shows the Referer header in use for
these, at least when using HTTPS:

        Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101
                Firefox/56.0
        Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
                Firefox/52.0
        Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML,
                like Gecko) QupZilla/1.8.9 Safari/538.1
        Mozilla/5.0 (X11; CrOS x86_64 9765.85.0) AppleWebKit/537.36
                (KHTML, like Gecko) Chrome/61.0.3163.123 Safari/537.36
        AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.75
                Safari/537.36
        Lynx/2.8.9dev.11 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/3.5.6

The Referrer Policy is in the draft stage [2] and might or might not
affect source web sites in the future.  If it does, it looks to be a
very long way from being widely deployed: years, if ever.  So it is
unlikely to be a factor in Q1 2018 or Q2 2018.

> Another thing I'd note is that a search term may still contain sensitive
> information even outside the context of the rest of the search query. A
> phone number or an email address might show up as a single search term, and
> that's still PII.

It may be possible.  Any suggestions on work-arounds other than manual
intervention on the database results?
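
One automated option, sketched here only as a starting point, would be
to drop terms that look like an email address or a phone number before
they reach the database. The patterns and function names below are
assumptions for illustration; they are deliberately crude, will miss
some PII, and will also discard some harmless terms such as long
numeric identifiers.

```python
import re

# Assumed heuristics, not a complete PII detector.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")


def looks_like_pii(term):
    """Return True if a single search term resembles an email or phone number."""
    term = term.strip()
    if EMAIL_RE.match(term):
        return True
    # Require 7 to 15 digits so that years and runs of punctuation pass through.
    digit_count = len(re.sub(r"\D", "", term))
    return bool(PHONE_RE.match(term)) and 7 <= digit_count <= 15


def drop_pii(terms):
    """Keep only the terms that do not trip the heuristics above."""
    return [t for t in terms if not looks_like_pii(t)]
```

Something like drop_pii(["lars", "user@example.org", "referer"]) would
keep only "lars" and "referer"; anything subtler would still need a
manual pass.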

/Lars

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer

[2] https://w3c.github.io/webappsec-referrer-policy/#referrer-policy-origin
