There definitely is - it's done for mobile, for example - and Christian and
I discussed it when I was experimenting with the sampled logs - but I can't
find the thread right now. Bah :/

On 12 December 2014 at 09:41, Andrew Otto <[email protected]> wrote:
>
> There must be some way to tag traffic as https or not at the nginx or
> varnish level, no?  Has anyone looked into this?
>
>
> On Dec 11, 2014, at 18:27, Oliver Keyes <[email protected]> wrote:
>
>
>
> On 11 December 2014 at 11:52, Christian Aistleitner <
> [email protected]> wrote:
>
>> Hi Oliver,
>>
>> On Wed, Dec 10, 2014 at 08:22:18PM -0500, Oliver Keyes wrote:
>> > So, we've had conversations about detecting SSL terminators, for two
>> > reasons:
>> > [...]
>> > So: what's the right approach? How do we find these things easily and
>> > automagically?
>>
>> The “right” approach depends a bit on the stream that you're looking
>> at. But I figure you're mostly interested in Hive data (for different
>> streams, there are other methods).
>>
>> More or less the same question got asked on the internal list on
>> Sunday. There I pointed towards pybal:
>>
>> On Sun, Dec 07, 2014 at 12:59:27PM +0100, Christian Aistleitner wrote:
>> > Hi,
>> >
>> > On Fri, Dec 05, 2014 at 03:23:45PM -0600, Aaron Halfaker wrote:
>> > > And wrote up some
>> > > brief notes in http://etherpad.wikimedia.org/p/ssl_terminators
>> >
>> > In that etherpad you wrote:
>> >
>> > Etherpad> * Scan through: https://github.com/wikimedia/operations-puppet/blob/production/manifests/site.pp
>> > Etherpad> * Look for anything with role::cache::*
>> >
>> > [...]
>> >
>> > If you want even less puppet munging, and a more robust format, you
>> > can instead go to pybal directly:
>> >
>> >   http://config-master.wikimedia.org/pybal/
>> >
>> > For example:
>> >
>> >   http://config-master.wikimedia.org/pybal/esams/text-https
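A rough sketch of reading one of those pool files, in case it helps. The line format assumed here (one Python-style dict literal per server, with `host` and `enabled` keys) is an assumption and not something stated in this thread, so check it against a real file first. The hostnames and the `parse_pool` name are made up for illustration.

```python
import ast


def parse_pool(text, only_enabled=False):
    """Return the hostnames listed in a pybal-style pool file.

    ASSUMPTION: each non-blank, non-comment line is a dict literal such as
    { 'host': 'name', 'weight': 10, 'enabled': True }.
    """
    hosts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        entry = ast.literal_eval(line)  # safe: evaluates literals only
        if only_enabled and not entry.get("enabled", False):
            continue
        hosts.append(entry["host"])
    return hosts


# Hypothetical file contents, not copied from config-master.
SAMPLE = """\
# example pool
{ 'host': 'cache-a.example.org', 'weight': 10, 'enabled': True }
{ 'host': 'cache-b.example.org', 'weight': 10, 'enabled': False }
"""

print(parse_pool(SAMPLE))
print(parse_pool(SAMPLE, only_enabled=True))
```

By default the sketch keeps disabled hosts too, since a machine that is depooled today may still show up in older logs.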
>>
>> I think that still holds true.
>>
>> Does that approach not work, or are you just trying to get the
>> response to the public list? ;-)
>>
>> If it's the former, please let me know where you think this approach
>> is failing.
>>
>> If it's the latter ... yay for using the public list! ... here you
>> go. It's on the public list :-D
>>
>>
> "yes" :D. I want to make these conversations public, and for us to bias
> more towards using the public list - but there was also a point of
> confusion on how we detected these machines, using puppet. If pybal
> clarifies it, yay!
>
> I'm not sure how to interpret the pybal, but that's probably because my
> explanation of the problem was tremendously unclear. Essentially, we want
> to exclude internal IP ranges, because they contain a lot of
> automatically-generated traffic (fundraising, I'm looking at you). So, we
> exclude all requests from IPs within our ranges. Except, then we also
> exclude all the SSL traffic, since that will appear to come from an
> internal IP address, from the point of view of the request logs.
>
> So, do I interpret this pybal as: if it's tagged as HTTPS, it's an SSL
> terminator, and so requests from those machines, from internal IP
> addresses, should be included? Or: those are the SSL machines, find out
> their IP addresses and you find out the internal IPs that represent SSLd
> requests, rather than internally-generated traffic?
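To make the second reading concrete, here is a minimal sketch of the filter, assuming that reading is the intended one: drop requests from internal ranges unless the address belongs to a known SSL terminator. The network range, the terminator address and the `keep_request` name are all placeholders, not real values.

```python
import ipaddress

# ASSUMPTION: placeholder values for illustration only.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
SSL_TERMINATOR_IPS = {ipaddress.ip_address("10.0.0.5")}


def keep_request(client_ip):
    """Return True if a request from client_ip should stay in the dataset."""
    ip = ipaddress.ip_address(client_ip)
    if ip in SSL_TERMINATOR_IPS:
        # Internal address, but it is relaying external HTTPS traffic.
        return True
    # Otherwise keep the request only if it is not from an internal range.
    return not any(ip in net for net in INTERNAL_NETWORKS)


for addr in ("10.0.0.5", "10.1.2.3", "198.51.100.7"):
    print(addr, keep_request(addr))
```

The terminator set would have to be built separately, for example by resolving the hostnames listed in the pybal pools.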
>
>
>
>> Have fun,
>> Christian
>>
>>
>> --
>> ---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
>>                            Companies' registry: 360296y in Linz
>> Christian Aistleitner
>> Kefermarkterstrasze 6a/3     Email:  [email protected]
>> 4293 Gutau, Austria          Phone:          +43 7946 / 20 5 81
>>                              Fax:            +43 7946 / 20 5 81
>>                              Homepage: http://quelltextlich.at/
>> ---------------------------------------------------------------
>>
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation

-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation
