Re: Adding a middleware to match cookies

2017-01-07 Thread Tobias McNulty
On Jan 7, 2017 11:41 PM, "Jeff Willette" wrote:

> The specific case I am talking about deals with Google Analytics cookies,
> which are different for every user and sent with the request. When
> accessing request.user, I really only care about sessionid and csrftoken,
> if present. So sending a "Vary: Cookie" header back will cause all the
> unauthed/unsessioned users to miss the cache because of the GA cookies.
>
> Since I have no use for these cookies in my code, and they are only used
> for external requests to GA, eliminating them somewhere (earlier the
> better) should improve cache hits, right?


Perhaps, but the place to do that is in your edge cache servers, not Django:

* http://www.varnish-cache.org/docs/3.0/tutorial/cookies.html
* https://web.archive.org/web/20151031024029/https://www.fastly.com/blog/how-to-cache-with-tracking-cookies

I'm unclear how feasible this is (I've never tried it). It's worth noting
that the last page isn't even on Fastly's public site anymore.

In any event, I'm not seeing the case for a change to Django proper here.
If Django's cache middleware is the only cache you're using, you might be
able to accomplish something like the above via middleware, as Carl
suggested.

If you're looking for assistance with the middleware implementation, I
recommend the django-users list. If you're using another cache in front
of Django, you'll need to figure out how to implement this there, or find a
simpler route such as never setting the tracking cookies in the first
place, or splitting the request in two.

Good luck!

Tobias

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMGFDKQSHNzAKfQj7CEnxKWn7S%2B%2Bq3%2B5v%2BeiQOjJ-nPPA76CgA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Adding a middleware to match cookies

2017-01-07 Thread Jeff Willette
The specific case I am talking about deals with Google Analytics cookies,
which are different for every user and sent with the request. When
accessing request.user, I really only care about sessionid and csrftoken,
if present. So sending a "Vary: Cookie" header back will cause all the
unauthed/unsessioned users to miss the cache because of the GA cookies.

Since I have no use for these cookies in my code, and they are only used 
for external requests to GA, eliminating them somewhere (earlier the 
better) should improve cache hits, right?

On Saturday, January 7, 2017 at 8:25:10 PM UTC+9, Florian Apolloner wrote:
>
> Hi Jeff,
>
> On Saturday, January 7, 2017 at 3:50:56 AM UTC+1, Jeff Willette wrote:
>>
>> What if there was an optional middleware early in the request processing 
>> that matched cookies based on a regex in settings and then modified the 
>> header to only include the matched cookies? 
>>
>
> I do not see how this would help -- you'd still have to set "Vary: Cookie" 
> on the response as soon as you are accessing request.user. Or is the goal 
> of this to allow Django's internal page caching stuff to ignore some 
> cookies? That seems doable, but very very dangerous.
>
> This issue reminds me of another issue I came up with (or as Carl puts it: 
> "…presenting the hypothetical case that exposed this bug."), namely 
> https://code.djangoproject.com/ticket/19649 -- Basically as soon as 
> Django accesses __any__ cookie we should set "Vary: Cookie", with all the 
> downsides this entails. I think we finally should fix that and put a fix 
> for it into the BaseHandler.
>
> What would be great would be an HTTP header which allowed for something 
> ala "Cache: if-request-did-not-have-cookies" -- usually it is pointless to 
> cache __anything__ with cookies anyways. That said, with all the analytics 
> super cookies out there, there are not many pages without cookies anymore :(
>



Re: Adding a middleware to match cookies

2017-01-07 Thread Carl Meyer
On 01/07/2017 03:25 AM, Florian Apolloner wrote:
> On Saturday, January 7, 2017 at 3:50:56 AM UTC+1, Jeff Willette wrote:
> 
> What if there was an optional middleware early in the request
> processing that matched cookies based on a regex in settings and
> then modified the header to only include the matched cookies?
> 
> 
> I do not see how this would help -- you'd still have to set "Vary:
> Cookie" on the response as soon as you are accessing request.user. Or is
> the goal of this to allow Django's internal page caching stuff to ignore
> some cookies? That seems doable, but very very dangerous.

Right, the latter is how I understood it; you'd still use Vary: Cookie,
but strip some cookies before the request reaches the cache middleware.

I don't think it's too dangerous, if you're conservative about the
cookies you strip (e.g. only strip cookies that are known for sure to be
unused on the server, like Google Analytics cookies for instance.)

> 
> This issue reminds me of another issue I came up with (or as Carl puts
> it: "…presenting the hypothetical case that exposed this bug."), namely
> https://code.djangoproject.com/ticket/19649 -- Basically as soon as
> Django accesses __any__ cookie we should set "Vary: Cookie", with all
> the downsides this entails. I think we finally should fix that and put a
> fix for it into the BaseHandler.

+1

> What would be great would be an HTTP header which allowed for something
> ala "Cache: if-request-did-not-have-cookies" -- usually it is pointless
> to cache __anything__ with cookies anyways. That said, with all the
> analytics super cookies out there, there are not many pages without
> cookies anymore :(

+1. Basically analytics have already effectively broken HTTP caching as
it was designed to work.

Carl





Re: Adding a middleware to match cookies

2017-01-07 Thread Carl Meyer
On 01/06/2017 11:26 PM, Jeff Willette wrote:
> Why would this not help the efficiency of the downstream caches? Is it
> because the request has already passed through them with the cookies
> intact, and when it comes back through the response they have no way to
> know they have been stripped?

That's correct. Stripping cookies from the request in Django is far too
late to have any effect on an external cache. If the request has reached
Django, then it's already passed through any external caching proxies,
with all cookies, and the cache has already decided not to serve a
cached response. (And if the cache holds on to the response, it'll
associate it with the request it saw, which still had all its cookies.)
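
To make the point above concrete, here is a toy model of how a cache in front of Django might key its entries when the response says "Vary: Cookie". The cache_key function is purely illustrative and is an assumption of this sketch; real proxies such as Varnish have their own keying logic.

```python
import hashlib


def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus each request header named in Vary."""
    parts = [url]
    for name in vary:
        parts.append(name.lower() + "=" + request_headers.get(name, ""))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Two anonymous visitors who differ only in their analytics cookie.
visitor_a = {"Cookie": "_ga=GA1.2.111.222"}
visitor_b = {"Cookie": "_ga=GA1.2.333.444"}

# The proxy keys on the headers it received, before Django could strip anything.
key_a = cache_key("/", visitor_a, ["Cookie"])
key_b = cache_key("/", visitor_b, ["Cookie"])

# The keys differ, so visitor_b cannot be served visitor_a's cached page.
print(key_a == key_b)
```

Whatever Django does to the request afterwards, the proxy has already computed its key from the unmodified Cookie header, which is why the stripping has to happen in the proxy itself.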

Carl



Re: Adding a middleware to match cookies

2017-01-07 Thread Florian Apolloner
Hi Jeff,

On Saturday, January 7, 2017 at 3:50:56 AM UTC+1, Jeff Willette wrote:
>
> What if there was an optional middleware early in the request processing 
> that matched cookies based on a regex in settings and then modified the 
> header to only include the matched cookies? 
>

I do not see how this would help -- you'd still have to set "Vary: Cookie" 
on the response as soon as you are accessing request.user. Or is the goal 
of this to allow Django's internal page caching stuff to ignore some 
cookies? That seems doable, but very very dangerous.

This issue reminds me of another issue I came up with (or as Carl puts it: 
"…presenting the hypothetical case that exposed this bug."), namely 
https://code.djangoproject.com/ticket/19649 -- Basically as soon as Django 
accesses __any__ cookie we should set "Vary: Cookie", with all the 
downsides this entails. I think we finally should fix that and put a fix 
for it into the BaseHandler.
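
The behaviour described for the ticket could be sketched roughly as follows. This is not how Django implements anything: TrackedCookies and add_vary_cookie are hypothetical names, and plain dicts stand in for the request cookies and response headers. In a real project one would more likely call django.utils.cache.patch_vary_headers on the response.

```python
class TrackedCookies(dict):
    """A dict of cookies that remembers whether any cookie was looked up."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = False

    def __getitem__(self, key):
        self.accessed = True
        return super().__getitem__(key)

    def get(self, key, default=None):
        self.accessed = True
        return super().get(key, default)

    def __contains__(self, key):
        self.accessed = True
        return super().__contains__(key)


def add_vary_cookie(headers, cookies):
    """Append "Cookie" to the Vary header, but only if a cookie was read."""
    if not cookies.accessed:
        return headers
    existing = [v.strip() for v in headers.get("Vary", "").split(",") if v.strip()]
    if "cookie" not in (v.lower() for v in existing):
        existing.append("Cookie")
    headers["Vary"] = ", ".join(existing)
    return headers
```

A view that never touches the cookies leaves the headers alone; a view that reads even one cookie gets "Vary: Cookie" added, with the caching downsides mentioned above.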

What would be great would be an HTTP header which allowed for something ala 
"Cache: if-request-did-not-have-cookies" -- usually it is pointless to 
cache __anything__ with cookies anyways. That said, with all the analytics 
super cookies out there, there are not many pages without cookies anymore :(
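
No such header exists, but the policy it would express is easy to state in code. The function below is a hypothetical sketch of the decision a cache would make, with a plain dict standing in for the request headers.

```python
def may_serve_from_cache(request_headers):
    """Return True only when the request carried no cookies at all."""
    cookie = request_headers.get("Cookie", "")
    return cookie.strip() == ""
```

Under this policy a request with only an analytics cookie would still bypass the cache, which is exactly the problem with analytics cookies being present on nearly every request.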



Re: Adding a middleware to match cookies

2017-01-06 Thread Jeff Willette
Carl, thanks for the reply. 

Why would this not help the efficiency of the downstream caches? Is it
because the request has already passed through them with the cookies
intact, and when it comes back through the response they have no way to
know they have been stripped?

On Saturday, January 7, 2017 at 12:02:30 PM UTC+9, Carl Meyer wrote:
>
> Hi Jeff, 
>
> On 01/06/2017 06:21 PM, Jeff Willette wrote: 
> > I understand that calling is_authenticated on a user will require the 
> > session to be accessed and the vary by cookie header to be in the 
> > response, but if I understand how caching systems work then this will 
> > cause all cookies in the request to be taken into account, correct? 
>
> Yes. HTTP doesn't provide any way to say "vary only on this cookie, not 
> the others." Be nice if it did! 
>
> > What if there was an optional middleware early in the request 
> > processing that matched cookies based on a regex in settings and then 
> > modified the header to only include the matched cookies? 
> > 
> > That way...the unauthed users request will vary by cookies, but we 
> > would have removed all inconsequential cookies so all unauthed users 
> > will have the same set of cookies (likely none), and authed users 
> > will have (sessionid) or whatever else you wish to match and everyone 
> > will be happily cached correctly. 
> > 
> > Is there a hole in my thinking anywhere? Would this work as I 
> > expect? 
>
> I think it could work, yeah. It won't help the efficiency of any other 
> downstream HTTP caches, but they would still be safe (not serve anyone 
> the wrong response). And you should be able to help efficiency of 
> Django's own cache this way, if you strip cookies that Django / your 
> code doesn't care about before the request ever reaches the caching 
> middleware. Try it and experiment! 
>
> Carl 
>
>



Re: Adding a middleware to match cookies

2017-01-06 Thread Carl Meyer
Hi Jeff,

On 01/06/2017 06:21 PM, Jeff Willette wrote:
> I understand that calling is_authenticated on a user will require the
> session to be accessed and the vary by cookie header to be in the
> response, but if I understand how caching systems work then this will
> cause all cookies in the request to be taken into account, correct?

Yes. HTTP doesn't provide any way to say "vary only on this cookie, not
the others." Be nice if it did!

> What if there was an optional middleware early in the request
> processing that matched cookies based on a regex in settings and then
> modified the header to only include the matched cookies?
> 
> That way...the unauthed users request will vary by cookies, but we
> would have removed all inconsequential cookies so all unauthed users
> will have the same set of cookies (likely none), and authed users
> will have (sessionid) or whatever else you wish to match and everyone
> will be happily cached correctly.
> 
> Is there a hole in my thinking anywhere? Would this work as I
> expect?

I think it could work, yeah. It won't help the efficiency of any other
downstream HTTP caches, but they would still be safe (not serve anyone
the wrong response). And you should be able to help efficiency of
Django's own cache this way, if you strip cookies that Django / your
code doesn't care about before the request ever reaches the caching
middleware. Try it and experiment!
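
One way to arrange this in settings, as a sketch only: the dotted path myproject.middleware.CookieFilterMiddleware is a hypothetical cookie-stripping middleware, not something Django ships. The ordering assumes the per-site cache, where UpdateCacheMiddleware goes first and FetchFromCacheMiddleware goes last, so anything listed before FetchFromCacheMiddleware sees the request before the cache lookup.

```python
MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",
    # Hypothetical: strips cookies the server-side code never reads.
    "myproject.middleware.CookieFilterMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",
]
```

It is worth checking hit rates before and after the change, since the benefit depends on how many requests carry only the stripped cookies.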

Carl



Adding a middleware to match cookies

2017-01-06 Thread Jeff Willette
I recently proposed a bad fix (https://code.djangoproject.com/ticket/27686) but 
I think the problem still remains and I might have a way around it.

I understand that calling is_authenticated on a user will require the session 
to be accessed and the vary by cookie header to be in the response, but if I 
understand how caching systems work then this will cause all cookies in the 
request to be taken into account, correct?


The idea in the ticket about using AJAX requests is good, but I would prefer to 
keep the page differences in the Django logic and avoid an extra request to the 
server on every page load.

What if there was an optional middleware early in the request processing that 
matched cookies based on a regex in settings and then modified the header to 
only include the matched cookies?

That way...the unauthed users request will vary by cookies, but we would have 
removed all inconsequential cookies so all unauthed users will have the same 
set of cookies (likely none), and authed users will have (sessionid) or 
whatever else you wish to match and everyone will be happily cached correctly.

Is there a hole in my thinking anywhere? Would this work as I expect?
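
As a rough illustration of the idea, here is a minimal sketch of such a middleware. The names CookieFilterMiddleware and COOKIE_ALLOW_PATTERN are hypothetical and not part of Django. The sketch assumes that the cookies reach Django as request.META["HTTP_COOKIE"], that the cache middleware builds its key from that value, and that nothing has read request.COOKIES before this middleware runs; all three are worth verifying against the Django version in use.

```python
import re

# Hypothetical setting: only cookies whose names match this pattern are kept.
# In a real project this would probably live in settings.py.
COOKIE_ALLOW_PATTERN = re.compile(r"^(sessionid|csrftoken)$")


class CookieFilterMiddleware:
    """Drop request cookies whose names do not match COOKIE_ALLOW_PATTERN."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        raw = request.META.get("HTTP_COOKIE", "")
        kept = []
        for part in raw.split(";"):
            part = part.strip()
            if not part:
                continue
            # The cookie name is everything before the first "=".
            name = part.split("=", 1)[0].strip()
            if COOKIE_ALLOW_PATTERN.match(name):
                kept.append(part)
        if kept:
            request.META["HTTP_COOKIE"] = "; ".join(kept)
        else:
            # No cookie survived: make the request look cookieless.
            request.META.pop("HTTP_COOKIE", None)
        return self.get_response(request)
```

With this in place, a request carrying "_ga=...; sessionid=abc" would be seen by later middleware as carrying only "sessionid=abc", and a request with only analytics cookies would appear to have no cookies at all.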
