Re: [Freeipa-devel] [PATCH 82] Compliant client side session cookie behavior

John Dennis Fri, 07 Dec 2012 09:37:33 -0800

On 11/12/2012 12:39 PM, Petr Viktorin wrote:

On 11/11/2012 11:18 PM, John Dennis wrote:

In the future please do not quote the entire patch in the email. Itmakes it much too difficult to respond. It took me a long time to removeall the unnecessary text for this reply.

New patch following in a moment, as well as the response to your secondemail.

Cookie library issues:
I decided to solve the cookie library problems by writing a minimal
cookie library that does what we need and no more than that. It is a
new module in ipapython shared by both client and server and comes
with a new unit test. The module has plenty of documentation, no need
to repeat it here.


This sucks. Designing, implementing and maintaining our own version of
general code is never good. I'd like to discuss with the other
developers so everyone knows what's happening in this patch.

Also, API design is a hard thing to do. I'd prefer if we made a subset
of an existing API (especially if there's any hope of the standard
library being fixed) rather than invented a new one.


Yes it does suck, no one was more dismayed about this than
myself. Here is bug references and expanded justification.

The inability of Cookie.py to parse an Expires attribute was opened
6/10/2008, approximately 4.5 years ago.

http://bugs.python.org/issue3073

To this day it remains open and unfixed. I discovered the bug report
when I also ran afoul of the failed behavior. It was shortly
thereafter I also discovered Cookie.py could not parse the HttpOnly
and Secure boolean flags. There are only 7 defined items in a cookie
and Cookie.py fails to parse 3 of them, close to a 50% failure
rate. The bug has stayed open 4.5 years without a fix making into the
upstream distribution. If we were dependent on getting Cookie.py fixed
we would have to make the core Python packages in RHEL 5, RHEL 6, RHEL
7, Fedora 16, Fedora 17, and Fedora 18 were patched. Tracking and
making sure the fix gets into all those distributions is a signifcant
effort, but at least we own those distributions and have some control
over them. But wait! We also expect the IPA clients to install and
successfully run on other systems as well, Debian, Ubuntu, Solaris,
etc. Who is going to make sure Cookie.py has been patched on all the
releases of all those systems? Frankly I have no confidence the
various bugs in Cookie.py will be taken seriously by upstream after
it's languished 4.5 years without a fix making it into an upstream
release. Nor is it just these 3 parse bugs. I can demonstrate many
other ways Cookie.py fails in both cookie generation and cookie
parsing.

The amount of work needed to assure patched versions of
Cookie.py was available in all the systems we hope to deploy the IPA
client on would be huge. Compare that to spending less than a day
writing a new module and unit test that works and installs with our
client. To my thinking it's a no-brainer which is the better choice
for the team.

But there is another issue with Cookie.py. It's API is horrible, it's
difficult to use and important features for working with cookies are
completely absent. That's just at the API level, it's actual
implementation is frightenly bad. You ask, why don't we at least make
the new module API compatible so we can swap it out later when
upstream finally gets around to fixing the problems. Emulating a
terrible API and then having to extend it into something usable
doesn't make sense to me, nor does it seem a good use of team
resources nor beneficial to the IPA product.

The bug I opened about the other issues in Cookie.py is:

http://bugs.python.org/issue16611

Please add the following paragraph to the cookie.py docstring, so people
know our implementation is incomplete:

It doesn't have every bell and whistle nor can it
handle all the cookie scenarios found in the wild with random servers
and browsers, it does what we need in an RFC manner (but not all RFC
features).

The implementation is complete because it follows the RFC. I was perhapsbeing too cautious. A lot of the extra code in the other librariesexists to accmodate behavior prior to standardization. Those days arehopefully long behind use. Plus our use of cookies is very constrained,we're only going to connect to a server we control. I'm not reallyworried at this juncture about historical misbehavior. I'm pretty surethe implementation will catch non-RFC compliant behavior and reject itwhich is proably the right thing to do.

The only thing I think might be missing is automatically escaping andunescaping reserved characters. But the RFC is actually mum on thistopic. It defines a legal character set which the parsing code observes.The libary does not currently validate all values being assigned to acookie do not contain invalid charaters. It's currently up to the callerto supply valid URL components for the domain and path components whichmight be subject to escaping with HTML entities. Yes, that would be agood thing to add in the future. Is it essential for getting this in? NoI don't think so. We won't assigning anything invalid into to the cookiebecause we've pre-validated everything prior to invoking this code. I'veadded a comment to this effect in the docstring.


Request URL issues:

We also had problems in rpc.py whereby information from the request
which is needed when we process the response is not available. Most
important was the requesting URL. It turns out that the way the class
and object relationships are structured it's impossible to get this
information. Someone else must have run into the same issue because
there was a routine called reconstruct_url() which attempted to
recreate the request URL from other available
information. Unfortunately reconstruct_url() was not callable from
inside the response handler. So I decided to store the information in
the thread context and when the request is received extract it from
the thread context. It's perhaps not an ideal solution but we do
similar things elsewhere so at least it's consistent. I removed the
reconstruct_url() function because the exact information is now in the
context and trying to apply heuristics to recreate the url is probably
not robust.


I'm no fan of global state, but I guess there's no other way :(

Neither am I. But this is a classic example of how the core Pythonlibraries are messed up.

+from ipapython.ipa_log_manager import *


I understand star imports are current practice for ipa_log_manager, but
I believe it is time to start changing it. Especially since you're just
using one of the 8 names you're importing.
But if you don't agree, keep it and let's save the argument for another day.


Fixed.

Here and in other places, Git complains about whitespace at end of line.


Fixed.

+    def apply_session_cookie(self, url):

This method does too much. Can you split it into one that updates the
global state (and possibly stores the URL there), and another that
modifies the URL?

Refactored by splitting out some of the generic logic intoxmlclient.get_session_cookie_from_persistent_storage()and Cookie.http_return_ok(). This makes the routine a smaller andisolates some common logic. The logic to decide if the cookie should bereturned properly belongs in the Cookie class so moving it there made alot of sense and it should have been done that way originally. The unittest was also updated to test the new functionality. Plus the logic wasexpanded to closely follow the RFC requirements.

Refactoring the way you suggest does not work because of the logic increate_connection(). I realize it's a bit awkward but I don't want torefactor too much of the existing code in search of a more elegantorganization. creation_connection() is critical code and the risk ofbreaking something in search of a more elegant organization is notjustified just prior to a release.

+        # O.K. session_cookie is valid to be returned, stash it away where it 
will will
+        # get included in a HTTP Cookie headed sent to the server.
+        self.log.debug("setting session_cookie into context '%s'", 
session_cookie)
+        setattr(context, 'session_cookie', session_cookie.http_cookie())


I don't understand why setattr is needed here. Why would a simple
      context.session_cookie = session_cookie.http_cookie()
not work?
More of the same below.

This is current practice across the entire code base. I don't know whyit's done that way. I see little value in deviating from existingpractice, that only creates more confusion without benefit. Open a codeclean-up ticket on this issue if you wish, that will assure the idiomconsistent everywhere.

-                if session_data and e.errcode == 401:
+                if context.has_key('session_cookie') and e.errcode == 401:


What is has_key? Did you mean to use hasattr?


Good catch! Yes I meant hasattr.

                       # Unauthorized. Remove the session and try again.
-                    delattr(context, 'session_data')
+                    delattr(context, 'session_cookie')


Since you're touching the code, can you switch to `del
context.session_cookie`? (again, ignore this comment if there's some
deeper magic you're working around)
More of the same below.

Again, because this is the current idiom, see above comment aboutconsistency and opening a code clean-up ticket.

+    # Debug flag for class
+    debug = False


We have a fancy logging module that allows us to easily tune loggers for
individual classes. Please use it instead of inventing a mechanism based
on class-wide flags and print statements.

Opps, sorry, that was left over from when I developed the codestand-alone and forgot to delete it. Bye-Bye!

+
+    @classmethod
+    def datetime_to_time(cls, dt):


As far as I can see, this method isn't used anywhere. Why add it?

Because it would not be unreasonable to need to convert a naive datetimeobject in UTC to a UNIX time value (in fact I needed that in the unittest). I discovered it's not obvious how to do this correctly. Iscratched my head pouring over the Python doc for the datetime and timemodules and couldn't figure it out. A Google search revealed I was notalone in having trouble figuring this out. So rather than having othersget it wrong or waste more time figuring it out I thought it reasonableto provide the utility.

+    @classmethod
+    def datetime_to_string(cls, dt=None):



Why is this a class method and not a normal function? It doesn't use cls
at all, and there's no value in overriding it in subclasses.
Same for the other classmethods.

Good question, I think my reasoning for doing it this way is validthough. Handling datetime objects with timezones is awkward, it lookslike timezone support was added after datetime objects were in use.Instead of having two datetime classes (one local and one with zonesupport) datetime objects have an internal state which makes them naiveor timezone aware. By default you get naive objects. Adding zone supportadds a bit of complexity that's not natively part of the module. This isfurther complicated by the fact you can't mix naive and aware objects(comparisons will raise exceptions, etc.). The best approach is to usethe default naive object and follow a convention over the zone they'rein (e.g. UTC).

Aside from the time zone conventions there are strict requirements tomeet RFC requirements on the string representation of these values whenused in the context of a cookie. We need some way to assure the datetimevalues adhere to the requirements of the class.

That makes it reasonable to "attach" methods observing the usagerequirement in the class to the class. Making them stand alone functionsdoes nothing to indicate nor enforce there is a specific context inwhich these need to be applied, class methods provide that context.

The other uses of class methods are several of the parsing methods. Theyare class methods because they act as a class factory returning one ormore class instances.

+        s = datetime.datetime.strftime(dt, '%a, %d %b %Y %H:%M:%S')
+        return s + ' GMT'


AFAIK `return datetime.datetime.strftime(dt, '%a, %d %b %Y %H:%M:%S
GMT')` would work just as well.


Sure, no problem, changed.

+    @property
+    def expires(self):
+        '''
+        The expiration timestamp (in UTC) as a datetime object for the
+        cookie, or None if not set.
+
+        You may assign a value of None, a datetime object, a numeric
+        UNIX timestamp (seconds since the epoch UTC) or formatted time
+        string (the latter two will be converted to a datetime object.
+        '''
+        return self._expires
+
+    @expires.setter
+    def expires(self, value):
+        if value is None:
+            self._expires = None
+        elif isinstance(value, datetime.datetime):
+            self._expires = value
+        elif isinstance(value, (int, long, float)):
+            self._expires = datetime.datetime.utcfromtimestamp(value)
+        elif isinstance(value, basestring):
+            self._expires = Cookie.parse_datetime(value)
+        else:
+            raise TypeError('value must be datetime, int, long, float, 
basestring or None, not %s' % \
+                            value.__class__.__name__)


Are the automatic conversions necessary? I'd rather call an explicit
set_expires_from_timestamp than have the attribute silently changed,
with semantics depending on the type.

This is the wrong kind of “easy to use”. It should be “easy to read and
understand”, not “less characters to type” (with magic to determine what
you meant).

Automatic conversions are valuable and useful. The value stored in theobject *must* be of the type the object expects. However it's notuncommon to have the value be in some other format. The goal is


* Make the API easy to use.

* Assure the API is used correctly.

* Not to obscure the callers logic with conversion logic every place theAPI is used. It is vastly more important to keep the calling clean andeasy to read because it's used far more often than the single place theclass is implemented.

It is correct to put this logic in exactly one location, namely theclass implementation. If the class implementation has a bit ofcomplexity that's appropriate, far better the complexity is in the classand not widely distributed across all uses of the class where it willmake reading the logic harder and be prone to incorrect usage. Automatictype conversion is common practice in other object orientated languagefor good reason.

+    def set_attr(self, name, value):
+        '''
+        Sets one of the predefined cookie attributes.
+        '''
+        attr_name = Cookie.attrs.get(name.lower(), None)
+        if attr_name is None:
+            raise ValueError("unknown cookie attribute '%s'" % name)
+        setattr(self, attr_name, value)


Please don't include this in the public interface, there's no need for
users to call it.
It would be much better to inline the function in the only place that
calls it.

Good point but I'd rather see it separated in a subroutine rather thancluttering up the other logic. I made it a class private method.

+
+    def __str__(self):
+        result = "%s=%s;" % (self.key, self.value)
+
+        if self.domain is not None:
+            result += " Domain=%s;" % self.domain
+
+        if self.path is not None:
+            result += " Path=%s;" % self.path
+
+        if self.max_age is not None:
+            result += " Max-Age=%s;" % self.max_age
+
+        if self.expires is not None:
+            result += " Expires=%s;" % Cookie.datetime_to_string(self.expires)
+
+        if self.secure:
+            result += " Secure;"
+
+        if self.httponly:
+            result += " HttpOnly;"
+
+        return result[:-1]      # strip trainling semi-colon


Concatenating strings is slow; please consider building a list and then
`return '; '.join(result)`.


O.K., fixed.



--
John Dennis <jden...@redhat.com>

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH 82] Compliant client side session cookie behavior

Reply via email to