Re: json vs simplejson
On Tue, Jun 12, 2012 at 8:49 AM, Luke Plant wrote:
>
> I agree my existing program had a bug. I had simplejson installed
> because a dependency pulled it in (which means it can be difficult to
> get rid of).
>
> The thing I was flagging up was that the release notes say "You can
> safely change any use of django.utils.simplejson to json." I'm just
> saying the two differences I've found probably warrant at least some
> documentation.
>
> The second issue is difficult to argue as a bug in my program or
> dependencies. Django has moved from providing a JSONEncoder object
> that supported a certain keyword argument to one that doesn't. We could
> 'fix' it to some extent:
>
> class DjangoJSONEncoder(json.JSONEncoder):
>     def __init__(self, *args, **kwargs):
>         # Default of None avoids a KeyError when the kwarg is absent.
>         kwargs.pop('namedtuple_as_object', None)
>         super(DjangoJSONEncoder, self).__init__(*args, **kwargs)
>
> But like that, it would create more problems if the json module ever
> gained that keyword argument in the future.
>

Like loads(), json.JSONEncoder is just an alias for simplejson.JSONEncoder, and we need to support versions of simplejson down to 1.9 which is what python 2.6 ships with. This 'namedtuple_as_object' thing seems to only appear as of simplejson 2.2, which means that depending on it is a bug that appears on any system without a recent version of simplejson (for example, the version that was bundled with Django doesn't support it). Depending on this kwarg is a bug in Django, and should be fixed.

https://github.com/simplejson/simplejson/blob/namedtuple-object-gh6/simplejson/encoder.py

It's clear that people have begun to depend on the quirky ways in which simplejson diverged from its earlier codebase. I found the place where that unicode "proper behavior" was fixed, so apparently in Python's stdlib they undid the C optimizations at some point. So I was incorrect earlier, and the C speedups work "properly" with Python stdlib's patch.

http://bugs.python.org/issue11982

Basically, anyone who depended on features of simplejson added after 1.9, or its wonky optimizations, already had arguably broken code in that it only worked when simplejson is installed. I'm torn as to whether we should add a note about these subtle problems when switching to json, recommend that people switch to simplejson instead, or undeprecate django.utils.simplejson as a necessary wart (we can still stop vendoring simplejson though).

Best,
Alex Ogier

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: json vs simplejson
On 12/06/12 13:28, Alex Ogier wrote:
> Wait, 'import simplejson' works? Then that explains your problems. You
> are using a library you installed yourself that has C extensions,
> instead of the system json. If you switch to a system without
> simplejson installed, then you should see the "proper" behavior from
> django.utils.simplejson.loads(). If your program depends on some
> optimized behavior of the C parser such as returning str instances
> when it finds ASCII, it is bugged already on systems without
> simplejson. If Django depends on optimized behavior, then it is a bug,
> and a ticket should be filed.

I agree my existing program had a bug. I had simplejson installed because a dependency pulled it in (which means it can be difficult to get rid of).

The thing I was flagging up was that the release notes say "You can safely change any use of django.utils.simplejson to json." I'm just saying the two differences I've found probably warrant at least some documentation.

The second issue is difficult to argue as a bug in my program or dependencies. Django has moved from providing a JSONEncoder object that supported a certain keyword argument to one that doesn't. We could 'fix' it to some extent:

    class DjangoJSONEncoder(json.JSONEncoder):
        def __init__(self, *args, **kwargs):
            # Default of None avoids a KeyError when the kwarg is absent.
            kwargs.pop('namedtuple_as_object', None)
            super(DjangoJSONEncoder, self).__init__(*args, **kwargs)

But like that, it would create more problems if the json module ever gained that keyword argument in the future.

Luke

--
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/
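A sketch of one way around the concern about the json module later gaining the same keyword: rather than popping a fixed name, the encoder can ask the base class which keyword arguments it accepts and drop only the rest. This assumes a modern Python 3 where `inspect.signature` is available (it did not exist on the Python versions discussed in this thread), and `TolerantJSONEncoder` is a hypothetical name, not anything Django ships.

```python
import inspect
import json


class TolerantJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder that ignores keyword arguments the stdlib
    json.JSONEncoder does not accept (for example simplejson's
    'namedtuple_as_object')."""

    def __init__(self, *args, **kwargs):
        # Ask the base class which keyword arguments it really supports,
        # so nothing is dropped if json later gains the same option.
        accepted = inspect.signature(json.JSONEncoder.__init__).parameters
        supported = {k: v for k, v in kwargs.items() if k in accepted}
        super().__init__(*args, **supported)


# The unknown keyword is silently dropped; the known one is honoured.
encoder = TolerantJSONEncoder(namedtuple_as_object=True, sort_keys=True)
encoded = encoder.encode({"b": 1, "a": 2})
```

The trade-off is that silently discarding options can hide real mistakes such as misspelled keyword names, so this is probably only reasonable as a compatibility shim.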
Re: json vs simplejson
On Tue, Jun 12, 2012 at 7:19 AM, Luke Plant wrote:
>
> There is another issue I found.
>
> Django's DateTimeAwareJSONEncoder now subclasses json.JSONEncoder
> instead of simplejson.JSONEncoder. The two are not perfectly compatible.
> simplejson.dumps() passes the keyword argument 'namedtuple_as_object' to
> the JSON encoder class that you pass in, but json.JSONEncoder doesn't
> accept that argument, resulting in a TypeError.
>
> So any library that uses Django's JSONEncoder subclasses, but uses
> simplejson.dumps() (either via 'import simplejson' or 'import
> django.utils.simplejson') will break. I found this already with
> django-piston.
>

Wait, 'import simplejson' works? Then that explains your problems. You are using a library you installed yourself that has C extensions, instead of the system json. If you switch to a system without simplejson installed, then you should see the "proper" behavior from django.utils.simplejson.loads(). If your program depends on some optimized behavior of the C parser such as returning str instances when it finds ASCII, it is bugged already on systems without simplejson. If Django depends on optimized behavior, then it is a bug, and a ticket should be filed.

Best,
Alex Ogier
Re: json vs simplejson
On Jun 12, 2012 6:54 AM, "Luke Plant" wrote:
> I've found the same difference of behaviour on both a production machine
> where I'm running my app (CentOS machine, using a virtualenv, Python
> 2.7.3), and locally on my dev machine which is currently running Debian,
> using the Debian Python 2.7.2 packages.
>
> In both cases, json is always returning unicode objects, which implies I
> don't have the C extensions for the json module according to your
> analysis. I don't know enough about how this is supposed to work to
> understand why.
>

I'm not sure why no one is getting speedups from simplejson, but I can tell you that on python 2.6+ django.utils.simplejson.loads should be an alias for json.loads:

    >>> import json
    >>> json.loads('{"a":"b"}')
    {u'a': u'b'}
    >>> from django.utils import simplejson
    >>> simplejson.loads('{"a":"b"}')
    {u'a': u'b'}
    >>> json.loads == simplejson.loads
    True

Best,
Alex Ogier
Re: json vs simplejson
On 12/06/12 10:58, Vinay Sajip wrote:
>
> I'm not sure there's any easy way out, other than comprehensive
> testing.

There is another issue I found.

Django's DateTimeAwareJSONEncoder now subclasses json.JSONEncoder instead of simplejson.JSONEncoder. The two are not perfectly compatible. simplejson.dumps() passes the keyword argument 'namedtuple_as_object' to the JSON encoder class that you pass in, but json.JSONEncoder doesn't accept that argument, resulting in a TypeError.

So any library that uses Django's JSONEncoder subclasses, but uses simplejson.dumps() (either via 'import simplejson' or 'import django.utils.simplejson') will break. I found this already with django-piston.

I think we at least need a bigger section in the release notes about this.

Luke

--
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/
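The TypeError described above can be reproduced without simplejson installed. This is a minimal sketch that simply hands the stdlib encoder the extra keyword; how simplejson.dumps() actually builds the encoder is not shown here, and `error_message` is just an illustrative variable name.

```python
import json

try:
    # The stdlib encoder does not know this simplejson-only keyword,
    # so constructing it this way is rejected.
    json.JSONEncoder(namedtuple_as_object=True)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

Any subclass that forwards its keyword arguments unchanged to json.JSONEncoder fails in the same way.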
Re: json vs simplejson
On 12/06/12 06:14, Alex Ogier wrote:
> This seemed strange to me because the standard library json shipping
> with python 2.7.3 is in fact simplejson 2.0.9, so I did some digging.
> It turns out that if the C extensions have been compiled and you pass
> a str instance to loads(), then you get that behavior in both
> versions. This isn't documented anywhere, but here's the offending
> pieces:
>
> http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Modules/_json.c#l419
> https://github.com/simplejson/simplejson/blob/master/simplejson/_speedups.c#L527
>
> If the C extensions aren't enabled, or you pass a unicode string to
> loads(), then you get the "proper" behavior as documented. I'm not
> sure how you are triggering this optimized, iffy behavior in
> django.utils.simplejson though, without also triggering it in the
> standard library. Did you ever install simplejson with 'pip install
> simplejson' such that Django picked it up? Can you try running 'from
> django.utils import simplejson; print simplejson.__version__'?

Thanks for digging into that. (BTW, in reply to Vinay, yes I meant "from simplejson to json", not the other way around).

I've found the same difference of behaviour on both a production machine where I'm running my app (CentOS machine, using a virtualenv, Python 2.7.3), and locally on my dev machine which is currently running Debian, using the Debian Python 2.7.2 packages.

In both cases, json is always returning unicode objects, which implies I don't have the C extensions for the json module according to your analysis. I don't know enough about how this is supposed to work to understand why.

It also implies I'm probably not the only one affected by this, if it's happened on two quite different machines. Looking at this discussion:

http://stackoverflow.com/questions/712791/json-and-simplejson-module-differences-in-python

it seems that lots of people don't have the C extension for json (reporting json 10x slower than simplejson).

Luke

--
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/
Re: json vs simplejson
On Jun 11, 10:51 pm, Luke Plant wrote:
> We've switched internally from json to simplejson. Our 1.5 release notes
> say:

Do you mean the other way around?

> You can safely change any use of django.utils.simplejson to json
>
> I just found a very big difference between json and simplejson
>
> >>> simplejson.loads('{"x":"y"}')
> {'x': 'y'}
> >>> json.loads('{"x":"y"}')
> {u'x': u'y'}
>
> i.e. simplejson returns bytestrings if the string is ASCII (it returns
> unicode objects otherwise), while json returns unicode objects always.
>
> This was, unfortunately, a very unfortunate design decision on the part
> of simplejson - json is definitely correct here - and a very big change
> in semantics. It led to one very difficult to debug error for me already.

Right. And on Python 3, the json module (correctly) doesn't accept byte-strings at all.

> So, this is a shout out to other people to watch out for this, and a
> call for ideas on what we can do to mitigate the impact of this. It's
> likely to crop up in all kinds of horrible places, deep in libraries
> that you can't do much about. In my case I was loading config, including
> passwords, from a config file in JSON, and the password was now
> exploding inside smtplib because it was a unicode object.

This is one place where there are limitations in the 2.x stdlib - other places include cStringIO and cookies. For example, if you pass a Unicode object to a cStringIO.StringIO, it doesn't complain, but does the wrong thing:

    >>> from cStringIO import StringIO; StringIO(u'abc').getvalue()
    'a\x00b\x00c\x00'
    >>>

Fun and games ... I'm not sure there's any easy way out, other than comprehensive testing.

Regards,

Vinay Sajip
Re: json vs simplejson
On Mon, Jun 11, 2012 at 5:51 PM, Luke Plant wrote:
>
> i.e. simplejson returns bytestrings if the string is ASCII (it returns
> unicode objects otherwise), while json returns unicode objects always.
>

This seemed strange to me because the standard library json shipping with python 2.7.3 is in fact simplejson 2.0.9, so I did some digging. It turns out that if the C extensions have been compiled and you pass a str instance to loads(), then you get that behavior in both versions. This isn't documented anywhere, but here's the offending pieces:

http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Modules/_json.c#l419
https://github.com/simplejson/simplejson/blob/master/simplejson/_speedups.c#L527

If the C extensions aren't enabled, or you pass a unicode string to loads(), then you get the "proper" behavior as documented. I'm not sure how you are triggering this optimized, iffy behavior in django.utils.simplejson though, without also triggering it in the standard library. Did you ever install simplejson with 'pip install simplejson' such that Django picked it up? Can you try running 'from django.utils import simplejson; print simplejson.__version__'?
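Whether the stdlib json module actually loaded its C accelerator can be checked directly rather than inferred from the return types. A small sketch, relying on the fact that json.decoder falls back to a pure-Python string scanner and leaves `c_scanstring` as None when the `_json` extension cannot be imported; `json_has_c_speedups` is a hypothetical helper name.

```python
import json.decoder


def json_has_c_speedups():
    """Return True if the stdlib json decoder loaded its C string scanner."""
    # json.decoder sets c_scanstring to None when the _json extension
    # module could not be imported.
    return json.decoder.c_scanstring is not None


print(json_has_c_speedups())
```

This only reports on the standard library module; a separately installed simplejson has its own speedups and would need its own check.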
Re: json vs simplejson
The other thing this breaks is using **kwargs with something loaded from JSON.
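Presumably this refers to older Python 2 releases, where keyword names passed through ** had to be byte strings, so a dict with unicode keys could not be expanded into keyword arguments. A minimal sketch of one workaround is to coerce the keys to the native str type before the call; `connect` and `native_str_keys` are hypothetical names used only for illustration.

```python
import json


def connect(host, port=25):
    # Hypothetical function standing in for any callable taking keywords.
    return "%s:%d" % (host, port)


def native_str_keys(mapping):
    """Return a copy of mapping whose keys are native str objects."""
    return dict((str(key), value) for key, value in mapping.items())


settings = json.loads('{"host": "localhost", "port": 2525}')
address = connect(**native_str_keys(settings))
```

On Python 3 the keys are already str, so the conversion is a harmless no-op there. Note that str() on a non-ASCII unicode key would fail on Python 2, which is acceptable here because keyword names have to be plain identifiers anyway.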
json vs simplejson
Hi all,

We've switched internally from json to simplejson. Our 1.5 release notes say:

    You can safely change any use of django.utils.simplejson to json

I just found a very big difference between json and simplejson

    >>> simplejson.loads('{"x":"y"}')
    {'x': 'y'}
    >>> json.loads('{"x":"y"}')
    {u'x': u'y'}

i.e. simplejson returns bytestrings if the string is ASCII (it returns unicode objects otherwise), while json returns unicode objects always.

This was, unfortunately, a very unfortunate design decision on the part of simplejson - json is definitely correct here - and a very big change in semantics. It led to one very difficult to debug error for me already.

So, this is a shout out to other people to watch out for this, and a call for ideas on what we can do to mitigate the impact of this. It's likely to crop up in all kinds of horrible places, deep in libraries that you can't do much about. In my case I was loading config, including passwords, from a config file in JSON, and the password was now exploding inside smtplib because it was a unicode object.

Yuck. Ideas?

Luke

--
OSBORN'S LAW
Variables won't, constants aren't.

Luke Plant || http://lukeplant.me.uk/
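One possible mitigation, sketched here under the assumption that the caller knows which decoded values must be byte strings (as with the smtplib password case on Python 2), is to encode the text explicitly after loading instead of relying on what the parser happens to return. `encode_text` and the sample config are hypothetical, and ASCII is only an assumed default encoding.

```python
import json

text_type = type(u"")  # unicode on Python 2, str on Python 3


def encode_text(value, encoding="ascii"):
    """Recursively encode text in decoded JSON data to byte strings."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    if isinstance(value, list):
        return [encode_text(item, encoding) for item in value]
    if isinstance(value, dict):
        return dict(
            (encode_text(k, encoding), encode_text(v, encoding))
            for k, v in value.items()
        )
    return value  # numbers, booleans and None pass through unchanged


config = encode_text(json.loads('{"password": "s3cret", "port": 25}'))
```

Doing the conversion at the boundary where bytes are required keeps the behaviour the same whether json or simplejson did the parsing. Non-ASCII text raises UnicodeEncodeError with the default encoding, which is arguably better than passing a surprising type further down.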