Greeting TG Devs, 
  you might have noticed that a few months ago, TG 2.1 started rendering
Alchemy result sets as heavily nested json structures.  Before I delve
into what I mean by "heavily nested", I have to mention that the
reason for doing so is very valid and I invite you to review this
article if you haven't read it yet:

http://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx

Mercurial commit 3579effadb7b implements a solution to that
vulnerability by making sure that object are not represented by top
level lists.  Indeed, a top level json list is valid javascript but a
top level dict is not.  Therefore, an Alchemy ResultProxy is now
rendered like `{count: 12, rows: []}` and a RowProxy is seen as a
degenerated ResultProxy with count=1, so we get something like:
`{count:1, rows:[{name:"Bob", id:1}]}`.  And you probably seen where
I'm going and what I mean by heavily nested because your typical
ResultProxy will look like the following:
  
  {count: 12, rows: [
    {count:1, rows:[{name:"Bob", id:1}]},
    {count:1, rows:[{name:"Jane", id:2}]},
    {count:1, rows:[{name:"Mike", id:3}]},
    // ...
  ]}

As you can imagine, unpacking that on the client side is not fun and
thats why I started my quest for an alternative solution.  Keep in
mind that for the attack to work, you need a controller action that
returns json with a top level list on a GET request.

First thing to realize is that commit 3579effadb7b is not a complete
solution.  RowTuple still renders as a list and there are probably many
more objects out these that implements __json__() without knowledge of
that vulnerability.

One way to get rid of top level lists without shoving a lot of special
cases inside the jsonify rendering is to wrap all the json responses
in a top level dict container:

  {response: ...}

That way our earlier example becomes:

  {reponse: [{name: "Bob", id: 1}, 
             {name: "Jane", id: 2}, 
             // ...
             ]}

The drawback of this one is that it break all the existing code but
since 2.1 is not released yet, there must not no be much such code.

Another solution would be to make tg.jsonify stateful: an object would
not have the same representation if it's at the top level that the
representation that it get when nested.  In that case, we get:

    {count:12, rows: [{name: "Bob", id: 1}, 
                      {name: "Jane", id: 2}, 
                      // ...
                      ]}

Thats pretty good but it might be tricky to change the client code
radically when a response object moves from top level to a nested object.

Yet another solution, and the cleanest of all in my opinion, would be
to acknowledge the fact that all modern browsers are no longer
vulnerable and to offer the choice to the developer between heavy json
and clean json that leaves users of very old browsers vulnerable.

One way to implement that would be to have two `...@expose()`
definitions: one for clean json and the other for safe json.  The
clean json would emit warnings with `warn()` when a top level list is
rendered and it would be up to the developer to decide if he wants to
wrap his response in a top level container, to move to the heavy json,
or to accept the risk.  Here's an example:

  @expose('json')
  def foo(self):
      return model.DBSession.query(model.Emp.name, model.emp.id).all()

Produces a warning but emits

  [{name: "Bob", id: 1}, 
   {name: "Jane", id: 2}, 
   // ...
  ]

The heavy markup way:

  @expose('safejson')
  def foo(self):
      return model.DBSession.query(model.Emp.name, model.emp.id).all()

Produces:

  {count: 12, rows: [
    {count: 1, rows: [{name:"Bob", id:1}]},
    {count: 1, rows: [{name:"Jane", id:2}]},
    {count: 1, rows: [{name:"Mike", id:3}]},
    // ...
  ]}

The top-level container way:

  @expose('json')
  def foo(self):
      q = model.DBSession.query(model.Emp.name, model.emp.id)
      return dict(rows=q.all())

Does not produce a warning and emits:

  {rows: [{name: "Bob", id: 1}, 
          {name: "Jane", id: 2}, 
          // ...
          ]}

Now is the time to tackle this problem.  After 2.1 is out, it will be
too late.  What do you guys think of all these alternative solutions
to the current heavy json?

-- 
Yannick Gingras
http://ygingras.net
http://confoo.ca -- track coordinator
http://montrealpython.org -- lead organizer

Attachment: signature.asc
Description: This is a digitally signed message part.

Reply via email to