[Trac-dev] Context Refactoring

John Hampton Tue, 25 Sep 2007 01:32:12 -0700

As is evident from previous messages, there is a big question regarding
the whole "context" stuff.[1]


I believe that the general consensus is to clean up the context stuff by
refactoring it.[2]

This post is to throw fuel on that fire, so that it doesn't languish in
back alleys or get swept under the rug.

I am writing this as one that has a pressing need for the new security
model, yet is unfamiliar with the context object as-is.  I'm trying to
understand what it should be.  My goal is to:
  a) understand what the "context" object should be
  b) bring the discussion out in the open

cmlenz created a google docs file where he started jotting down his
thoughts, and invited any devs that wanted, to view it.  osimons,
athomas (aat), and cboos have all commented on it.  While I'm not going
to reproduce it in its entirety, I am going to base my summary on it,
and specifically try to incorporate the comments left by osimons, aat,
and cboos.  I am really interested in feedback and help in filling in
my lack of understanding.

<start really long post>

So the original goals of the context were:
  * fix for rendering relative links
  * enable fine-grained permissions
  * encapsulate "everything needed" (env, req, db) in one object

This has led to the cumbersome Context object that is currently in trunk.

We're all for tossing out the last goal of encapsulating everything in
the Context object.

That leaves us with two semi-related purposes of the Context object:
  1. permissions
  2. link rendering

The basis for these is the idea of a resource descriptor.  Each resource
can be identified by a realm+identifier.

cmlenz's proposal is
{{{
  A resource descriptor is simply a list of tuples, for example:

     [('wiki', ('WikiStart', 1))]
     [('wiki', ('WikiStart', 1)), ('attachment', 'README.txt')]
     [('ticket', 42), ('attachment', 'patch.diff')]

The list represents the containment hierarchy of a resource: for
example, the attachment in the second list belongs to the "WikiStart"
wiki page.

The individual tuples in a descriptor list are of the form (realm,
identifier), where "realm" is a string, and "identifier" is any
serializable and hashable object, such as a string or tuple. For
example, wiki pages require both a page name and a version number to be
uniquely identified, whereas attachments only require the filename.
}}}

cmlenz thinks that simple data structures should be used due to their
sufficiency and low overhead as descriptors will be numerous.  As
descriptors will be used frequently, a convenience function
`.descriptor` could be added to the model objects so that it would be
easy to get the descriptor of an object.
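
As a rough illustration of the list-of-tuples idea (a sketch in
Python 3, not code from the proposal): `descriptor_key` and the
`WikiPage` class below are hypothetical names, and `.descriptor` is
written as a property rather than a function.

```python
# Descriptors as plain lists of (realm, identifier) tuples, as in the proposal.
wiki_page = [('wiki', ('WikiStart', 1))]
wiki_attachment = [('wiki', ('WikiStart', 1)), ('attachment', 'README.txt')]
ticket_attachment = [('ticket', 42), ('attachment', 'patch.diff')]


def descriptor_key(descriptor):
    """Return a hashable form of a descriptor (a list itself is not hashable)."""
    return tuple(descriptor)


class WikiPage:
    """Hypothetical stand-in for a model object; not Trac's real WikiPage."""

    def __init__(self, name, version):
        self.name = name
        self.version = version

    @property
    def descriptor(self):
        # The convenience accessor suggested above.
        return [('wiki', (self.name, self.version))]


page = WikiPage('WikiStart', 1)
# Descriptors can then serve as dictionary keys, e.g. for a cache.
cache = {descriptor_key(page.descriptor): 'cached value'}
```

Because the identifiers are hashable, the tuple form of a descriptor can
be used directly as a dictionary or set key.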

cboos tends to disagree with the use of simple data structures, because
they amount to a resource identifier rather than a resource descriptor
[more about this when talking about link rendering] and because lists of
tuples are cumbersome to manipulate.

cboos's alternative is to use a simple object that provides properties
for `.realm`, `.id`, and `.version`.  He contends that this will be
easier to extend to `.project` (multi-project support) or something in
the future.

I tend to like the use of simple data structures though admit that
`.realm`, etc. are easier to use.  However, I fail to see how the object
will convey the hierarchy of a resource properly or easily.
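
For the sake of discussion, here is one way such an object might carry
the hierarchy.  This is only a sketch: the `parent` attribute and the
`as_descriptor` method are my assumptions, not part of cboos's
suggestion as summarized here.

```python
class Resource:
    """Hypothetical resource object exposing .realm, .id and .version."""

    def __init__(self, realm, id=None, version=None, parent=None):
        self.realm = realm
        self.id = id
        self.version = version
        self.parent = parent  # assumption: link to the containing resource

    def as_descriptor(self):
        """Flatten to the list-of-tuples form, outermost container first."""
        chain = []
        node = self
        while node is not None:
            ident = node.id if node.version is None else (node.id, node.version)
            chain.append((node.realm, ident))
            node = node.parent
        chain.reverse()
        return chain


page = Resource('wiki', 'WikiStart', version=1)
readme = Resource('attachment', 'README.txt', parent=page)
```

With a parent link the two forms are interchangeable, so the choice
would come down to convenience and overhead rather than expressiveness.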

osimons also suggested the possibility of using URI paths for the
resource descriptors. For example: 'attachment/ticket/2048/2048.patch'

cboos raises concerns about this due to the need to parse the path each
time the descriptor needs to be evaluated and the ambiguity of the use
of '/'.

While I kind of agree with the parsing bit, I don't agree with the
ambiguity.  The simple solution would be to URI encode the string.  Then
'/'s are always separators, etc.  However, it could lead to some really
ugly resource descriptors.
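
A small sketch of the URI-encoding idea, using Python 3's standard
`urllib.parse`.  The helper names `encode_path` and `decode_path` and
the second example path are made up for illustration.

```python
from urllib.parse import quote, unquote


def encode_path(parts):
    """Join path segments, percent-encoding each one so '/' only separates."""
    return '/'.join(quote(str(part), safe='') for part in parts)


def decode_path(path):
    """Split on '/' and undo the percent-encoding of each segment."""
    return [unquote(segment) for segment in path.split('/')]


simple = encode_path(['attachment', 'ticket', 2048, '2048.patch'])
# -> 'attachment/ticket/2048/2048.patch'

nested = encode_path(['attachment', 'wiki', 'Dev/Proposals', 'my notes.txt'])
# -> 'attachment/wiki/Dev%2FProposals/my%20notes.txt'
```

The second result shows the ugliness, and `decode_path` shows the
parsing cost: every segment comes back as a string, so the ticket number
2048 would have to be converted again by whoever consumes it.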

With the idea of a resource descriptor/identifier/???[3] in hand, we can
tackle permissions and link rendering.

= 1. Permissions =

Due to the presence of the resource descriptor, the permission subsystem
can simply check whether or not the user has access to the specified
resource.  It doesn't appear that there would need to be many changes to
the permission system.  The biggest problem right now is that many
calls are made without specifying a realm.  I think that while a full
resource descriptor may not be necessary, a realm should be required.

cmlenz suggests:
{{{
  1. req.perm(realm, id=None)
  2. If realm is a list, treat it as a fully specified resource descriptor.
  3. Otherwise treat (realm, id) as the descriptor.
}}}
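
Rules 2 and 3 could be sketched roughly as follows.  The function name
`normalize_descriptor` is hypothetical; only the `(realm, id=None)`
signature comes from the suggestion above.

```python
def normalize_descriptor(realm, id=None):
    """Turn the arguments of the proposed req.perm(realm, id=None) into
    a descriptor list."""
    if isinstance(realm, list):
        # Rule 2: a list is already a fully specified descriptor.
        return list(realm)
    # Rule 3: otherwise (realm, id) is the single-entry descriptor.
    return [(realm, id)]
```

So `normalize_descriptor('ticket', 42)` gives `[('ticket', 42)]`, and a
realm-only call such as `normalize_descriptor('wiki')` gives
`[('wiki', None)]`.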

osimons brings up the view that access to a resource shouldn't be
allowed without the correct permissions.  This can be accomplished by
using the "context" (resource descriptor?) as the access method to
resources.  This would eliminate the need for plugin authors to remember
to check security, or be able to horribly violate it (unconsciously, at
least).

aat mentions his ideal fantasy land Trac would have three layers internally:
{{{
  1. The underlying data model (model.py) for raw data access.
  2. An internal API for accessing the data while applying permissions,
     as well as other more complex manipulations of internal resources.
     Kind of like how WikiSystem and TicketSystem exist today, but better
     defined and more stable, and applied across all Trac modules.
  3. Then finally the API consumer, which would include the Trac web
     interface as well as third party plugins.
}}}

but notes that this is out of scope for the context refactoring as the
context object hasn't really changed how the permission system is used.

cmlenz agrees with aat that that is out of scope.

One of the other benefits to a simple resource descriptor as the basis
of the permission system is that it removes the dependence on the
request object.  It is agreed that permission decisions should be based
solely on the user accessing the resource and the resource in question.

cboos raises the point that if the permission system isn't going to be
tied to the request object, then there is no reason to use req.perm as
the main entry point (other than backwards compatibility). He proposes
the rendering context [see below about link rendering] as a possibility
for storing the permission info.  Or creating a `User` object to wrap
the req.authname and permission cache.

I agree that if the permissions aren't tied to a request (which I agree
they shouldn't) then using req.perm is kind of counter-intuitive.
However, due to the need to release 0.11 sometime this millennium, I
think that sticking with req.perm is the better choice.  I think that a
separate class makes sense, but should be slated for 0.12 or 0.13.

cboos also proposes a solution to providing access to resources only if
the user has permissions.  More about this below under "Appendix R -
Resource Class".

= 2. Link Rendering =

The `Context` stuff has done the following:
{{{
Thanks to Wiki rendering contexts, the following issues could be solved:

  * relative attachments and comments TracLinks now always refer to the
correct context resource, irrelevant from where they are displayed
(ticket query with description #3711, headings #3842, timeline, etc.)
  * relative links (i.e. [#anchor see this] kind of TracLinks) are now
always referring to the correct resource (#4144)
}}}

cmlenz states:
{{{
Those two are of course related. I believe this can be implemented by
simply passing a resource descriptor to the wiki_to_html() function, and
storing it as a .resource  attribute on the Formatter  object. The link
resolvers can then look at the resource descriptor and fixup the
generating link accordingly.
...
The other thing that a context does is to determine whether links should
be absolute (include the scheme and hostname) or relative to the server
root. This had previously been an attribute on the wiki formatter
object, and I think we should be able to simply move it back there.
}}}
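
To make that concrete, here is a toy version of a formatter that keeps
the descriptor as `.resource` and uses it to resolve a relative
`#anchor` link.  Everything in it is an assumption for illustration: the
class is not Trac's real Formatter, and the URL layout, `base_path`, and
`server` values are placeholders.

```python
class Formatter:
    """Toy stand-in for the wiki formatter; not Trac's real class."""

    def __init__(self, resource, base_path='/trac', absolute=False,
                 server='http://example.org'):
        self.resource = resource    # descriptor of the text being rendered
        self.base_path = base_path  # placeholder mount point
        self.absolute = absolute    # include scheme and hostname?
        self.server = server        # placeholder host

    def resolve_anchor(self, anchor):
        """Resolve a relative '#anchor' link against self.resource."""
        realm, ident = self.resource[-1]
        if isinstance(ident, tuple):  # e.g. (page name, version)
            ident = ident[0]
        url = '%s/%s/%s#%s' % (self.base_path, realm, ident, anchor)
        return self.server + url if self.absolute else url


in_timeline = Formatter([('ticket', 42)])
in_email = Formatter([('ticket', 42)], absolute=True)
```

Wherever the ticket text is displayed, the link is built from the
descriptor rather than from the current request, which is the fix-up
described in the quote.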

cboos rebuts:
{{{
We can't simply move all the responsibilities of the rendering context
to the Formatter class, as Formatter objects are only present when
explicitly rendering Wiki text, not generally available when rendering
"content", in the mimeview module (the "content" which is rendered is
coming from resources and that content may be wiki content). So we need
something (a RenderingContext class?) that would be made available to
the content renderers, and from there to the formatter, and from the
formatter conveyed to the wiki processors.
}}}

This is where I get lost.  What cmlenz proposes seems to make sense, and
appears to be the simpler solution.  I'll admit that I haven't really
delved into those parts of the code.  Why can't the formatter be made
available to the mimeview stuff?  If it's so tied to the wiki, why are
we even trying to use it with the mimeview stuff (via Context, etc.)?
Does this tie into the Context Factories and Subclasses [see below]?  I
need some schooling here.

cmlenz mentions another feature of the Context class that is similar to
the link rendering:
{{{
In the current code, the Context  class is also responsible for
generating links to a resource. As far as I can tell, this is a separate
enhancement that is only needed for the attachments module and the newly
added "generic reports" feature.
}}}

== 2a. Context Factories and Subclasses ==
[note.  this is much of the stuff where I get kind of lost, so I hope my
summary isn't too far off the mark]

Apparently, these are used in the current Context stuff to:
{{{
  * return the model object for a context
  * construct a URL to the resource identified by the context
  * provide appropriate "name", "shortname", and "summary" properties
}}}

Currently, the attachment module is the main user of this.  It uses the
Context object to render the name and URL of the parent resource.

cmlenz proposes an IResourceManager interface to handle such things.[4]

On this topic, cboos opines:
{{{
We need some kind of dynamic description of resources, for all the parts
that are generic about resources in Trac: the attachments, the history
and diffs of resource content (#2945), the generic reports, etc.
Those things have been built some time after the WikiContext
integration, so it's true that they are separate enhancements. But I
think that they are very worthwhile ones that have to be kept and
probably expanded upon (generic comments, content change annotation, etc.)
If there's really a need to decouple the way resources can be identified
from the way resources can be described, then we could eventually have
simple Resource objects and "rendering" methods from a manager component
like the one suggested. That would actually make some parts of the
implementation simpler, as we could create very simple Resource objects
which could delegate their dynamic aspects to their managing component.
}}}

I think this is agreeing with cmlenz (more or less).  cboos does propose
a modified interface for IResourceManager.[5]

Now, I kind of like the idea of IResourceManager, but don't fully grasp
the necessity.  Given that we decide a list of tuples is the way to go
for the resource descriptor, doesn't the attachment module already have
all that it needs to generate links to the parent object?

A resource for an attachment would be:
{{{
[('wiki', ('WikiStart', 1)), ('attachment', 'README.txt')]
}}}
so the attachment module knows what its parent is.
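
A minimal sketch of what I mean, with `parent_of` as a made-up helper
name:

```python
def parent_of(descriptor):
    """Return the descriptor of the containing resource, or None at top level."""
    return descriptor[:-1] or None


attachment = [('wiki', ('WikiStart', 1)), ('attachment', 'README.txt')]
parent = parent_of(attachment)        # the wiki page's descriptor
parent_realm, parent_id = parent[-1]  # realm and identifier of the parent
```

This yields the parent's realm and identifier.  Turning those into a URL
or a display name presumably still needs realm-specific knowledge from
somewhere.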

Can someone clarify where we need the whole IResourceManager/resource
description stuff?

= Appendix R - Resource Class =

cboos's solution to osimons's security suggestion is to run all access
to resources through req.perm.  Basically add req.perm.resource() that
would return the resource if permissions allowed, or raise an error (or
return None in the case of req.perm.get_resource()).

Example usage:
{{{
page = req.perm.resource('wiki', 'WikiStart')
wiki = req.perm.resource('wiki')
wiki = req.perm.get_resource('wiki') # return None if not authorized
}}}

The proposal is a bit more extensive, but this has already been an epic.
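
For readers who want something concrete, the two calls might behave
roughly like this.  It is a toy sketch, not cboos's actual proposal and
not Trac's real PermissionCache: the `AccessDenied` exception, the
`allowed` set, and the `loader` callable are all invented here.

```python
class AccessDenied(Exception):
    """Hypothetical error type for this sketch."""


class PermissionCache:
    """Toy stand-in for req.perm."""

    def __init__(self, allowed, loader):
        self.allowed = allowed  # set of (realm, id) pairs the user may access
        self.loader = loader    # callable (realm, id) -> resource object

    def resource(self, realm, id=None):
        """Return the resource, or raise if the user lacks permission."""
        if (realm, id) not in self.allowed:
            raise AccessDenied('%s:%s' % (realm, id))
        return self.loader(realm, id)

    def get_resource(self, realm, id=None):
        """Like resource(), but return None instead of raising."""
        try:
            return self.resource(realm, id)
        except AccessDenied:
            return None


# A user who may see the WikiStart page but not the wiki realm as a whole.
perm = PermissionCache({('wiki', 'WikiStart')}, lambda realm, id: (realm, id))
page = perm.resource('wiki', 'WikiStart')
wiki = perm.get_resource('wiki')  # None, because access is not granted
```

The point of the design is that the permission check and the lookup are
one call, so a plugin cannot obtain the resource without passing the
check.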

= Conclusion =

Due to the nature of this beast, I'm inclined to say that less is more.
I do like osimons's security idea, but I agree with cmlenz and aat
that it's out of scope for 0.11.  cboos's compromise doesn't look bad,
but I think it's something that would be too rushed for 0.11.
Better to leave it out and do it properly for 0.12 or 0.13.

I don't fully understand the issues surrounding the whole "resource
description" vs "resource identification" stuff.  I don't quite see the
benefit to the IResourceManager interface.   Why is it that we really
need it in the first place?

I'm definitely for a very simple descriptor/identifier for resources.
It makes security easy and flexible.  Rendering stuff isn't needed there.

So I'd love to be schooled as to all the things that I'm not taking into
consideration, or to where I've completely missed the mark in this
discussion.  I'd also love to see some kind of consensus sooner rather
than later (as I'm sure most people would).  But I don't think that's
going to happen until we really talk about it.

Sorry for the epic novel.

-John

[1] http://groups.google.com/group/trac-dev/msg/e3543733f1c25eec
[2] http://groups.google.com/group/trac-dev/msg/e5a85040770b130e
[3] henceforth referred to as "resource descriptor" regardless of
whether it is more of a simple identifier or not.
[4]
{{{
class IResourceManager(Interface):

     def get_resource_realm():
         """Return the string identifying the realm of resources
         handled by this component."""

     def get_resource_url(descriptor, href, **kwargs):
         """Return the URL for the requested resource."""

     def resolve_resource(descriptor):
         """Return an object representing the content of the
         requested resource.

         The returned object needs to have at least the following
         properties and functions:
         * `__unicode__()`: should return a compact representation
           of the object, such as "#123" for the ticket with the ID
           123
         * `display_name`: a more verbose string representation of
           the object, for example "Ticket #123" for the ticket with
           the ID 123
         """
}}}
[5]
{{{
class IResourceManager(Interface):

     def get_resource_realms():
         """Generate realm strings identifying the realm
         of resources handled by this component."""

     def get_model(resource):
         """Return an object representing the content of
         the requested resource.
         """

     def get_resource_url(resource, href, **kwargs):
         """Return the URL for the requested resource."""

     def get_resource_description(resource, format=None):
         """Return a representation of the resource,
         according to the `format`.

         For example, the ticket with the ID 123 is
         represented like `'#123'` for the `'compact'` format,
         `'Ticket #123'` for the default `None` format.
         With the `'summary'` format, more details about the
         resource will be given, at the expense of a lookup to
         the resource's model.
         """
}}}


