Hi Jason, nice proposal. We were discussing it today at Cloudant and
had a few comments:
On Sep 7, 2009, at 6:50 PM, Jason Davies wrote:
Hi all,
There have been sporadic discussions about various granularities of
authorization. The simplest level to tackle is per-db
authorization. What follows is a summary of discussions and ideas
so far.
I should point out that this is primarily to flesh out the default
authorization modules that address the needs of the majority of
users. We will probably have an authorization_handlers setting,
analogous to authentication_handlers, allowing custom authorization
modules to be used.
1. Where are the permission "objects" themselves stored? The
permissions determine which users can do what with each database. I
think storing these in the per-node users database (called "users"
by default) makes the most sense. We are talking about per-db auth
so it wouldn't make any sense to store this information in the
affected databases themselves.
I think it's actually pretty sensible to store some authz information
in the DB itself, for many of the same reasons outlined by Brian and
Benoit. The big exception there is the ability to create new DBs.
That's traditionally the task of a server admin, but perhaps we could
come up with some special role that could be granted to users to allow
them to do that.
2. What types of operations do we need to support? I think the
majority of users will only care about being able to make particular
databases read-only, read/write, or write-only (not sure about the
latter one).
I think write-only is a keeper. It may also be useful to distinguish
between creating new documents and updating existing ones. For
instance, SQL GRANT tables distinguish between INSERT and UPDATE.
On the other hand, our REST interface doesn't have that clean
distinction; PUT and POST can both be used for create and update. And
I agree with Chris; mapping authz to REST verbs is a good smell. In
the rest of the discussion I've assumed that mapping.
3. How do we implement these operations using the existing user_ctx
{name=..., roles=[...]} object? I don't think we necessarily need
to set any special roles, although this was my initial thought e.g.
['_read', '_write'] on a per-db basis. As authorization is a
separate module, we can simply pass the appropriate permission (read
and/or write) through when opening the db internally in the httpd db
handler function. The db-opening function will then need to throw
an error if writes are attempted and it is in read-only mode. Using
actual roles is potentially more elegant, as custom roles could also
be set using the permission objects and implementation might be
easier.
+1 for adding elements to the roles array in the #user_ctx. More on
this in 5.1.
4. One use-case we need to bear in mind is being able to grant/deny
access to sets of databases at a time. One way to do this would be
to allow patterns to be specified, for example:
{
"_id": "foo",
"type": "permission",
"username": "jason",
"match": "jason/*",
"operations": ["_read"]
}
This would grant the user "jason" read-only access to any database
that has the prefix "jason/".
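To make the matching concrete, here is a minimal sketch in Python rather than anything CouchDB actually ships. It assumes shell-style globbing for the "match" field (so "?" and "[...]" would also be treated as wildcards), and the helper name allowed_operations is made up for illustration.

```python
from fnmatch import fnmatchcase

# Permission document mirroring the example above.
permission = {
    "_id": "foo",
    "type": "permission",
    "username": "jason",
    "match": "jason/*",
    "operations": ["_read"],
}


def allowed_operations(permissions, username, db_name):
    """Collect the operations granted to `username` on `db_name`.

    Hypothetical helper: assumes "match" is a shell-style glob and
    that a user with no matching permission gets no operations.
    """
    ops = set()
    for perm in permissions:
        if perm["username"] == username and fnmatchcase(db_name, perm["match"]):
            ops.update(perm["operations"])
    return ops


print(allowed_operations([permission], "jason", "jason/blog"))
```

A database named "jason/blog" matches the pattern, while "other" or a request from a different user yields an empty set.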
5. Permissions per roles vs permissions per users? Although the
above example specifies access for a particular user, it might be
more elegant and efficient to do this per role instead. If per user
is needed this can be done by giving the user a special role unique
to them. If a user has multiple roles then we would take the union
of the resulting permission set.
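The union rule could look roughly like the following sketch. The role names and the role-to-operations mapping are invented for the example; the "user:jason" role stands in for the "special role unique to them" idea.

```python
def permissions_for(user_roles, role_permissions):
    """Return the union of the operations granted by each of the user's roles.

    `role_permissions` is a hypothetical mapping of role name to a list
    of operations; roles with no entry contribute nothing.
    """
    granted = set()
    for role in user_roles:
        granted |= set(role_permissions.get(role, ()))
    return granted


# Illustrative data only.
role_permissions = {
    "readers": ["_read"],
    "writers": ["_write"],
    "user:jason": ["_read"],  # per-user role unique to "jason"
}

print(permissions_for(["readers", "writers"], role_permissions))
```

Because the result is a union, adding a role can only widen a user's access; expressing a deny would need something beyond this scheme.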
+1 for roles here. I think it makes sense for the users DB to define
roles for each user, either by adding roles to a user document or
users to a role document (or both). But the actual specification of
privileges for a role in a given DB should go in the DB.
I realize this doesn't allow for easy configuration of privileges
across multiple DBs.
5. Default settings: we already have the require_valid_user setting,
which forces a node to authenticate users. We would need to support
certain access permissions for non-logged-in users i.e. anonymous
users. This could be done using a special "_anonymous" string in
the permission to override the default, which would probably be
read/write for everyone, as it is now.
6. Future work: thisfred suggested that the pattern-matching could
be extended to the full URL instead of just the database name. This
seems like a simple way to extend authorization. Of course, it's
dependent on a particular node's URL mappings (these can be changed
in the .ini). This then brings up the question of what the
operations should be. It would make the most sense to let them be
HTTP verbs, so that one could restrict access to certain URLs to
being only GET and HEAD for example. This seems a bit too tied to
HTTP for my liking, but I guess CouchDB is very much a RESTful and
therefore HTTP-reliant database. Any further ideas would be welcomed.
So, after giving this some thought I'm partial to the idea of Access
Control Lists. Instead of directly granting privileges on databases in
the users DB, we'd store an ordered list in the DB in a special
document that would allow|deny requests that match a rule. For
instance, if I wanted to make a read-only DB where only I could access
the _design documents I could upload a document like
{
"_id": "_authorization",
"_rev": "1-1340514305943",
"_acl": [
{"access":"allow", "role":"kocolosk", "method":"*",
"path":"*"},
{"access":"deny", "role":"*", "method":"*",
"path":"_design*"},
{"access":"allow", "role":"*", "method":"GET",
"path":"*"},
{"access":"deny", "role":"*", "method":"*",
"path":"*"}
]
}
The rules in the ACL array are applied in order, and the first rule to
match wins. Here I've assumed that my user has a corresponding role,
like a UNIX group.
I explicitly listed the deny rule at the end, but we could make that
the default if we wished. CouchDB has historically been pretty open,
but sysadmins would probably prefer it if things were secure
out-of-the-box. I think the right default setting will become clear during
the implementation.
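For what it's worth, first-match-wins evaluation is only a few lines. This is a sketch in Python, not a proposed implementation: it assumes "path" is a shell-style glob, that "*" in "role" or "method" matches anything, and that the fallback when no rule matches is a parameter (is_allowed and rule_matches are made-up names).

```python
from fnmatch import fnmatchcase

# The ACL from the example document above.
ACL = [
    {"access": "allow", "role": "kocolosk", "method": "*", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "_design*"},
    {"access": "allow", "role": "*", "method": "GET", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "*"},
]


def rule_matches(rule, roles, method, path):
    """True if the rule applies to this request."""
    role_ok = rule["role"] == "*" or rule["role"] in roles
    method_ok = rule["method"] == "*" or rule["method"] == method
    return role_ok and method_ok and fnmatchcase(path, rule["path"])


def is_allowed(acl, roles, method, path, default="deny"):
    """Apply the rules in order; the first matching rule decides."""
    for rule in acl:
        if rule_matches(rule, roles, method, path):
            return rule["access"] == "allow"
    # No rule matched: fall back to the configured default.
    return default == "allow"


print(is_allowed(ACL, ["kocolosk"], "PUT", "_design/app"))
print(is_allowed(ACL, ["guest"], "GET", "_design/app"))
```

With this ACL the "kocolosk" role can do anything, everyone else can GET ordinary documents, and nobody else can touch _design documents. The `default` argument only matters when the trailing catch-all rule is left out.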
Benoit mentioned that he wanted authz to replicate. If we decide
that's the way we want to go, storing the ACL in a regular document
with a reserved ID would allow for that. If we didn't want it to
replicate, we could just change that docid to something like
_local/authorization.
We might take this one step further and allow additional Access
Control Elements in individual documents. These ACEs would be
prepended to the DB ACL and would allow you to specify custom authz
for a subset of documents in a DB without having to resort to path-
based regex and editing the DB ACL every time.
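Prepending works naturally with first-match-wins, as this rough Python sketch shows. Storing the ACEs in an "_acl" field of the document is purely an assumption for the example, as are the function names; path matching is left out to keep it short.

```python
def first_match(acl, roles, method):
    """Return True/False from the first matching rule, or None if none match."""
    for rule in acl:
        role_ok = rule["role"] == "*" or rule["role"] in roles
        method_ok = rule["method"] == "*" or rule["method"] == method
        if role_ok and method_ok:
            return rule["access"] == "allow"
    return None


def effective_acl(doc, db_acl):
    # Document-level ACEs go first, so under first-match-wins they
    # take precedence over the database-wide rules.
    return list(doc.get("_acl", [])) + list(db_acl)


# Database-wide rules: read-only for everyone.
DB_ACL = [
    {"access": "allow", "role": "*", "method": "GET"},
    {"access": "deny", "role": "*", "method": "*"},
]

# Hypothetical document that also lets the "editors" role update it.
doc = {
    "_id": "draft",
    "_acl": [{"access": "allow", "role": "editors", "method": "PUT"}],
}
plain_doc = {"_id": "other"}

print(first_match(effective_acl(doc, DB_ACL), ["editors"], "PUT"))
print(first_match(effective_acl(plain_doc, DB_ACL), ["editors"], "PUT"))
```

The "editors" role can PUT to "draft" but not to "other", and the DB ACL itself never had to change.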
Finally, there's the issue of authz in views. What privileges does
the view indexer have? If a user who is only allowed to read some of
the documents in the DB is allowed to upload a _design document, it
seems to me that the views generated from that _design document must
exclude any forbidden documents. I guess this can work if the _design
doc stores the roles of the user who saved it. It seems like a
tricky but solvable problem.
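As a toy illustration of that idea, and nothing like the real view indexer: suppose the _design doc records its author's roles in a field (called "author_roles" here, a made-up name) and the indexer skips any document those roles could not read. The can_read check and the "visible_to" field below are stand-ins for whatever authz check we end up with.

```python
def build_view(docs, design_doc, can_read, map_fn):
    """Run map_fn only over documents the design doc's author may read."""
    author_roles = design_doc.get("author_roles", [])  # hypothetical field
    rows = []
    for doc in docs:
        if can_read(author_roles, doc):
            rows.extend(map_fn(doc))
    return rows


def can_read(roles, doc):
    # Stand-in check: a document may name one role it is restricted to.
    required = doc.get("visible_to")
    return required is None or required in roles


def map_fn(doc):
    # Stand-in for a view's map function: emit (id, title) rows.
    return [(doc["_id"], doc.get("title"))]


docs = [
    {"_id": "a", "title": "public"},
    {"_id": "b", "title": "secret", "visible_to": "admins"},
]
design_doc = {"_id": "_design/app", "author_roles": ["staff"]}

print(build_view(docs, design_doc, can_read, map_fn))
```

A view built from a design doc saved by a "staff" user never sees document "b". What should happen when the author's roles change later is one of the tricky parts this sketch ignores.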
Best, Adam