Hi all,
Thanks for all the excellent responses!
With Chris Anderson's "simplest thing that could possibly work" idea
in mind, here's a quick summary of what I plan to implement as a first
cut. I've taken ideas from multiple responses on this thread, so I
wasn't sure which message to reply to, but this plan is mostly
inspired by Adam's ACL ideas so I've included that message below this
one for reference.
The simplest idea is that we have a special doc in each database,
"_local/_acl" or similar, containing a list of [role(s), "read" or
"write"] pairs. By default everything is denied to everyone (except
_admin). The most common use case would be to then have ["username",
"read"] and ["username", "write"] to give a user read and write
permissions to that particular database. (In this example, I've
assumed that in the _users database we map the "username" user to the
"username" role for simplicity). If we want to give particular access
(e.g. read-only) to *everyone*, we can use the special "*" string to
denote a wildcard, which matches any role, including no role at all
e.g. ["*", "read"].
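A minimal sketch of how such a check might look, assuming each ACL entry holds a single role string rather than a list of roles. The function and constant names are illustrative only and are not CouchDB APIs.

```python
# Hypothetical sketch of the proposed per-db ACL check.
# None of these names exist in CouchDB; they are for illustration only.
ADMIN_ROLE = "_admin"
WILDCARD = "*"


def is_allowed(acl, user_roles, permission):
    """Return True if `permission` ("read" or "write") is granted.

    `acl` is a list of [role, permission] pairs, as it might be stored
    in the proposed "_local/_acl" doc. Anything not listed is denied.
    """
    if ADMIN_ROLE in user_roles:
        return True  # _admin is exempt from the default deny
    for role, granted in acl:
        if granted != permission:
            continue
        # "*" matches any role, including a user with no roles at all
        if role == WILDCARD or role in user_roles:
            return True
    return False


acl = [["username", "read"], ["username", "write"], ["*", "read"]]
print(is_allowed(acl, ["username"], "write"))
print(is_allowed(acl, [], "write"))
```

With this ACL, the "username" role can read and write, while everyone else (including anonymous users) can only read.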
I envisage this default "deny all" behaviour being a switch in
the .ini file, so people will only turn it on once they have users
and/or ACLs set up.
OK, so that is the simplest implementation. Possible extensions would
be to make this more rule-like (cf. Apache) and have ["grant" or
"deny", role(s), "read" or "write"]. This would let you do more
complex setups, although I'm not sure this is necessary unless we
introduce more advanced pattern matching. validate_doc_update already
lets us do things like denying new docs from being created, etc, so no
need to concern ourselves with that level of write granularity (i.e.
we only need a "write" permission, no need for "create" or "update").
Pattern matching is a possible extension too, letting us grant/deny
certain URL paths instead of the whole db. I'm not convinced tying
ourselves to HTTP verbs is a good idea; read/write may be
sufficient if we pass the permission to couch_db:open and throw an
exception whenever a handler attempts to write to a db it has opened
in read-only mode.
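As a rough illustration of the read-only idea, here is a sketch in Python rather than Erlang. DbHandle, open_db and ReadOnlyError are hypothetical stand-ins for whatever couch_db:open would return; they are not real CouchDB APIs.

```python
class ReadOnlyError(Exception):
    """Raised when a handler writes to a db it opened read-only."""


class DbHandle:
    # Hypothetical stand-in for an opened database; docs live in a dict.
    def __init__(self, name, permissions):
        self.name = name
        self.permissions = frozenset(permissions)
        self.docs = {}

    def read(self, doc_id):
        return self.docs.get(doc_id)

    def write(self, doc_id, doc):
        # The check lives in the db layer, so every handler is covered
        # regardless of which HTTP verb or URL triggered the write.
        if "write" not in self.permissions:
            raise ReadOnlyError("db %r was opened read-only" % self.name)
        self.docs[doc_id] = doc


def open_db(name, permissions):
    """Open a db with the permissions the authz module decided on."""
    return DbHandle(name, permissions)


db = open_db("example", ["read"])
try:
    db.write("doc1", {"a": 1})
except ReadOnlyError as err:
    print("rejected:", err)
```

The point of the design is that authorization is decided once, when the db is opened, and enforced at the write path rather than per HTTP verb.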
Another extension would be to have per-doc ACLs, but I think we still
need more discussion about this on the mailing list. It could
potentially look like having a similar ACL structure for each doc in a
special _acl member. We can potentially push this information into
the fulldoc B-tree (instead of looking it up in a separate view) to
save having to do two view lookups when performing authz.
One decision on which we might want more consensus is
where the ACL itself is stored - either in the db that it applies to,
or in the users db. Several on the mailing list thread have expressed
that it's better to keep the ACL with the db that it applies to, which
seems to make the most sense to me (yes, I've switched my thinking on
this): for example if you delete a db then its ACL doc automatically
gets deleted. As it is a local doc it won't get replicated anyway.
This seems more intuitive as it's more similar to how UNIX permissions
on a directory work.
My thoughts are that I should get down and implement this, as even if
we change our minds about some things, like where the ACL gets stored,
the majority of the code will be the same and trying it out as code
may give us valuable feedback about other design choices.
Thanks,
--
Jason Davies
www.jasondavies.com
On 8 Sep 2009, at 23:41, Adam Kocoloski wrote:
Hi Jason, nice proposal. We were discussing it today at Cloudant
and had a few comments:
On Sep 7, 2009, at 6:50 PM, Jason Davies wrote:
Hi all,
There have been sporadic discussions about various granularities of
authorization. The simplest level to tackle is per-db
authorization. What follows is a summary of discussions and ideas
so far.
I should point out that this is primarily to flesh out the default
authorization modules that address the needs of the majority of
users. We will probably have an authorization_handlers setting,
analogous to authentication_handlers, allowing custom authorization
modules to be used.
1. Where are the permission "objects" themselves stored? The
permissions determine which users can do what with each database.
I think storing these in the per-node users database (called
"users" by default) makes the most sense. We are talking about
per-db auth so it wouldn't make any sense to store this information in
the affected databases themselves.
I think it's actually pretty sensible to store some authz
information in the DB itself, for many of the same reasons outlined
by Brian and Benoit. The big exception there is the ability to
create new DBs. That's traditionally the task of a server admin,
but perhaps we could come up with some special role that could be
granted to users to allow them to do that.
2. What types of operations do we need to support? I think the
majority of users will only care about being able to make
particular databases read-only, read/write, or write-only (not sure
about the latter one).
I think write-only is a keeper. It may also be useful to
distinguish between creating new documents and updating existing
ones. For instance, SQL GRANT tables distinguish between INSERT and
UPDATE.
On the other hand, our REST interface doesn't have that clean
distinction; PUT and POST can both be used for create and update.
And I agree with Chris; mapping authz to REST verbs is a good
smell. In the rest of the discussion I've assumed that mapping.
3. How do we implement these operations using the existing user_ctx
{name=..., roles=[...]} object? I don't think we necessarily need
to set any special roles, although this was my initial thought e.g.
['_read', '_write'] on a per-db basis. As authorization is a
separate module, we can simply pass the appropriate permission
(read and/or write) through when opening the db internally in the
httpd db handler function. The db-opening function will then need
to throw an error if writes are attempted and it is in read-only
mode. Using actual roles is potentially more elegant, as custom
roles could also be set using the permission objects and
implementation might be easier.
+1 for adding elements to the roles array in the #user_ctx. More on
this in 5.1
4. One use-case we need to bear in mind is being able to grant/deny
access to sets of databases at a time. One way to do this would be
to allow patterns to be specified, for example:
{
"_id": "foo",
"type": "permission",
"username": "jason",
"match": "jason/*",
"operations": ["_read"]
}
This would grant the user "jason" read-only access to any database
that has the prefix "jason/".
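A small sketch of how such a permission doc could be evaluated, assuming the "match" field uses glob-style patterns (that is my assumption; the proposal does not fix the pattern syntax). The helper name operations_for is hypothetical.

```python
from fnmatch import fnmatchcase

# The permission doc from the example above.
permission_docs = [
    {
        "_id": "foo",
        "type": "permission",
        "username": "jason",
        "match": "jason/*",
        "operations": ["_read"],
    },
]


def operations_for(username, db_name, docs):
    """Collect the operations granted to `username` on `db_name`."""
    ops = set()
    for doc in docs:
        if doc.get("type") != "permission":
            continue
        # fnmatchcase does case-sensitive glob matching; "*" also
        # matches "/" so nested names like "jason/a/b" match too.
        if doc["username"] == username and fnmatchcase(db_name, doc["match"]):
            ops.update(doc["operations"])
    return ops


print(operations_for("jason", "jason/photos", permission_docs))
print(operations_for("jason", "shared", permission_docs))
```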
5. Permissions per roles vs permissions per users? Although the
above example specifies access for a particular user, it might be
more elegant and efficient to do this per role instead. If per
user is needed this can be done by giving the user a special role
unique to them. If a user has multiple roles then we would take
the union of the resulting permission set.
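Taking the union across roles could be sketched as follows. The role names and the role_permissions mapping are made up for illustration.

```python
def permissions_for_roles(user_roles, role_permissions):
    """Union of the permission sets of every role the user holds."""
    granted = set()
    for role in user_roles:
        # Roles with no entry contribute nothing.
        granted |= set(role_permissions.get(role, ()))
    return granted


# Hypothetical data: "jason" is the per-user role, "readers" a shared one.
role_permissions = {
    "jason": {"_read", "_write"},
    "readers": {"_read"},
}

print(sorted(permissions_for_roles(["jason", "readers"], role_permissions)))
```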
+1 for roles here. I think it makes sense for the users DB to
define roles for each user, either by adding roles to a user
document or users to a role document (or both). But the actual
specification of privileges for a role in a given DB should go in
the DB.
I realize this doesn't allow for easy configuration of privileges
across multiple DBs.
5. Default settings: we already have the require_valid_user
setting, which forces a node to authenticate users. We would need
to support certain access permissions for non-logged-in users i.e.
anonymous users. This could be done using a special "_anonymous"
string in the permission to override the default, which would
probably be read/write for everyone as it is now.
6. Future work: thisfred suggested that the pattern-matching could
be extended to the full URL instead of just the database name.
This seems like a simple way to extend authorization. Of course,
it's dependent on a particular node's URL mappings (these can be
changed in the .ini). This then brings up the question of what the
operations should be. It would make the most sense to let them be
HTTP verbs, so that one could restrict access to certain URLs to
being only GET and HEAD for example. This seems a bit too tied to
HTTP for my liking, but I guess CouchDB is very much a RESTful and
therefore HTTP-reliant database. Any further ideas would be
welcomed.
So, after giving this some thought I'm partial to the idea of Access
Control Lists. Instead of directly granting privileges on databases
in the users DB, we'd store an ordered list in the DB in a special
document that would allow|deny requests that match a rule. For
instance, if I wanted to make a read-only DB where only I could
access the _design documents I could upload a document like
{
"_id": "_authorization",
"_rev": "1-1340514305943",
"_acl": [
{"access":"allow", "role":"kocolosk", "method":"*",
"path":"*"},
{"access":"deny", "role":"*", "method":"*",
"path":"_design*"},
{"access":"allow", "role":"*", "method":"GET",
"path":"*"},
{"access":"deny", "role":"*", "method":"*",
"path":"*"}
]
}
The rules in the ACL array are applied in order, and the first rule
to match wins. Here I've assumed that my user has a corresponding
role, like a UNIX group.
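To make the first-match semantics concrete, here is a minimal sketch in Python. It assumes glob-style matching for "path" and that paths are relative to the database; both are my assumptions, and authorize and rule_matches are hypothetical names.

```python
from fnmatch import fnmatchcase

# The ACL from the example document above.
ACL = [
    {"access": "allow", "role": "kocolosk", "method": "*", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "_design*"},
    {"access": "allow", "role": "*", "method": "GET", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "*"},
]


def rule_matches(rule, roles, method, path):
    """True if the rule applies to this request."""
    role_ok = rule["role"] == "*" or rule["role"] in roles
    method_ok = rule["method"] == "*" or rule["method"] == method
    return role_ok and method_ok and fnmatchcase(path, rule["path"])


def authorize(acl, roles, method, path, default="deny"):
    """Apply rules in order; the first rule that matches wins."""
    for rule in acl:
        if rule_matches(rule, roles, method, path):
            return rule["access"]
    return default  # used only when no rule matches


print(authorize(ACL, ["kocolosk"], "PUT", "_design/app"))
print(authorize(ACL, [], "GET", "_design/app"))
print(authorize(ACL, [], "GET", "somedoc"))
```

Because rule order matters, moving the "kocolosk" rule below the _design deny rule would lock that user out of the design documents too.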
I explicitly listed the deny rule at the end, but we could make that
the default if we wished. CouchDB has historically been pretty
open, but sysadmins would probably prefer it if things were secure
out-of-the-box. I think the right default setting will become clear
during the implementation.
Benoit mentioned that he wanted authz to replicate. If we decide
that's the way we want to go, storing the ACL in a regular document
with a reserved ID would allow for that. If we didn't want it to
replicate, we could just change that docid to something like
_local/authorization.
We might take this one step further and allow additional Access
Control Elements in individual documents. These ACEs would be
prepended to the DB ACL and would allow you to specify custom authz
for a subset of documents in a DB without having to resort to
path-based regex and editing the DB ACL every time.
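Prepending could be as simple as the sketch below. It assumes the per-document ACEs live in an "_acl" member of the document, which is only one possible layout; effective_acl is a hypothetical helper.

```python
def effective_acl(db_acl, doc):
    """Put per-document ACEs first so they win under first-match rules."""
    # Build a new list so neither the doc nor the DB ACL is mutated.
    return list(doc.get("_acl", [])) + list(db_acl)


db_acl = [
    {"access": "allow", "role": "*", "method": "GET", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "*"},
]

# Hypothetical document that only the "editors" role may touch.
doc = {
    "_id": "draft-1",
    "_acl": [
        {"access": "allow", "role": "editors", "method": "*", "path": "*"},
        {"access": "deny", "role": "*", "method": "*", "path": "*"},
    ],
}

for rule in effective_acl(db_acl, doc):
    print(rule["access"], rule["role"], rule["method"])
```

A document without any ACEs simply falls through to the DB ACL unchanged.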
Finally, there's the issue of authz in views. What privileges does
the view indexer have? If a user who is only allowed to read some
of the documents in the DB is allowed to upload a _design document,
it seems to me that the views generated from that _design document
must exclude any forbidden documents. I guess this can work if the
_design doc stores the roles of the user who saved it. It seems
like a tricky, but solvable problem.
Best, Adam