On 14 Sep 2009, at 22:09, Jason Davies wrote:
Hi all,
Thanks for all the excellent responses!
With Chris Anderson's "simplest thing that could possibly work" idea
in mind, here's a quick summary of what I plan to implement as a
first cut. I've taken ideas from multiple responses on this thread,
so I wasn't sure which message to reply to, but this plan is mostly
inspired by Adam's ACL ideas so I've included that message below
this one for reference.
The simplest idea is that we have a special doc in each database,
"_local/_acl" or similar, containing a list of [role(s), "read" or
"write"] pairs. By default everything is denied to everyone (except
_admin). The most common use case would be to then have
["username", "read"] and ["username", "write"] to give a user read
and write permissions to that particular database. (In this
example, I've assumed that in the _users database we map the
"username" user to the "username" role for simplicity). If we want
to give particular access (e.g. read-only) to *everyone*, we can use
the special "*" string to denote a wildcard, which matches any role,
including no role at all e.g. ["*", "read"].
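As an illustration only, the check described above could be sketched in Python as follows (CouchDB itself is Erlang). The function name is_allowed and the in-memory list are assumptions made for this sketch, as is treating each entry's role as a single string rather than a list of roles.

```python
def is_allowed(acl, roles, permission):
    """Return True if some ACL entry grants `permission` to one of `roles`.

    acl: list of [role, permission] pairs, e.g. [["jason", "read"]].
    roles: roles held by the requesting user (empty for no role at all).
    permission: "read" or "write".
    """
    # _admin is exempt from the default "deny all".
    if "_admin" in roles:
        return True
    for role, granted in acl:
        if granted != permission:
            continue
        # "*" matches any role, including no role at all.
        if role == "*" or role in roles:
            return True
    # Nothing matched: everything is denied by default.
    return False


# Hypothetical ACL: "jason" may read and write, everyone else may only read.
acl = [["jason", "read"], ["jason", "write"], ["*", "read"]]
```

With this ACL, is_allowed(acl, ["jason"], "write") is true, while a user with no roles can read but not write.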
I envisage this default "deny all" behaviour being a switch in
the .ini file, so people will only turn it on once they have users
and/or ACLs set up.
OK, so that is the simplest implementation. Possible extensions
would be to make this more rule-like (cf. Apache) and have ["grant"
or "deny", role(s), "read" or "write"]. This would let you do more
complex setups, although I'm not sure this is necessary unless we
introduce more advanced pattern matching. validate_doc_update
already lets us do things like denying new docs from being created,
etc, so no need to concern ourselves with that level of write
granularity (i.e. we only need a "write" permission, no need for
"create" or "update"). Pattern matching is a possible extension
too, letting us grant/deny certain URL paths instead of the whole
db. I'm not convinced tying ourselves to HTTP verbs is a good idea,
it could be that read/write is sufficient for us if we pass this to
couch_db:open and throw an exception if any handlers attempt to
write when they have opened the db in read-only mode.
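To make the read-only idea concrete, here is a rough Python sketch. DbHandle, ReadOnlyError and open_db are hypothetical names invented for this example and are not couch_db's actual interface; the point is only that the permission is fixed when the db is opened and any later write raises.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted on a handle opened read-only."""


class DbHandle:
    """Hypothetical stand-in for an opened database (not CouchDB's API)."""

    def __init__(self, docs, writable):
        self._docs = docs
        self._writable = writable

    def read(self, doc_id):
        return self._docs[doc_id]

    def write(self, doc_id, body):
        # Handlers do no authz themselves; the handle enforces the mode.
        if not self._writable:
            raise ReadOnlyError("database opened in read-only mode")
        self._docs[doc_id] = body


def open_db(docs, writable=False):
    """Open a handle; the authz layer decides `writable` before calling."""
    return DbHandle(docs, writable)
```

A handler given open_db(docs) can call read() freely, but write() raises ReadOnlyError regardless of which HTTP verb triggered it.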
Another extension would be to have per-doc ACLs, but I think we
still need more discussion about this on the mailing list. It could
potentially look like having a similar ACL structure for each doc in
a special _acl member. We can potentially push this information
into the fulldoc B-tree (instead of looking it up in a separate
view) to save having to do two view lookups when performing authz.
One decision that we might want to achieve more consensus on is
about where the ACL itself is stored - either in the db that it
applies to, or in the users db. Several on the mailing list thread
have expressed that it's better to keep the ACL with the db that it
applies to, which seems to make the most sense to me (yes, I've
switched my thinking on this): for example if you delete a db then
its ACL doc automatically gets deleted. As it is a local doc it
won't get replicated anyway. This seems more intuitive as it's more
similar to how UNIX permissions on a directory work.
My thoughts are that I should get down and implement this, as even
if we change our minds about some things, like where the ACL gets
stored, the majority of the code will be the same and trying it out
as code may give us valuable feedback about other design choices.
+1 on going ahead and working on a simple, yet extensible solution
with minimal role support in the _users DB.
Cheers
Jan
--
Thanks,
--
Jason Davies
www.jasondavies.com
On 8 Sep 2009, at 23:41, Adam Kocoloski wrote:
Hi Jason, nice proposal. We were discussing it today at Cloudant
and had a few comments:
On Sep 7, 2009, at 6:50 PM, Jason Davies wrote:
Hi all,
There have been sporadic discussions about various granularities
of authorization. The simplest level to tackle is per-db
authorization. What follows is a summary of discussions and ideas
so far.
I should point out that this is primarily to flesh out the default
authorization modules that address the needs of the majority of
users. We probably will have an authorization_handlers setting,
analogous to authentication_handlers, allowing custom
authorization modules to be used.
1. Where are the permission "objects" themselves stored? The
permissions determine which users can do what with each database.
I think storing these in the per-node users database (called
"users" by default) makes the most sense. We are talking about
per-db auth so it wouldn't make any sense to store this
information in the affected databases themselves.
I think it's actually pretty sensible to store some authz
information in the DB itself, for many of the same reasons outlined
by Brian and Benoit. The big exception there is the ability to
create new DBs. That's traditionally the task of a server admin,
but perhaps we could come up with some special role that could be
granted to users to allow them to do that.
2. What types of operations do we need to support? I think the
majority of users will only care about being able to make
particular databases read-only, read/write, or write-only (not
sure about the latter one).
I think write-only is a keeper. It may also be useful to
distinguish between creating new documents and updating existing
ones. For instance, SQL GRANT tables distinguish between INSERT
and UPDATE.
On the other hand, our REST interface doesn't have that clean
distinction; PUT and POST can both be used for create and update.
And I agree with Chris; mapping authz to REST verbs is a good
smell. In the rest of the discussion I've assumed that mapping.
3. How do we implement these operations using the existing user_ctx
{name=..., roles=[...]} object? I don't think we necessarily need
to set any special roles, although this was my initial thought
e.g. ['_read', '_write'] on a per-db basis. As authorization is a
separate module, we can simply pass the appropriate permission
(read and/or write) through when opening the db internally in the
httpd db handler function. The db-opening function will then need
to throw an error if writes are attempted and it is in read-only
mode. Using actual roles is potentially more elegant, as custom
roles could also be set using the permission objects and
implementation might be easier.
+1 for adding elements to the roles array in the #user_ctx. More
on this in 5.1
4. One use-case we need to bear in mind is being able to grant/
deny access to sets of databases at a time. One way to do this
would be to allow patterns to be specified, for example:
{
"_id": "foo",
"type": "permission",
"username": "jason",
"match": "jason/*",
"operations": ["_read"]
}
This would grant the user "jason" read-only access to any database
that has the prefix "jason/".
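A possible way to evaluate such permission documents, sketched in Python for illustration. It assumes the "match" pattern uses shell-style globbing (the proposal does not specify the pattern syntax), and operations_for is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# The permission document from the example above.
permissions = [
    {"_id": "foo", "type": "permission", "username": "jason",
     "match": "jason/*", "operations": ["_read"]},
]


def operations_for(permissions, username, db_name):
    """Collect the operations granted to `username` on `db_name`."""
    granted = set()
    for perm in permissions:
        # Assumption: "match" is a case-sensitive glob over the db name.
        if perm["username"] == username and fnmatchcase(db_name, perm["match"]):
            granted.update(perm["operations"])
    return granted
```

Here operations_for(permissions, "jason", "jason/blog") yields the "_read" operation, while a database outside the "jason/" prefix yields an empty set.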
5. Permissions per roles vs permissions per users? Although the
above example specifies access for a particular user, it might be
more elegant and efficient to do this per role instead. If per
user is needed this can be done by giving the user a special role
unique to them. If a user has multiple roles then we would take
the union of the resulting permission set.
+1 for roles here. I think it makes sense for the users DB to
define roles for each user, either by adding roles to a user
document or users to a role document (or both). But the actual
specification of privileges for a role in a given DB should go in
the DB.
I realize this doesn't allow for easy configuration of privileges
across multiple DBs.
5. Default settings: we already have the require_valid_user
setting, which forces a node to authenticate users. We would need
to support certain access permissions for non-logged-in users i.e.
anonymous users. This could be done using a special "_anonymous"
string in the permission to override the default, which would
probably be read/write for everyone as it is now.
6. Future work: thisfred suggested that the pattern-matching could
be extended to the full URL instead of just the database name.
This seems like a simple way to extend authorization. Of course,
it's dependent on a particular node's URL mappings (these can be
changed in the .ini). This then brings up the question of what
the operations should be; it would make the most sense to let them
be HTTP verbs, so that one could restrict access to certain URLs
to being only GET and HEAD for example. This seems a bit too tied
to HTTP for my liking, but I guess CouchDB is very much a RESTful
and therefore HTTP-reliant database. Any further ideas would be
welcomed.
So, after giving this some thought I'm partial to the idea of
Access Control Lists. Instead of directly granting privileges on
databases in the users DB, we'd store an ordered list in the DB in
a special document that would allow|deny requests that match a
rule. For instance, if I wanted to make a read-only DB where only
I could access the _design documents I could upload a document like
{
"_id": "_authorization",
"_rev": "1-1340514305943",
"_acl": [
{"access":"allow", "role":"kocolosk", "method":"*", "path":"*"},
{"access":"deny", "role":"*", "method":"*", "path":"_design*"},
{"access":"allow", "role":"*", "method":"GET", "path":"*"},
{"access":"deny", "role":"*", "method":"*", "path":"*"}
]
}
The rules in the ACL array are applied in order, and the first rule
to match wins. Here I've assumed that my user has a corresponding
role, like a UNIX group.
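For illustration, first-match-wins evaluation of that ACL might look like the Python sketch below. The authorize function is a hypothetical name, and treating "method" and "path" as shell-style globs is an assumption; the fallback to deny when no rule matches is likewise just one possible default.

```python
from fnmatch import fnmatchcase

# The rules from the example document, in order.
ACL = [
    {"access": "allow", "role": "kocolosk", "method": "*", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "_design*"},
    {"access": "allow", "role": "*", "method": "GET", "path": "*"},
    {"access": "deny", "role": "*", "method": "*", "path": "*"},
]


def authorize(acl, roles, method, path):
    """Return True if the first rule matching the request says "allow"."""
    for rule in acl:
        role_ok = rule["role"] == "*" or rule["role"] in roles
        method_ok = fnmatchcase(method, rule["method"])
        path_ok = fnmatchcase(path, rule["path"])
        if role_ok and method_ok and path_ok:
            # Rules are applied in order; the first match wins.
            return rule["access"] == "allow"
    # Assumed fallback when nothing matches: deny.
    return False
```

With these rules a user holding the "kocolosk" role can PUT a _design document, while any other user can GET ordinary documents but cannot read _design documents or write anything.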
I explicitly listed the deny rule at the end, but we could make
that the default if we wished. CouchDB has historically been
pretty open, but sysadmins would probably prefer it if things were
secure out-of-the-box. I think the right default setting will
become clear during the implementation.
Benoit mentioned that he wanted authz to replicate. If we decide
that's the way we want to go, storing the ACL in a regular document
with a reserved ID would allow for that. If we didn't want it to
replicate, we could just change that docid to something like
_local/authorization.
We might take this one step further and allow additional Access
Control Elements in individual documents. These ACEs would be
prepended to the DB ACL and would allow you to specify custom authz
for a subset of documents in a DB without having to resort to path-
based regex and editing the DB ACL every time.
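A small Python sketch of how prepending might work, for illustration only. It assumes the per-document ACEs live in an "_acl" member of the document and drops the "path" field, since the rules then apply to that one document; effective_acl and first_match are hypothetical names.

```python
def effective_acl(doc, db_acl):
    """Prepend a document's own ACEs (if any) to the database ACL."""
    return list(doc.get("_acl", [])) + list(db_acl)


def first_match(acl, roles, method):
    """First matching rule wins; deny is assumed if nothing matches."""
    for rule in acl:
        if rule["role"] in ("*", *roles) and rule["method"] in ("*", method):
            return rule["access"] == "allow"
    return False


# Hypothetical read-only DB ACL and a document that lets "editors" write.
db_acl = [
    {"access": "allow", "role": "*", "method": "GET"},
    {"access": "deny", "role": "*", "method": "*"},
]
doc = {"_id": "notes",
       "_acl": [{"access": "allow", "role": "editors", "method": "*"}]}
```

Because the document's ACE is consulted first, a user in the "editors" role can PUT this document even though the DB ACL alone would deny it; documents without their own ACEs fall through to the DB ACL unchanged.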
Finally, there's the issue of authz in views. What privileges does
the view indexer have? If a user who is only allowed to read some
of the documents in the DB is allowed to upload a _design document,
it seems to me that the views generated from that _design document
must exclude any forbidden documents. I guess this can work if the
_design doc stores the roles of the user who saved it. It seems
like a tricky, but solvable problem.
Best, Adam