Re: Question about validator functions and replication

Patrick Barnes Sat, 26 Mar 2011 05:44:01 -0700

> 1) Alice and Bob are friends, and Alice lent her laptop to Bob so that
> he could do some offline blogging too.

If 'Bob' wants to do some blogging, he *really* should be required to input his own credentials to push any of _his_ modifications back to the central server.

As I understand it, you have a central server 'C' with a couchdb database, and users 'A' and 'B', each of whom have a local copy of C's database.


I think:

* When 'A' or 'B' push back to the central db 'C', the replication rule should only accept those changes where the author is the authenticating user. ('C' should also allow a user with the special 'replication' role to replicate all documents; for backup/restore, load balancing, etc)

* When 'A' or 'B' pull from the central db 'C', the replication rule should accept all document changes, because the central db is the trusted one.

So if Alice or Bob do anything funny to their db, it won't be able to adversely affect the central db.

Q: If 'A', 'B' and 'C' are replicating amongst themselves, the design doc will need to be the same on each... so how do you have the different behaviour described above?

A: Users can have a different set of roles on different servers. So on 'C', Alice is probably an ordinary user, but on 'A' Alice can have admin permissions - the design doc on the target db can check the user roles and differentiate.
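
A rough sketch of how one shared validator could behave differently per server, purely by looking at the roles of the authenticating user. It assumes the custom 'replication' role suggested elsewhere in this thread and CouchDB's built-in '_admin' role; the variable name validateDocUpdate is only there so the snippet runs on its own (in a design doc the function would be stored as a string under validate_doc_update).

```javascript
// Sketch only: one validator shared by 'A', 'B' and 'C'.
// Which branch is taken depends on the roles the user holds on the
// server that is receiving the write.
var validateDocUpdate = function (newDoc, oldDoc, userCtx) {
  var roles = userCtx.roles || [];

  // '_admin' is CouchDB's server-admin role; 'replication' is the
  // custom role proposed in this thread (an assumption, not built in).
  var trusted = roles.indexOf('_admin') !== -1 ||
                roles.indexOf('replication') !== -1;
  if (trusted) {
    return; // e.g. Alice pulling from 'C' onto her own laptop 'A'
  }

  // Ordinary users (e.g. Alice on 'C') may only write their own docs.
  if (newDoc.author && newDoc.author !== userCtx.name) {
    throw({forbidden: 'You may only update documents with author ' +
      userCtx.name});
  }
};
```

With this, a push to 'C' authenticated as ordinary user Alice is rejected for any document whose author is Bob, while the same documents are accepted on 'A', where Alice is an admin.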

If you have some secure data floating around in your db, encryption can still be a good idea. Parts of one of my databases are encrypted, because users need to be able to see some documents in the db but not others.

Finally, you can also look at filtered replication, so - *as well as* restricting which documents can be replicated back to the central db 'C' by whom - you can limit replication pushes to only those docs that should pass validation.
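
A minimal sketch of such a replication filter, assuming documents carry the same author field the validator checks. The name byAuthor and the "author" query parameter are made up for illustration; in a design doc the function would be stored as a string under filters.

```javascript
// Sketch only: a CouchDB filter function. It is called once per changed
// document and returns true if the document should be replicated.
var byAuthor = function (doc, req) {
  // req.query holds the parameters passed along with the replication
  // request; 'author' is a hypothetical parameter name.
  var author = req.query && req.query.author;

  // Only pass documents written by that author. Documents without an
  // author field (including bare deletion stubs) are skipped.
  return Boolean(author) && doc.author === author;
};
```

The replication request would then name the filter and supply the parameter, along the lines of "filter": "blog/byAuthor" and "query_params": {"author": "alice"}, where "blog" is an assumed design doc name. The filter is only a convenience on the sending side; the validator on 'C' is still what enforces the rule.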


Hope that helps,
-Patrick



On 26/03/2011 2:37 AM, Nebu Pookins wrote:

Thanks Patrick,

My follow-up question is: What if Server 2 can't "trust" Server 1? For
example, Server 1 is actually an "offline" version of the blogging
platform downloaded onto Alice's laptop so that she may continue
blogging while offline, and then, when she comes online, her work is
synced back with the cloud.

If Server 1 claims both Alice's and Bob's posts have been modified, it
seems like Server 2 would be unable to distinguish between these two
scenarios:

1) Alice and Bob are friends, and Alice lent her laptop to Bob so that
he could do some offline blogging too.
2) Alice is maliciously forging blog posts under Bob's name. (She
might not even actually be running the CouchApp locally, and is merely
sending carefully crafted HTTP requests via curl to simulate
replication).

The two partial solutions that immediately come to mind are:

1) When a user downloads their own instance of the CouchApp, it's
"tied" to that user. Bob cannot use Alice's copy of the CouchApp to
blog; but if Alice had the foresight to download a copy both under her
account and under Bob's account, then they can still share a laptop
for offline access, although in different copies of the app.
2) Use public key cryptography and signatures, such that every change
is signed, and every server can verify the signature of every
modification to see if it really came from who it should be coming
from.

Solution (1) is unpleasant for the end user experience, but otherwise
seems straightforward to implement.
Solution (2) seems much more difficult to implement: JavaScript treats
all numbers as floating point, whereas cryptography typically works
entirely with very big (e.g. 128-bit, 256-bit, 4096-bit) integers, so
the two look like a mismatch without first writing a bigint library.
And while in theory it seems much more pleasant for end users, in
practice I believe the user would need to provide not only a password
but also their private key, which renders the user experience
unpleasant again.

I'll probably end up going with solution (2) because the proposed app
we're building will have to deal with banking and financial data, so
we're going to have to have some cryptography in there anyway, but I'm
just making sure I'm not missing anything (such as CouchDB already
having some sort of encryption library built in).

Thanks again, Patrick, for clearing things up for me.

- Nebu

On Thu, Mar 24, 2011 at 7:01 PM, Patrick Barnes<[email protected]>  wrote:

You're quite right, it would fail on replication. (And if you identified
that issue solely from a hypothetical standpoint, well done)

The solution could be to give your replication user a 'replication' role,
and have the validator function look something like:

function(newDoc, oldDoc, userCtx) {
  // Users holding the 'replication' role may write any document.
  var isReplicator = userCtx.roles.indexOf('replication') !== -1;
  if (newDoc.author) {
    if (newDoc.author != userCtx.name && !isReplicator) {
      throw({"forbidden": "You may only update documents with author " +
        userCtx.name});
    }
  }
}

-Patrick

On 25/03/2011 4:46 AM, Nebu Pookins wrote:


Hi,

I'm reading "CouchDB: The Definitive Guide", and in the chapter on
"Security" (http://guide.couchdb.org/editions/1/en/security.html),
they give an example of how to limit write-access to certain documents
based on its owner. The example validator function they give is:

function(newDoc, oldDoc, userCtx) {
   if (newDoc.author) {
     if(newDoc.author != userCtx.name) {
       throw({"forbidden": "You may only update documents with author " +
         userCtx.name});
     }
   }
}

If I understand correctly, userCtx is based on the HTTP request of the
POST/PUT/DELETE request which is trying to modify some document: If
I'm logged into couch, either via HTTP basic authentication, or
cookies, or something along those lines, then my username will show up
in the userCtx, and we simply do a string comparison to see if I'm the
"author" of a given doc, and if so, then the business rule is that I
should be allowed to change the doc.

Elsewhere in the documentation, it mentions that validator functions
are run not only when POST/PUT/DELETE requests are made, but also when
replication occurs. What I'm confused about is what the value of
userCtx would be during replication. To give a more concrete example:

Let's say we have 2 CouchDB servers running, called Server 1 and
Server 2, and they've replicated with each other so that they both
contain identical data: a set of blog posts.

A user "Alice" logs onto server 1, and edits one of her blog posts.
The validator function runs, and given that it's Alice that's logged
on, the validator function checks that the blog post's "author" field
is Alice, and assuming it is, it allows the update to occur.
A user "Bob" also logs onto the same server, edits one of his blog
posts, and again the validator allows it.
Then both users log off, and go do something else (e.g. watch a movie,
read a book, etc.)

Now replication occurs: Server 2 will ask server 1 for a list of
changes, and server 1 will report that two blog posts have been
changed.

Given that neither Alice nor Bob are connecting to server 2, it would
seem that the userCtx variable would not contain either of their
names, and thus the validation would reject the change, and
replication would fail.

I figure I must be misunderstanding something about how either
validation or replication works, but I can't seem to figure out what
from the documentation. Can someone help clarify this for me?

Thanks,
Nebu
