Hi, Alexander. Let's start a new thread on the users list, for posterity and perhaps some participation from the community.
## Back story

https://github.com/iriscouch/audit-couchdb/
http://code.google.com/p/couchdb-auditor/

Alexander rewrote Audit CouchDB in Python. We had an interesting discussion about how to maintain parity between the two tools. Whether you choose the Python version or the Node.js version, you should be confident that it will identify all known issues.

We could manually follow each other's code changes and manually sync up vulnerability "signatures", but that will probably never work. I am rusty in Python, and you were wise enough to dodge the Node.js fad, so I think a good approach is to make it easy to exchange known vulnerabilities (or tests, or alerts, whatever the word is) back and forth.

I thought of a shared "vulnerability definitions" data set: a big list of JSON definitions. "If the database admins include a user name which does not exist in the users database, warn me at severity=high." However, that seems useless. For example, how could you represent that rule in JSON? How do you know the JSON format can describe a test for a vulnerability nobody has thought of yet? It would quickly become a poorly-defined, buggy programming language written in JSON.

audit-couchdb does not have good unit tests either. Now that there are ports to new languages, it is obvious that this work is overdue.

So now I am thinking (still half-baked) of something that *does* fit well as JSON data: bad CouchDB configs, combined with the expected warnings a good auditor should produce. Consider a mock CouchDB client library with the same API as usual (couchdb-python, request, etc.), but which returns hard-coded responses. We could test our code without even running a CouchDB server. What response should the mock library return? Well, it should simulate a CouchDB server that has a problem.
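To make the mock-client idea concrete, here is a minimal sketch in Python. All names (`MockCouch`, `CANNED`, the `.get()` method) are hypothetical, not part of either tool's real API; the point is only that the "client" answers GET paths from a hard-coded dictionary instead of talking to a server.

```python
# Hypothetical sketch: a mock CouchDB client that answers GET requests
# from a dictionary of canned JSON responses, so an auditor can be
# exercised without a running CouchDB server.

CANNED = {
    "/_all_dbs": ["_users", "bad_admin_name"],
    "/_users/_all_docs": {"total_rows": 0, "offset": 0, "rows": []},
    "/bad_admin_name/_security": {"admins": {"names": ["bob"]}},
}

class MockCouch:
    def __init__(self, responses):
        self.responses = responses

    def get(self, path):
        # Return the decoded JSON body, as a real client library would.
        try:
            return self.responses[path]
        except KeyError:
            raise KeyError("No canned response for %s" % path)

couch = MockCouch(CANNED)
print(couch.get("/_all_dbs"))  # ['_users', 'bad_admin_name']
```

An auditor written against this interface never knows whether it is talking to a real server or to a test fixture, which is exactly what lets both ports share the same fixtures.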
For example, a couch with no users, but a database that has an admin user defined:

```json
{ "_users": {}
, "bad_admin_name": { "_security": {"admins":{"names":["bob"]}} }
}
```

The idea is, given the above config, CouchDB would respond like this:

```
GET /_all_dbs -> ["_users", "bad_admin_name"]
GET /_users/_all_docs/ -> {"total_rows":0,"offset":0,"rows":[]}
GET /bad_admin_name/_security -> {"admins":{"names":["bob"]}}
```

At this point, both your tool and my tool should notice the problem.

So now I am thinking about a corpus, or a body, or a collection (you might even say a "database") of bad responses and the expected auditor reactions:

```json
{ "_id": "Database admin usernames with no corresponding user document"
, "couch": { "_users": {}
           , "bad_admin_name": { "_security": {"admins":{"names":["bob"]}} }
           }
, "severity": "high"
}

{ "_id": "Detect Admin Party"
, "couch": { "_session": {"ok":true,"userCtx":{"name":null,"roles":["_admin"]}} }
, "severity": "medium"
}
```

Basically, each object is one test in the test suite. We can keep a file or set of files synchronized, perhaps eventually storing them in a Couch database and building the tests from the latest data set.

-- Iris Couch
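One way the two implementations could consume such a corpus: expand each definition's "couch" object into the canned HTTP responses above, then run the auditor against them and check the reported severity. This is a hedged sketch; `responses_from_couch` is a hypothetical helper, and the exact expansion rules (how a "couch" object maps to paths) would need to be agreed on by both tools.

```python
# Hypothetical sketch: expand one JSON test definition's "couch" object
# into the GET responses a mock server should return. The mapping rules
# here are assumptions, not an agreed format.

def responses_from_couch(couch):
    """Turn {"db_name": {"_security": ...}, ...} into canned GET responses."""
    responses = {"/_all_dbs": sorted(couch)}
    for db, contents in couch.items():
        responses["/%s/_security" % db] = contents.get("_security", {})
        # Model every database in the fixture as having no documents.
        responses["/%s/_all_docs" % db] = {"total_rows": 0, "offset": 0, "rows": []}
    return responses

definition = {
    "_id": "Database admin usernames with no corresponding user document",
    "couch": {"_users": {},
              "bad_admin_name": {"_security": {"admins": {"names": ["bob"]}}}},
    "severity": "high",
}

canned = responses_from_couch(definition["couch"])
print(canned["/_all_dbs"])  # ['_users', 'bad_admin_name']
```

Each tool would keep its own runner like this, but the definitions themselves stay shared, language-neutral JSON.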
