On Feb 27, 2010, at 10:32 AM, Filipe David Manana wrote:

> Dear devs,
>
> Currently, the URI handler for /_all_dbs just lists, recursively, all the db
> files in the database dir (parameter database_dir of the .ini file).
>
> Since we now have a _security object per DB (I dunno why it's not a regular
> doc) which allows restricting access to each DB, that code is no longer
> fair. It makes sense that this handler just returns a list of the DBs a
> user has access to.
>
> It's through this URI that for example Futon lists the available DBs.
>
> There's a ticket for this: https://issues.apache.org/jira/browse/COUCHDB-661
>
> That solution is acceptable if the number of DBs in the server is "just" up
> to about 10 000 or so. I tested with 7500 DBs, each occupying about 1 MB and
> having 100 docs, and the response time for _all_dbs was about 4 seconds
> (more details in the comments of that ticket).
>
> The problem is that for each DB file found, one has to read its header and
> then read its _security object to figure out if the session user can access
> that DB. Therefore, we have 2 disk read operations for each DB file. 1
> million DBs would imply 2 million disk reads.
>
> Obviously an efficient solution for this would be to have a view which maps
> users to DBs. I have an incomplete idea for this.
> What I thought about is the following:
>
> 1) Having a special db, named "_dbs" (for example) which would contain meta
> information about every available DB (like the meta tables in Oracle, SQL
> Server, and so on).
>
> 2) That DB would contain a doc for each available DB. Each doc would contain
> the reader names and roles associated with the corresponding DB (this is the
> only kind of info we need for _all_dbs)
>
> 3) We would have a view, like Brian Candler suggested in a comment to that
> ticket, that emits keys like:
> emit(['name',name],db)
> emit(['role',role],db)
>
> 4) For DBs with a _security object having empty lists for both the reader
> names and reader roles, we would emit the special role "_public" for example
>
> 5) Whenever the _security object of a DB is updated, we would update the
> corresponding reader names and roles in the _dbs DB.
>

This is the best reason I've heard for making it a security document.

I wonder how much slower the scan of 7.5k DBs gets when it has to look up
documents instead of linked objects. Do you mind adding a doc read to the
tight loop just to see what it does to performance? The 7.5k case isn't
important once we have a _dbs db, but the cost it exposes as a benchmark will
be proportional to the cost incurred on opening any db for any operation, and
thus significant.

> I thought of some issues (for which I don't have a solution):
>
> 1) If a user just copies DB files from elsewhere (another server or a
> backup, e.g.) into the DBs directory, how do we detect them? Scanning for
> all DB files at startup and taking proper action would be potentially slow.
> Also, if a DB file is copied while CouchDB is running, I dunno how to detect
> it. The only idea I have now is: every time a DB file is opened (due to a
> user request), we check if _dbs has a corresponding entry and if not we take
> proper action
>
> 2) If a user deletes a DB file manually (i.e. rm db_file.couch), how to
> detect it and remove the corresponding entry in _dbs?
>
> 3) If a user restores a DB file backup containing an old _security object,
> we need to detect that and update the entry in _dbs. A way to do this would
> be to store the DB update seq number in the corresponding doc at _dbs and
> then use the same idea as in 1)
>
> These are very preliminary ideas.
>
> I would like to collect suggestions from all of you on how to implement this
> efficiently and know if you can point out any other problems I haven't
> thought about.
>
> thanks
>
> best regards,
>
> --
> Filipe David Manana,
> [email protected]
> PGP key - http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC569452B
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
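
To make points 3 and 4 of the proposal concrete, here is a rough sketch of the view plus the lookup that _all_dbs would do against it. It is plain JavaScript meant to run outside CouchDB, so emit is passed in as an argument instead of being the view server's global. The doc shape (db name as _id, with readers.names and readers.roles) is only an assumption about what a _dbs doc might look like, and the helper names mapReaders, buildIndex and visibleDbs are made up for illustration.

```javascript
// Hypothetical shape of a doc in the proposed "_dbs" database:
//   { _id: "<db name>", readers: { names: [...], roles: [...] } }
// The field names are assumptions; the thread only says each doc holds
// the reader names and roles of the corresponding DB.

// Map function following points 3 and 4 of the proposal.
// `emit` is a parameter here so the sketch runs standalone.
function mapReaders(doc, emit) {
  const names = (doc.readers && doc.readers.names) || [];
  const roles = (doc.readers && doc.readers.roles) || [];
  if (names.length === 0 && roles.length === 0) {
    // No readers configured: expose the DB under the special "_public" role.
    emit(["role", "_public"], doc._id);
    return;
  }
  names.forEach((name) => emit(["name", name], doc._id));
  roles.forEach((role) => emit(["role", role], doc._id));
}

// In-memory stand-in for the view index: run the map over every doc.
function buildIndex(docs) {
  const rows = [];
  docs.forEach((doc) =>
    mapReaders(doc, (key, value) => rows.push({ key, value }))
  );
  return rows;
}

// DBs a session user may see: the union of the rows for the user's name,
// each of the user's roles, and the "_public" role.
function visibleDbs(rows, user) {
  const wanted = new Set([
    JSON.stringify(["name", user.name]),
    JSON.stringify(["role", "_public"]),
    ...user.roles.map((role) => JSON.stringify(["role", role])),
  ]);
  const dbs = new Set();
  rows.forEach((row) => {
    if (wanted.has(JSON.stringify(row.key))) dbs.add(row.value);
  });
  return Array.from(dbs).sort();
}

const sampleDocs = [
  { _id: "accounts", readers: { names: ["alice"], roles: ["finance"] } },
  { _id: "wiki", readers: { names: [], roles: [] } },
  { _id: "hr", readers: { names: [], roles: ["hr"] } },
];

// bob has the "hr" role, so he sees "hr" plus the public "wiki".
console.log(visibleDbs(buildIndex(sampleDocs), { name: "bob", roles: ["hr"] }));
```

Against a real view the lookup would presumably be a single multi-key query rather than a scan over all rows, which is what removes the per-DB disk reads. Server admins, who would presumably see every DB, are not handled in this sketch.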
