I just checked in the first round of refactoring of the front-end HTTPd code. Previously, nearly 100% of the CouchDB httpd was in couch_httpd.erl; now it's been divided up into couch_httpd_db.erl, couch_httpd_view.erl and couch_httpd_misc_handlers.erl.

The couch_httpd module now mostly provides a wrapper around mochiweb and implements a dispatch facility for finding the correct module to handle incoming HTTP requests.

On startup, CouchDB reads the ini config files to figure out which module and function, if any, should be invoked in response to a special HTTP request.

Here is an example of a default.ini read at startup:
[httpd_global_handlers]
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
_utils = {couch_httpd_misc_handlers, handle_utils_dir_req, "../../share/www"}
_all_dbs = {couch_httpd_misc_handlers, handle_all_dbs_req}
_config = {couch_httpd_misc_handlers, handle_config_req}
_replicate = {couch_httpd_misc_handlers, handle_replicate_req}
_uuids = {couch_httpd_misc_handlers, handle_uuids_req}
_restart = {couch_httpd_misc_handlers, handle_restart_req}

[httpd_db_handlers]
_view = {couch_httpd_view, handle_view_req}
_temp_view = {couch_httpd_view, handle_temp_view_req}

[daemons]
view_manager={couch_view, start_link, []}
db_update_notifier={couch_db_update_notifier_sup, start_link, []}
full_text_query={couch_ft_query, start_link, []}
query_servers={couch_query_servers, start_link, []}
httpd={couch_httpd, start_link, []}

/ end
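To make the "read the ini into memory" step concrete, here is a rough sketch in Python (CouchDB itself is Erlang, so this is only a model, not the actual loader). It reads a trimmed copy of the example above into plain dictionaries, one per section. SAMPLE_INI, load_sections and CONFIG are names made up for this illustration.

```python
import configparser

# Trimmed copy of the example ini above; values are kept as raw text.
SAMPLE_INI = """\
[httpd_global_handlers]
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
_all_dbs = {couch_httpd_misc_handlers, handle_all_dbs_req}

[httpd_db_handlers]
_view = {couch_httpd_view, handle_view_req}

[daemons]
httpd={couch_httpd, start_link, []}
"""


def load_sections(text):
    """Return {section: {key: raw_value}} for every section in the ini text."""
    # Only "=" separates key from value, and "%" gets no special treatment.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


# Hypothetical in-memory dictionary, keyed by section and then by URL segment.
CONFIG = load_sections(SAMPLE_INI)
```

In this model the values stay as unparsed Erlang term text; turning them into a module, function and argument is a separate step.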

The [httpd_global_handlers] entries name the modules and functions that get invoked for special URLs (plus an optional third argument). After reading the ini key/values into a dictionary in memory, CouchDB parses every incoming request URL to see if the first URL path segment matches a special key. For example, for a request like "GET /_utils/images/image.gif", CouchDB will parse the URL to get the "_utils" part, find the matching "_utils" handler in the handler dictionary, then invoke the handler with the couch_http request object.
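The lookup on the first path segment can be modelled roughly like this. It is a Python sketch, not the Erlang code in couch_httpd. The handler bodies, the request dictionary and the names GLOBAL_HANDLERS, first_path_segment and dispatch_global are all invented for the example.

```python
from urllib.parse import urlsplit


def first_path_segment(url):
    """Return the first path segment of a URL, or "/" for the root."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    return segments[0] if segments else "/"


# Hypothetical stand-ins for the handler functions named in the ini file.
def handle_welcome_req(req):
    return "welcome"


def handle_utils_dir_req(req):
    return "utils:" + req["path"]


# Model of the in-memory handler dictionary, keyed by first path segment.
GLOBAL_HANDLERS = {
    "/": handle_welcome_req,
    "_utils": handle_utils_dir_req,
}


def dispatch_global(method, url):
    """Invoke the matching global handler, or return None if there is none."""
    handler = GLOBAL_HANDLERS.get(first_path_segment(url))
    if handler is None:
        return None  # the caller would fall through to the database module
    return handler({"method": method, "path": urlsplit(url).path})
```

A request for "/_utils/images/image.gif" resolves to the "_utils" entry, while "/somedb/doc" finds no global handler and falls through.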

If there is no matching httpd_global_handler, then CouchDB hands the request off to the couch_httpd_db module, where it might invoke one of the [httpd_db_handlers] for it. The couch_httpd_db module first looks at the second URL path segment (example: in "GET /db/_view/foo", "_view" is the second path segment). If it finds a db handler for it, it opens the database and invokes the handler with the HTTP request and the database passed in as the context. If no handler matches, the couch_httpd_db module attempts to serve the request itself (including some special URLs, like _all_docs and _compact).
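Here is the second-segment lookup in the same rough style. Again this is a Python model under assumptions and not the real couch_httpd_db code. open_db, serve_builtin, DB_HANDLERS and dispatch_db are hypothetical names, and the return values are just tuples so the control flow is easy to see.

```python
def open_db(name):
    """Hypothetical stand-in for opening a database; returns a plain dict."""
    return {"name": name}


def handle_view_req(req, db):
    """Stand-in db handler: receives the request plus the opened database."""
    return ("view", db["name"], req["segments"][2:])


# Model of the [httpd_db_handlers] dictionary, keyed by second path segment.
DB_HANDLERS = {"_view": handle_view_req}


def serve_builtin(req, db_name):
    """Stand-in for the database module serving the request itself."""
    return ("builtin", db_name, req["segments"][1:])


def dispatch_db(method, path):
    """Route a request whose first segment is assumed to be a database name."""
    segments = [s for s in path.split("/") if s]
    req = {"method": method, "segments": segments}
    db_name = segments[0]  # the root URL never gets here in this model
    second = segments[1] if len(segments) > 1 else None
    handler = DB_HANDLERS.get(second)
    if handler is not None:
        # Open the database only once a handler has matched.
        return handler(req, open_db(db_name))
    return serve_builtin(req, db_name)
```

With this model, "/db/_view/foo" goes to the view handler, and "/db/_all_docs" is served by the built-in path.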

This will allow for custom CouchDB database extensions. A simple example that's currently disabled by default is couch_httpd_misc_handlers:increment_update_seq_req/2. Its purpose is to allow a client to increment the database update seq# and have it returned to the client. This was needed by someone using CouchDB as an IMAP storage backend, but probably isn't generally useful. Therefore, anyone who wants to can enable this extension by adding this to their local.ini file:

[httpd_db_handlers]
_increment_update_seq = {couch_httpd_misc_handlers, increment_update_seq_req}

Once enabled, whenever a client does a "POST /db/_increment_update_seq", it will invoke the handler.

The handlers can have a 3rd argument, which must be a valid Erlang term and will always be passed to the handler as an extra arg. In the main example, we pass the welcome message as an argument like this:
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
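As a sketch of how the optional third element changes the call, the Python model below splits a handler value into its parts and passes the extra argument only when one was given. CouchDB parses these values as real Erlang terms. The regular expression here is a simplification that only copes with the shapes shown in this mail, and SPEC_RE, parse_handler_spec and call_handler are invented names.

```python
import re

# Matches "{Module, Function}" or "{Module, Function, Arg}" (simplified).
SPEC_RE = re.compile(r"^\{\s*(\w+)\s*,\s*(\w+)\s*(?:,\s*(.+?)\s*)?\}$")


def parse_handler_spec(value):
    """Split a handler value into (module, function, arg_text).

    arg_text is the raw text of the third element, or None when the
    spec has only a module and a function.
    """
    match = SPEC_RE.match(value.strip())
    if match is None:
        raise ValueError("not a handler spec: " + value)
    return match.group(1), match.group(2), match.group(3)


def call_handler(func, req, arg_text):
    """Pass the extra argument only when the spec supplied one."""
    if arg_text is None:
        return func(req)
    return func(req, arg_text)
```

So the welcome entry yields the raw text of the welcome message as the third part, while an entry like the _all_dbs one yields None and the handler is called with the request alone.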

The daemon support causes CouchDB to load up new OTP server processes. You provide a name as the key and the module, function and start arguments as the value, and CouchDB will attempt to load the modules and start the subprocesses. If the subprocesses crash, CouchDB will restart them just like any other OTP server process. These OTP processes can also spawn external child OS processes if necessary.
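The restart behaviour can be pictured with a toy model like the one below. It is Python, and far simpler than a real OTP supervisor, which also handles restart limits, shutdown order and more. The Supervisor class and its method names are made up for the illustration.

```python
class Supervisor:
    """Toy model of restart-on-crash for named daemons."""

    def __init__(self, specs):
        self.specs = dict(specs)  # name -> zero-argument start function
        self.children = {}        # name -> currently running child
        self.restarts = {}        # name -> number of restarts so far

    def start_all(self):
        """Start one child per configured daemon name."""
        for name, start in self.specs.items():
            self.children[name] = start()
            self.restarts[name] = 0

    def child_crashed(self, name):
        """Replace a crashed child with a freshly started one."""
        self.children[name] = self.specs[name]()
        self.restarts[name] += 1
```

Each key in the model plays the role of a [daemons] key, and the start function stands in for the module, function and start arguments in the value.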

By combining daemons and new HTTP handlers, it is possible to create new CouchDB services, like a full text search engine written in Erlang. The search engine daemon will keep the indexes up to date, and the HTTP handlers will process incoming requests and query the indexes, likely by interacting with the daemon.

I think we might still need to provide a way for CouchDB to find 3rd party extension modules. I'm thinking that should probably be an ini setting, with multiple directories for Erlang to search for modules.

Feedback welcome. Remember, nothing is set in stone, and much can still be done to further organize the code. Fire away.

-Damien
