On 29/03/2008, Johan Sørensen <[EMAIL PROTECTED]> wrote:
> I am however curious why I don't get the document id sent to the
> notifier daemon/client on a database update? Because it's not
> available on the update_loop function?
I expected to be sent the document id at first, but then I realised that
sending only the database name, and leaving the indexing process to track
its own progress, means the whole thing is less coupled and copes with
indexer failure better.

Less coupled, because the indexer is free to choose how much of the
database it indexes each time (although hopefully it will only process the
changes). Possibly a contrived example, but perhaps the indexer only
processes changes once per hour.

Copes better with indexer failure, because if the indexer crashes it can
be fixed in place & restarted, and it can pick up from its last known good
state. If the indexer is written to be idempotent then it doesn't even
matter if it processes some changes twice.

Anyway, that's how I've been doing it ;-). My basic process is:

1. CouchDB tells my indexer a database has been changed.
2. I load the last known state (the 'key' recorded in step 6).
3. Call /<dbname>/_all_docs_by_seq with a startkey of the key (if any)
   and a count of 100 (for instance).
4. Process the batch of changes returned from the above request.
5. Repeat steps 3 & 4 until there's nothing left to process.
6. Record the key of the last change processed.
7. Go back to waiting for CouchDB to send another database name.

Of course, it's quite possible to parallelise a couple of parts of the
above process.

- Matt
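
For illustration, steps 2 to 6 above can be sketched in Python roughly as follows. This is not the code from this thread, only a minimal sketch under stated assumptions: `index_database`, `fetch_changes`, `process_batch` and `state` are hypothetical names. `fetch_changes` stands in for the request to /<dbname>/_all_docs_by_seq and is assumed to return a list of rows, each with a "key" and an "id", that come strictly after the given key. The sketch also records the key after every batch rather than once at the end, which is a variation on step 6 so that a crash mid-run loses at most one batch.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
# Hypothetical stand-in for GET /<dbname>/_all_docs_by_seq?startkey=..&count=..
# Assumed to return rows ordered by "key", strictly after start_key.
Fetch = Callable[[str, Optional[int], int], List[Row]]


def index_database(
    db_name: str,
    fetch_changes: Fetch,
    process_batch: Callable[[List[Row]], None],
    state: Dict[str, int],
    batch_size: int = 100,
) -> Optional[int]:
    """Process all pending changes for db_name in batches.

    Returns the key of the last change processed (or the previously
    recorded key if there was nothing new).
    """
    last_key = state.get(db_name)  # step 2: load the last known state
    while True:
        rows = fetch_changes(db_name, last_key, batch_size)  # step 3
        if not rows:
            break  # step 5: nothing left to process
        process_batch(rows)  # step 4: should be idempotent
        last_key = rows[-1]["key"]
        state[db_name] = last_key  # step 6, done per batch (assumption)
    return last_key
```

Because `process_batch` runs before the key is recorded, a crash between the two means the same batch is processed again on restart, which is why the idempotence mentioned above matters. In a real indexer `state` would need to live somewhere durable (a file or a document), not in an in-memory dict.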
