That is awesome, Alex! This is what I always dream of when our document schemas get an update that has to be applied to a large number of databases. The trick with a local database mirror and replication works, but it's not very fast.

A few questions:

1. What happens if normalization fails for some reason, for example because of an update conflict? Do I have to start the whole normalization again, or will it just log the conflict and try to apply the script to the document once more?

2. Is it possible to get the final result of the normalization job, and an error description if it fails? Continuously polling _active_tasks doesn't look optimal, and it's possible to miss the moment when normalization fails.

3. Only Erlang and Elixir migration scripts are possible, right? Or is it possible to use any script that supports the CouchDB stdio communication protocol? I suppose many of us who have faced the same problem already have well-tested Python/Ruby/whatever-lang scripts that handle schema normalization successfully. It would be helpful not to force them to redo the whole job.

--
,,,^..^,,,

On Wed, Feb 13, 2013 at 7:04 PM, Алекс Zatvornitskiy <[email protected]> wrote:
> Hi everybody!
>
> A couch_normalizer v0.6 is out!
>
> The couch_normalizer is designed as a standard Apache CouchDB httpd handler
> and uses a Rails db migration approach. Written both in Erlang and Elixir.
> Works well in production and has great IO performance.
>
> For example:
>
> % Starts a normalization process.
>
> % curl -v -XPOST -H"Content-Type: application/json" http://127.0.0.1:5984/db/_normalize
>
> % => {"ok":"Normalization process has been started (<0.174.0>)."}
>
>
> % Gets a normalization process execution status.
>
> % curl -v http://127.0.0.1:5984/_active_tasks
>
> % =>
> [{"pid":"<0.174.0>","continue":false,"db":"db","docs_conflicted":0,"docs_deleted":0,"docs_normalized":0,"docs_read":3000,"finished_on":1358513508,"num_workers":5,"started_on":1358513508,"type":"normalization","updated_on":1358513508}]
>
> As a result, it lets you deploy migration scripts (aka scenarios) and
> change a large number of documents as fast as possible (without HTTP overhead
> or any kind of 'delayed jobs') via internal CouchDB functions, such as
> couch_db:open_doc/2, couch_db:update_doc/3 and so on.
>
> Check more: https://github.com/datahogs/couch_normalizer
>
> If you want to contribute, feel free to open a GitHub issue, submit a
> pull request, or ping me offline :)
>
> It is still under scoping and development.
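
For reference, the polling mentioned in question 2 might look roughly like the sketch below. It is Python rather than curl, the helper names (fetch_active_tasks, find_normalization_task, summarize) are made up for illustration, and it relies only on the _active_tasks fields shown in the quoted output. It is not part of couch_normalizer.

```python
import json
import urllib.request
from typing import Optional

# Sample payload copied from the quoted _active_tasks output above.
SAMPLE = (
    '[{"pid":"<0.174.0>","continue":false,"db":"db","docs_conflicted":0,'
    '"docs_deleted":0,"docs_normalized":0,"docs_read":3000,'
    '"finished_on":1358513508,"num_workers":5,"started_on":1358513508,'
    '"type":"normalization","updated_on":1358513508}]'
)


def fetch_active_tasks(base_url: str = "http://127.0.0.1:5984") -> str:
    """Fetch the raw _active_tasks JSON (hypothetical helper, needs a live server)."""
    with urllib.request.urlopen(base_url + "/_active_tasks") as resp:
        return resp.read().decode("utf-8")


def find_normalization_task(active_tasks_json: str, db: str) -> Optional[dict]:
    """Return the first task of type "normalization" for `db`, or None if not listed."""
    for task in json.loads(active_tasks_json):
        if task.get("type") == "normalization" and task.get("db") == db:
            return task
    return None


def summarize(task: dict) -> str:
    """Format the progress counters of one normalization task."""
    return (
        f"{task['db']}: read={task['docs_read']} "
        f"normalized={task['docs_normalized']} "
        f"conflicted={task['docs_conflicted']}"
    )


if __name__ == "__main__":
    task = find_normalization_task(SAMPLE, "db")
    print(summarize(task) if task else "no normalization task listed")
```

When find_normalization_task returns None, the caller only knows the task is not listed. It cannot tell a successful finish from a failure, which is why a final result with an error description would be useful.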
