On Sun, Jan 24, 2010 at 12:09 AM, Chris Anderson <[email protected]> wrote:
> Devs,
>
> I've been thinking there are a few simple options that would magnify
> the power of the replicator a lot.
>
> ...
> The fun one is chained map reduce. It occurred to me the other night
> that simplest way to present a chainable map reduce abstraction to
> users is through the replicator. The action "copy these view rows to a
> new db" is a natural fit for the replicator. I imagine this would be
> super useful to people doing big messy data munging, and it wouldn't
> be too hard for the replicator to handle.
>

I like this idea as well; chainable map/reduce is something I think a lot of people would like to use. The thing I am concerned about, and which is related to another ongoing thread, is the size of views on disk and the slowness of generating them. I fear that chaining views would balloon them on disk to an unmanageable size.

I have an app in production with 50m rows, whose DB has grown to >100GB, and the views take up approx 800GB (!). I don't think I could afford the disk space to even consider using this, especially since compacting a DB or view requires roughly 2x the disk space of the files on disk.

I also worry about the time to generate chained views, when view generation time is already a major weak point of CouchDB (generating my views took more than a week). In practice, I think only those with relatively small DBs would be able to take advantage of this feature.
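For a rough feel of the disk arithmetic above, here is a back-of-the-envelope sketch in Python. It takes the rounded figures from this message (about 100GB of DB, about 800GB of views) and reads "roughly 2x" as "the file being compacted temporarily needs a second copy of itself". The function name, the dictionary keys, and the size of the chained stage are hypothetical, chosen only for illustration; nothing here measures a real CouchDB instance.

```python
# Back-of-the-envelope estimate only; all names are illustrative.
# Assumption: compacting a file needs roughly 2x its size on disk,
# i.e. one extra copy while the compacted file is being written.
COMPACTION_FACTOR = 2.0


def peak_disk_gb(sizes_gb, compacting):
    """Approximate peak disk use (GB) while one file is being compacted.

    All files stay on disk at their current size; the file named by
    `compacting` temporarily needs (COMPACTION_FACTOR - 1) extra copies.
    """
    resting = sum(sizes_gb.values())
    extra = sizes_gb[compacting] * (COMPACTION_FACTOR - 1.0)
    return resting + extra


# Rounded figures from the message: DB ">100GB", views "approx 800GB".
sizes = {"db": 100.0, "views": 800.0}
print(sum(sizes.values()))           # disk used at rest
print(peak_disk_gb(sizes, "views"))  # peak while compacting the views

# Hypothetical: a chained stage whose output is as large as the
# existing views. The real size depends entirely on the map/reduce.
chained = dict(sizes, chained_views=800.0)
print(peak_disk_gb(chained, "chained_views"))
```

Under these assumptions, the existing setup already needs about 1.7TB free to compact the views, and every chained stage of comparable size adds its own resting size plus its own compaction headroom on top.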
