On Sun, Jan 24, 2010 at 12:09 AM, Chris Anderson <[email protected]> wrote:

> Devs,
>
> I've been thinking there are a few simple options that would magnify
> the power of the replicator a lot.
>
> ...
> The fun one is chained map reduce. It occurred to me the other night
> that the simplest way to present a chainable map reduce abstraction to
> users is through the replicator. The action "copy these view rows to a
> new db" is a natural fit for the replicator. I imagine this would be
> super useful to people doing big messy data munging, and it wouldn't
> be too hard for the replicator to handle.
>
>
I like this idea as well; chainable map/reduce is something I think a lot of
people would like to use. My concern, which relates to another ongoing
thread, is the size of views on disk and how slowly they are generated. I
fear that chaining views would balloon them on disk to an unmanageable size.
I have an app in production with 50 million rows whose DB has grown to >100GB,
and its views take up approx. 800GB (!). I don't think I could afford the
disk space to even consider using this, especially since compacting a DB or
view requires roughly 2x the disk space of the files on disk.
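
To put rough numbers on that, here is a back-of-envelope sketch in Python. It
assumes the "roughly 2x" figure applies to whichever file is being compacted,
that only one file is compacted at a time, and that 100GB is a lower bound for
the DB. The function names are purely illustrative and are not part of CouchDB.

```python
# Back-of-envelope estimate; all sizes are in GB and come from the figures above.
COMPACTION_FACTOR = 2.0  # the "roughly 2x" rule of thumb, not a measured value


def compaction_peak_gb(file_gb, factor=COMPACTION_FACTOR):
    """Approximate peak disk use of a single file while it is being compacted."""
    return file_gb * factor


def peak_total_gb(db_gb, views_gb, factor=COMPACTION_FACTOR):
    """Worst-case total footprint when the DB and views are compacted one at a time."""
    during_db = compaction_peak_gb(db_gb, factor) + views_gb
    during_views = db_gb + compaction_peak_gb(views_gb, factor)
    return max(during_db, during_views)


db_gb, views_gb = 100, 800  # lower bound for the DB, approximate for the views

print("steady state:", db_gb + views_gb, "GB")
print("peak while compacting:", peak_total_gb(db_gb, views_gb), "GB")
```

Under those assumptions the views dominate: the peak comes from compacting
them, not the DB, and any chained view would add its own file on top of that.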

I also worry about the time needed to generate chained views, since view
generation time is already a major weak point of CouchDB (generating my
views took more than a week).

In practice, I think only those with relatively small DBs would be able to
take advantage of this feature.
