On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
> I'd be very interested to know the performance impact of that
> optimization as well. What is the overhead or bottleneck with large
> view values? Estimating 100 bytes per key/value pair within each of the
> million documents, that's 2GB of raw data, which should write to a
> laptop disk within 2 minutes.
>
> I'm wondering whether it matters how large the view values are, since
> they would seem not to be involved in the view processing very
> much--only written to disk in the order defined by the keys.
>
> Of course, that goes against the common wisdom that the fastest thing to
> do is emit(key, null); but that could impact the application
> significantly since you have to query again for the documents. (I'm
> unsure whether include_docs has a performance penalty either.)
>
> I guess what I'm asking is, why does the value side of views impact
> performance so greatly?
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Last I checked, there is an Erlang term() <-> JSON <-> Erlang term()
conversion for values on the initial view build (to/from the view server).

~M

>
> kowsik wrote:
>> I would highly recommend that you do emit(doc.field, null) so that the
>> key space doesn't get unwieldy and large. Since the id of the document
>> is part of the map results, you can always fetch it using
>> include_docs=true.
>>
>> K.
>>
>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>> <[email protected]> wrote:
>>> hi All,
>>>
>>> We have been using couchdb (built out of trunk) for prototyping an idea and
>>> would like to thank and congratulate you folks for a simple and usable
>>> schema free db.
>>>
>>> We plan to store few million documents in couchdb and we would like to
>>> create couple of views to fetch the data appropriately. We have inserted a
>>> million documents (each containing about 20 fields). We are
>>> indexing/creating a view on a particular field of the document. The map
>>> function of the view is simple straight forward emit (emit(doc.field,
>>> doc)). It takes about 90 mins to build the required B-Tree index the first
>>> time. All the subsequent queries are performing extremely well (milli
>>> second responses). Can anything be done to reduce the 90 mins taken to
>>> build the required B-Tree index the first time?
>>>
>>> Environment details:
>>> Couchdb - 0.9.0a757326
>>> Erlang - 5.6.5
>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686
>>> GNU/Linux
>>> Ubuntu distribution
>>> Centrino Dual core, 4GB RAM laptop
>>>
>>> Thanks
>>> Manju
>>>
>
> --
> Jason Smith
> Proven Corporation
> Bangkok, Thailand
> http://www.proven-corporation.com

--
Michael McDaniel
Portland, Oregon, USA
http://trip.autosys.us
http://autosys.us
http://mmcdaniel.com/erlview
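
For anyone comparing the two map functions discussed in this thread, here is a rough sketch of how much data each one pushes through the JSON conversion mentioned above. It is only an illustration, not CouchDB's actual view server: the in-memory emit() stub, the helper names (emittedChars, sampleDoc) and the made-up sample documents are all assumptions, and the only thing taken from the thread is the two emit() calls.

```javascript
// Hypothetical stand-in for the view server's emit(); it only collects rows in memory.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function from the original post: the whole document is the value.
function mapWithDoc(doc) {
  emit(doc.field, doc);
}

// Map function suggested in the thread: key only, null value.
function mapKeyOnly(doc) {
  emit(doc.field, null);
}

// Total length of the JSON text for every row a map function emits.
// (Characters, not bytes; the sample data below is ASCII, so they coincide.)
function emittedChars(mapFn, docs) {
  rows.length = 0; // reset the collector between runs
  docs.forEach(mapFn);
  return rows.reduce(
    (total, row) => total + JSON.stringify([row.key, row.value]).length,
    0
  );
}

// Made-up sample document: the indexed field plus 19 filler fields of
// 100 characters each, roughly matching the "about 20 fields" in the thread.
function sampleDoc(i) {
  const doc = { _id: "doc-" + i, field: "key-" + i };
  for (let f = 0; f < 19; f++) {
    doc["f" + f] = "x".repeat(100);
  }
  return doc;
}

const docs = Array.from({ length: 1000 }, (_, i) => sampleDoc(i));
const withDoc = emittedChars(mapWithDoc, docs);
const keyOnly = emittedChars(mapKeyOnly, docs);
console.log({ withDoc: withDoc, keyOnly: keyOnly });
```

The sketch says nothing about where the 90 minutes actually go; it only shows that emit(doc.field, doc) makes every row carry the full document through serialization, while emit(doc.field, null) leaves the document to be fetched at query time with include_docs=true.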
