On Wed, Apr 1, 2009 at 1:51 PM, Michael McDaniel <[email protected]> wrote:
> On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
>> I'd be very interested to know the performance impact of that
>> optimization as well. What is the overhead or bottleneck with large
>> view values? Estimating 100 bytes per key/value pair within each of the
>> million documents, that's 2GB of raw data, which should write to a
>> laptop disk within 2 minutes.
>>
>> I'm wondering whether it matters how large the view values are, since
>> they would seem not to be involved in the view processing very
>> much--only written to disk in the order defined by the keys.
>>
>> Of course, that goes against the common wisdom that the fastest thing to
>> do is emit(key, null); but that could impact the application
>> significantly since you have to query again for the documents. (I'm
>> unsure whether include_docs has a performance penalty either.)
>>
>> I guess what I'm asking is, why does the value side of views impact
>> performance so greatly?
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> last I checked, there is an Erlang term() <-> JSON <-> Erlang term()
> conversion for values on the initial view (to/from view server)
>
Not to mention the JSON -> JS_Object* -> JSON inside of couchjs though
that's probably quite a bit quicker :D

> ~M
>
>>
>> kowsik wrote:
>>> I would highly recommend that you do emit(doc.field, null) so that the
>>> key space doesn't get unwieldy and large. Since the id of the document
>>> is part of the map results, you can always fetch it using
>>> include_docs=true.
>>>
>>> K.
>>>
>>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>>> <[email protected]> wrote:
>>>> hi All,
>>>>
>>>> We have been using couchdb (built out of trunk) for prototyping an idea
>>>> and would like to thank and congratulate you folks for a simple and usable
>>>> schema free db.
>>>>
>>>> We plan to store few million documents in couchdb and we would like to
>>>> create couple of views to fetch the data appropriately. We have inserted a
>>>> million documents (each containing about 20 fields). We are
>>>> indexing/creating a view on a particular field of the document. The map
>>>> function of the view is simple straight forward emit (emit(doc.field,
>>>> doc)). It takes about 90 mins to build the required B-Tree index the first
>>>> time. All the subsequent queries are performing extremely well (milli
>>>> second responses). Can anything be done to reduce the 90 mins taken to
>>>> build the required B-Tree index the first time?
>>>>
>>>> Environment details:
>>>> Couchdb - 0.9.0a757326
>>>> Erlang - 5.6.5
>>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686
>>>> GNU/Linux
>>>> Ubuntu distribution
>>>> Centrino Dual core, 4GB RAM laptop
>>>>
>>>> Thanks
>>>> Manju
>>>>
>>
>> --
>> Jason Smith
>> Proven Corporation
>> Bangkok, Thailand
>> http://www.proven-corporation.com
>
> --
> Michael McDaniel
> Portland, Oregon, USA
> http://trip.autosys.us
> http://autosys.us
> http://mmcdaniel.com/erlview
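
For anyone who wants to play with the two map variants from this thread,
here is a rough sketch that runs outside CouchDB. It is not how couchjs
works internally: runMap, makeDocs, the field names and the 80-character
field values are all made up for illustration, and the serialized row size
is only a stand-in for the amount of data that has to pass through the
JSON conversions mentioned above. It also redoes Jason's back-of-envelope
estimate, taking 20 fields per document from Manju's description.

```javascript
// In CouchDB the view server provides emit(); here runMap() assigns a
// hypothetical stand-in so the map functions look like the ones in the thread.
let emit;

// The two variants discussed in the thread.
const mapWithDoc = function (doc) { emit(doc.field, doc); };
const mapWithNull = function (doc) { emit(doc.field, null); };

// Hypothetical driver: calls the map function once per document and
// collects the emitted rows together with the document id.
function runMap(mapFn, docs) {
  const rows = [];
  for (const doc of docs) {
    emit = (key, value) => rows.push({ id: doc._id, key: key, value: value });
    mapFn(doc);
  }
  return rows;
}

// Made-up sample data: about 20 fields per document, as in Manju's setup.
function makeDocs(count) {
  const docs = [];
  for (let i = 0; i < count; i++) {
    const doc = { _id: "doc-" + i, field: "key-" + i };
    for (let f = 0; f < 19; f++) {
      doc["f" + f] = "x".repeat(80);
    }
    docs.push(doc);
  }
  return docs;
}

// Length of the JSON text for all rows. The sample data is ASCII only,
// so the character count equals the byte count.
function serializedSize(rows) {
  return JSON.stringify(rows).length;
}

const docs = makeDocs(1000);
const withDocSize = serializedSize(runMap(mapWithDoc, docs));
const withNullSize = serializedSize(runMap(mapWithNull, docs));

// Jason's estimate: 1 million docs x 20 fields x 100 bytes per pair.
const estimatedBytes = 1e6 * 20 * 100;
// Throughput implied by "within 2 minutes" versus the reported 90 minutes.
const mbPerSecAt2Min = estimatedBytes / 1e6 / (2 * 60);
const mbPerSecAt90Min = estimatedBytes / 1e6 / (90 * 60);

console.log("rows with doc as value :", withDocSize, "chars");
console.log("rows with null as value:", withNullSize, "chars");
console.log("estimated raw data     :", estimatedBytes, "bytes");
console.log("MB/s at 2 min / 90 min :", mbPerSecAt2Min.toFixed(2), "/", mbPerSecAt90Min.toFixed(2));
```

With emit(doc.field, null) each row still carries the document id, which
is what lets include_docs=true fetch the document at query time, as kowsik
describes. The sketch says nothing about how long either variant takes to
index in a real CouchDB; that would need to be measured.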
