Re: Suggestions on View performance optimization/improvement

Paul Davis Wed, 01 Apr 2009 10:55:28 -0700

On Wed, Apr 1, 2009 at 1:51 PM, Michael McDaniel <[email protected]> wrote:
> On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
>> I'd be very interested to know the performance impact of that
>> optimization as well.  What is the overhead or bottleneck with large
>> view values?  Estimating 100 bytes per key/value pair within each of the
>> million documents, that's 2GB of raw data, which should write to a
>> laptop disk within 2 minutes.
>>
>> I'm wondering whether it matters how large the view values are, since
>> they would seem not to be involved in the view processing very
>> much--only written to disk in the order defined by the keys.
>>
>> Of course, that goes against the common wisdom that the fastest thing to
>> do is emit(key, null); but that could impact the application
>> significantly since you have to query again for the documents.  (I'm
>> unsure whether include_docs has a performance penalty either.)
>>
>> I guess what I'm asking is, why does the value side of views impact
>> performance so greatly?
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>  Last I checked, there is an Erlang term() <-> JSON <-> Erlang term()
>  conversion for values on the initial view build (to/from the view server).
>


Not to mention the JSON -> JS_Object* -> JSON conversion inside of couchjs,
though that's probably quite a bit quicker :D
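
A rough sketch of why the emitted value matters, under stated assumptions:
runMap() and makeDoc() below are hypothetical stand-ins, not couchjs or any
CouchDB API, and the 20 fields of about 100 bytes each come from the estimate
quoted above. It only counts how much JSON each map function would push back
across the view server boundary; it does not measure Erlang-side conversion
or disk writes.

```javascript
// Sketch only: a stand-in for the view server loop, not couchjs itself.
// It totals the JSON length of every emitted row (characters, which
// equal bytes here because the sample data is plain ASCII).
function runMap(mapFn, docs) {
  let jsonOut = 0;
  const emit = (key, value) => {
    // Each emitted row is serialized to JSON on its way back to Erlang.
    jsonOut += JSON.stringify([key, value]).length;
  };
  for (const doc of docs) {
    // The doc arrives as JSON text and is parsed into a JS object first.
    mapFn(JSON.parse(JSON.stringify(doc)), emit);
  }
  return jsonOut;
}

// Hypothetical document: 20 fields of roughly 100 bytes each.
function makeDoc(i) {
  const doc = { _id: "doc-" + i, field: "key-" + i };
  for (let f = 0; f < 20; f++) {
    doc["f" + f] = "x".repeat(90);
  }
  return doc;
}

const docs = Array.from({ length: 1000 }, (_, i) => makeDoc(i));

const withDoc = runMap((doc, emit) => emit(doc.field, doc), docs);
const withNull = runMap((doc, emit) => emit(doc.field, null), docs);

console.log("emit(doc.field, doc):  " + withDoc + " chars of JSON");
console.log("emit(doc.field, null): " + withNull + " chars of JSON");
```

With emit(doc.field, doc) the whole document is serialized a second time on
the way out; with emit(doc.field, null) only the key and a null go back.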

> ~M
>
>>
>> kowsik wrote:
>>> I would highly recommend that you do emit(doc.field, null) so that the
>>> key space doesn't get unwieldy and large. Since the id of the document
>>> is part of the map results, you can always fetch it using
>>> include_docs=true.
>>>
>>> K.
>>>
>>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>>> <[email protected]> wrote:
>>>> hi All,
>>>>
>>>> We have been using couchdb (built out of trunk) for prototyping an idea
>>>> and would like to thank and congratulate you folks for a simple and usable
>>>> schema-free db.
>>>>
>>>> We plan to store a few million documents in couchdb and would like to
>>>> create a couple of views to fetch the data appropriately. We have inserted a
>>>> million documents (each containing about 20 fields). We are creating a
>>>> view indexed on one particular field of the document. The map function of
>>>> the view is a simple, straightforward emit (emit(doc.field, doc)). It takes
>>>> about 90 minutes to build the required B-tree index the first time. All
>>>> subsequent queries perform extremely well (millisecond responses). Can
>>>> anything be done to reduce the 90 minutes taken to build the B-tree index
>>>> the first time?
>>>>
>>>> Environment details:
>>>> Couchdb - 0.9.0a757326
>>>> Erlang - 5.6.5
>>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 
>>>> GNU/Linux
>>>> Ubuntu distribution
>>>> Centrino Dual core, 4GB RAM laptop
>>>>
>>>> Thanks
>>>> Manju
>>>>
>>>>
>>>>
>>>>
>>
>> --
>> Jason Smith
>> Proven Corporation
>> Bangkok, Thailand
>> http://www.proven-corporation.com
>
> --
> Michael McDaniel
> Portland, Oregon, USA
> http://trip.autosys.us
> http://autosys.us
> http://mmcdaniel.com/erlview
>
>
