Re: Suggestions on View performance optimization/improvement

Michael McDaniel Wed, 01 Apr 2009 10:52:12 -0700

On Thu, Apr 02, 2009 at 12:31:17AM +0700, Jason Smith wrote:
> I'd be very interested to know the performance impact of that  
> optimization as well.  What is the overhead or bottleneck with large  
> view values?  Estimating 100 bytes per key/value pair within each of the  
> million documents, that's 2GB of raw data, which should write to a  
> laptop disk within 2 minutes.
>
> I'm wondering whether it matters how large the view values are, since  
> they would seem not to be involved in the view processing very  
> much--only written to disk in the order defined by the keys.
>
> Of course, that goes against the common wisdom that the fastest thing to  
> do is emit(key, null); but that could impact the application  
> significantly since you have to query again for the documents.  (I'm  
> unsure whether include_docs has a performance penalty either.)
>
> I guess what I'm asking is, why does the value side of views impact  
> performance so greatly?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


 Last I checked, every emitted value goes through an Erlang term() ->
 JSON -> Erlang term() conversion during the initial view build (on the
 way to and from the view server), so larger values mean more data to
 serialize and parse.
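
 To get a rough feel for how much JSON that conversion has to move, here is
 a minimal sketch that compares the serialized size of one map result for
 emit(doc.field, doc) against emit(doc.field, null). It does not talk to
 CouchDB. makeDoc, runMap, and resultSize are hypothetical helpers, emit is
 passed in as a parameter instead of being the view server's global, and the
 document shape (20 fields of about 100 bytes) just follows the estimate
 quoted above.

```javascript
// Hypothetical document: 20 fields of roughly 100 bytes each, following
// the estimate in the thread. Not taken from a real database.
function makeDoc(i) {
  const doc = { _id: "doc-" + i, field: "key-" + (i % 1000) };
  for (let f = 0; f < 20; f++) {
    doc["f" + f] = "x".repeat(90);
  }
  return doc;
}

// Stand-in for the view server: runs a map function against one document
// and collects the [key, value] rows it emits.
function runMap(mapFn, doc) {
  const rows = [];
  const emit = (key, value) => rows.push([key, value]);
  mapFn(doc, emit);
  return rows;
}

// The two map functions discussed in the thread.
const mapWithDoc = (doc, emit) => emit(doc.field, doc);
const mapWithNull = (doc, emit) => emit(doc.field, null);

// Length of the JSON text for one document's map results. The sample
// data is ASCII, so characters and bytes are the same here.
function resultSize(mapFn, doc) {
  return JSON.stringify(runMap(mapFn, doc)).length;
}

const sample = makeDoc(1);
const withDoc = resultSize(mapWithDoc, sample);
const withNull = resultSize(mapWithNull, sample);

console.log("emit(doc.field, doc): ", withDoc, "characters per document");
console.log("emit(doc.field, null):", withNull, "characters per document");
```

 This only measures the size of the map output. It says nothing about how
 long the actual conversion or the B-Tree writes take on a given machine.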

~M

>
> kowsik wrote:
>> I would highly recommend that you do emit(doc.field, null) so that the
>> key space doesn't get unwieldy and large. Since the id of the document
>> is part of the map results, you can always fetch it using
>> include_docs=true.
>>
>> K.
>>
>> On Wed, Apr 1, 2009 at 10:12 AM, Manjunath Somashekhar
>> <[email protected]> wrote:
>>> hi All,
>>>
>>> We have been using couchdb (built out of trunk) for prototyping an idea and 
>>> would like to thank and congratulate you folks for a simple and usable 
>>> schema free db.
>>>
>>> We plan to store a few million documents in couchdb and we would like to 
>>> create a couple of views to fetch the data appropriately. We have inserted a 
>>> million documents (each containing about 20 fields). We are 
>>> indexing/creating a view on a particular field of the document. The map 
>>> function of the view is a simple, straightforward emit (emit(doc.field, 
>>> doc)). It takes about 90 minutes to build the required B-Tree index the first 
>>> time. All subsequent queries perform extremely well (millisecond 
>>> responses). Can anything be done to reduce the 90 minutes taken to 
>>> build the required B-Tree index the first time?
>>>
>>> Environment details:
>>> Couchdb - 0.9.0a757326
>>> Erlang - 5.6.5
>>> Linux kernel - 2.6.24-23-generic #1 SMP Mon Jan 26 00:13:11 UTC 2009 i686 
>>> GNU/Linux
>>> Ubuntu distribution
>>> Centrino Dual core, 4GB RAM laptop
>>>
>>> Thanks
>>> Manju
>>>
>>>
>>>
>>>
>
> -- 
> Jason Smith
> Proven Corporation
> Bangkok, Thailand
> http://www.proven-corporation.com

-- 
Michael McDaniel
Portland, Oregon, USA
http://trip.autosys.us
http://autosys.us
http://mmcdaniel.com/erlview
