Hi Volker,

>> Do you mean "emitting the full document ..."?

You already answered my question: when relying on the SDK, the full documents are retrieved using the IDs obtained through the specified view.

Thanks for the help!

Regards,

-- Tito

On Jun 16, 2014, at 12:52 AM, Volker Mische <[email protected]> wrote:

> Hi Tito,
>
> On 06/16/2014 06:35 AM, Tito Ciuro wrote:
>> Hi,
>>
>> I've been using CouchDB for a while and now I'm evaluating Couchbase.
>> I'm wondering what's the best way to determine when to emit data vs
>> null. I typically avoid emitting the whole document if it's too "large"
>> (i.e. 1 MB or so) because the index would grow way too much. In this
>> case, I tend to emit null and then collect the documents via
>> include_docs. However, if the data set is small (or all I need is a
>> subset of the document), then I emit this subset, as it's faster and puts
>> less strain on the storage system. There is also the potential for a
>> race condition. As per CouchDB's documentation
>> (http://wiki.apache.org/couchdb/HTTP%5Fview%5FAPI):
>>
>>     The include_docs option will include the associated document.
>>     However, the user should keep in mind that there is a race condition
>>     when using this option. It is possible that between reading the view
>>     data and fetching the corresponding document that the document has
>>     changed. If you want to alleviate such concerns you should emit an
>>     object with a _rev attribute as in emit(key, {"_rev": doc._rev}).
>>     This alleviates the race condition but leaves the possibility that
>>     the returned document has been deleted (in which case, it includes
>>     the "_deleted": true attribute).
>>
>>     Note: include_docs will cause a single document lookup per returned
>>     view result row. This adds significant strain on the storage system
>>     if you are under high load or return a lot of rows per request. If
>>     you are concerned about this, you can emit the full doc in each row;
>>     this will increase view index time and space requirements, but will
>>     make view reads optimally fast.
>
> The Couchbase implementation of include_docs is different. If you use an
> SDK, it requests the view to get all the IDs and then it fetches the full
> docs via a memcache GET. In the upcoming version of Couchbase (3.0) the
> original include_docs of the views will completely go away and it will only
> be supported through the SDKs (don't worry, the API won't change when you
> use the SDKs).
>
>> Since Couchbase utilizes memcache, storing and retrieving data is a
>> whole different game: while in general a CouchDB document should not be
>> split and related into other documents (it's not an RDBMS!), it seems to
>> be perfectly fine in Couchbase. Because get/set/multiget are cheap
>> operations, it's perfectly feasible to "break" a document into smaller
>> pieces and retrieve them piecemeal. It seems this would be great for
>> memcache because it'd allow it to cache the documents that are used the
>> most. On the other hand, keeping a document "monolithic" not only makes
>> the index larger, but it makes it less efficient to cache (it's an all
>> or nothing proposition.)
>>
>> So it seems that a valid approach in Couchbase would be to:
>>
>> 1) break "large" documents into smaller, more manageable ones. Retrieve
>> them via get/multiget (cheap op) and let memcache cache them as
>> efficiently as possible.
>> 2) emit small data subsets as needed, as opposed to the entire document
>> where possible.
>> 3) for those queries where the entire document needs to be retrieved...
>> what then?:
>>
>> 3.1) should we emit null and include_docs=true?
>> 3.2) should we emit the entire document instead?
>
> You would emit null and let the SDK do the rest.
>
>> It's clear that always emitting null in CouchDB puts a lot of pressure
>> on the storage system. But what about Couchbase? Are there any best
>> practices to be followed?
>
> Do you mean "emitting the full document ..."?
>
> Cheers,
> Volker
>
> --
> You received this message because you are subscribed to a topic in the Google
> Groups "Couchbase" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/couchbase/Y385HZQ73k0/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> For more options, visit https://groups.google.com/d/optout.
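
A minimal sketch of the pattern discussed in this thread: emit null from the view, then fetch the full documents by ID. It does not use the real Couchbase SDK. The `bucket` Map, `buildIndex`, and `queryWithDocs` are hypothetical stand-ins, and `emit` is passed in as a parameter only so the snippet runs on its own (in a real view it is a global provided by the server).

```javascript
// Hypothetical in-memory stand-in for a bucket: document ID -> document.
const bucket = new Map([
  ["user::1", { type: "user", name: "Ada", bio: "x".repeat(1000) }],
  ["user::2", { type: "user", name: "Linus", bio: "y".repeat(1000) }],
  ["order::1", { type: "order", total: 42 }],
]);

// View-style map function: index users by name and emit null as the value,
// so the (potentially large) document body is not copied into the index.
function mapUsersByName(doc, meta, emit) {
  if (doc.type === "user") {
    emit(doc.name, null);
  }
}

// Step 1: build the view rows (id, key, value), sorted by key,
// roughly as a view engine would.
function buildIndex(store, mapFn) {
  const rows = [];
  for (const [id, doc] of store) {
    mapFn(doc, { id: id }, (key, value) => rows.push({ id, key, value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Step 2: the SDK-side equivalent of include_docs as described above:
// take the IDs from the view rows and fetch each full document by ID.
// A document removed between the two steps comes back as null.
function queryWithDocs(store, rows) {
  return rows.map((row) => ({
    id: row.id,
    key: row.key,
    value: row.value,
    doc: store.has(row.id) ? store.get(row.id) : null,
  }));
}

const rows = buildIndex(bucket, mapUsersByName);
const results = queryWithDocs(bucket, rows);
console.log(results.map((r) => [r.key, r.doc && r.doc.name]));
```

Because the lookup in step 2 happens after the view read in step 1, a document can in principle change or disappear in between, which is the same kind of race the CouchDB documentation quoted above describes. That is why the sketch tolerates a missing document instead of assuming it exists.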
