Matthieu,

Your summary is spot on, as is your explanation of CouchDB's primary/sweet-spot use case, thanks!
So if I understand this correctly (and I'm no Einstein, as this mailing list proves :-)), your suggestion is to emit the whole of my "maven-build-profile" doc as a KEY and then use a list function to request the data out of it. That way I only have to build the view once and can manipulate the data it presents with list functions and CouchDB filters?

Assuming I have understood this correctly, does this not mean my view is going to be absolutely massive?

Thanks

Mike

-----Original Message-----
From: Matthieu Rakotojaona [mailto:[email protected]]
Sent: 24 May 2012 19:04
To: [email protected]
Subject: Re: FW: Am I doing something fundamentally wrong?

On Thu, May 24, 2012 at 8:32 AM, Mike Kimber <[email protected]> wrote:
> The "Build Profile Detail" Map referenced above takes up to 15 hours to build.
> Now once I know what I want that's not necessarily a major issue, but it is
> when I need to discover/explore the data that I need to analyse.

Ok, so what you want is a tool to retrieve information dynamically from a store. I don't think CouchDB is your best bet on this. From what I saw, CouchDB is much more oriented towards storing and accessing static data, which might be derived from some other static data. It's kind of like a static site generator: you put in some content (your blog post, your logo, ...) and it will generate (with map/reduce) static HTML pages that you will serve directly. You can do some post-processing on them, but it will be live. If you're going to analyse the initial content dynamically, you'll have to regenerate the pages every time; this is not the best way to go.

My comparison might seem far-fetched, but I hope you understand my point.

But there is something you can do with CouchDB. Basically what you want is to analyze the 'maven-build-profile' docs. What you can do in your map is just `emit("maven-build-profile",null)`. This will give you what you need to filter the docs. The next step is to fiddle with the list function.
You just have to put your processing in this list function:

```
function(head, req) {
  var row;
  while (row = getRow()) {
    var doc = row.doc;
    // send() expects a string, so serialize the object first
    send(JSON.stringify({"property1": doc.property1, "property2": doc.property2}));
  }
}
```

This simple list function will give you properties 1 and 2 for each of the docs that are processed.

Two words of caution: you will need to add `include_docs=true` to your query string, so that getRow() returns the row _with_ the doc. You will also need to add some newlines and commas yourself, because send() doesn't add them: it just feeds the output with what it has as an argument, without further processing.

This kind of workflow should be flexible enough to give some interesting results, even though it could use a lot of CPU for each request.

Note: you never need to emit the doc._id; it is always included in every pair you emit (it is included in the row you have at query time). If you want to sort by id, emit `null` as a key: the rows will still be sorted by id by default (yes, CouchDB is awesome =]).

--
Matthieu RAKOTOJAONA
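For anyone who wants to try the map-plus-list workflow described above before touching a real database, here is a minimal sketch that runs under plain Node. It is only an illustration, not the exact code from this thread: the `type` field on the docs, the function names `mapBuildProfiles` and `listProperties`, and the sample data are all assumptions, and the `emit()`, `getRow()` and `send()` defined at the bottom are stand-ins for the functions CouchDB provides. The list function shows one way to add the commas and newlines by hand so that the output is a valid JSON array.

```javascript
// Map function: one row per build-profile doc, constant key, null value.
// ASSUMPTION: docs carry a `type` field; the thread does not say how they are tagged.
function mapBuildProfiles(doc) {
  if (doc.type === "maven-build-profile") {
    emit("maven-build-profile", null);
  }
}

// List function: writes a JSON array, adding the commas and newlines by hand.
function listProperties(head, req) {
  var row;
  var first = true;
  send("[\n");
  while ((row = getRow())) {
    var doc = row.doc; // only present when the query uses include_docs=true
    if (!doc) { continue; }
    send((first ? "" : ",\n") +
         JSON.stringify({ property1: doc.property1, property2: doc.property2 }));
    first = false;
  }
  send("\n]\n");
}

// Stand-ins for CouchDB's emit(), getRow() and send(), so the sketch runs under Node.
// The sample docs are made up for illustration.
var sampleDocs = [
  { _id: "a", type: "maven-build-profile", property1: 1, property2: "x" },
  { _id: "b", type: "something-else" },
  { _id: "c", type: "maven-build-profile", property1: 2, property2: "y" }
];
var rows = [];
var output = "";
var currentDoc = null;

function emit(key, value) {
  rows.push({ id: currentDoc._id, key: key, value: value, doc: currentDoc });
}
function getRow() { return rows.shift() || null; }
function send(chunk) { output += chunk; }

// Build the "view", then run the list function over it.
sampleDocs.forEach(function (doc) { currentDoc = doc; mapBuildProfiles(doc); });
listProperties({}, {});
console.log(output);
```

Inside CouchDB the two functions would be stored as anonymous function strings in a design document, and the list would be requested with a URL of the form `/db/_design/ddoc/_list/listname/viewname?include_docs=true`, where `db`, `ddoc`, `listname` and `viewname` are placeholders. Since the view only stores a short key and a null value per doc, the doc bodies are fetched at query time rather than copied into the index.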
