Matthieu, 

Your summary is spot on, as is your explanation of CouchDB's primary
(sweet-spot) use case. Thanks!

So if I understand this correctly (and I'm no Einstein, as this mailing list
proves :-)), your suggestion is to emit the whole of my "maven-build-profile"
doc as a KEY and then use a list function to pull the data out of it. That
way I only have to build the view once and can manipulate the data it presents
with list functions and CouchDB filters?

Assuming I have understood this correctly, does this not mean my view is
going to be absolutely massive?

Thanks 

Mike  

-----Original Message-----
From: Matthieu Rakotojaona [mailto:[email protected]] 
Sent: 24 May 2012 19:04
To: [email protected]
Subject: Re: FW: Am I doing something fundamentally wrong?

On Thu, May 24, 2012 at 8:32 AM, Mike Kimber <[email protected]> wrote:
> The "Build Profile Detail" Map referenced above takes up to 15 hours to build.
> Now once I know what I want that's not necessarily a major issue, but it is
> when I need to discover/explore the data that I need to analyse.

OK, so what you want is a tool to retrieve information dynamically
from a store. I don't think CouchDB is your best bet for this. From
what I have seen, CouchDB is much more oriented towards storing and
accessing static data, which might be derived from some other static
data. It's kind of like a static site generator: you put in some
content (your blog post, your logo, ...) and it will generate (with
map/reduce) static HTML pages that you will serve directly. You can do
some post-processing on them, but that post-processing runs live, at
request time. If you're going to analyse the initial content
dynamically, you'll have to regenerate the pages every time; this is
not the best way to go.
My comparison might seem far-fetched, but I hope you understand my point.

But there is something you can do with CouchDB. Basically what you
want is to analyze the 'maven-build-profile' docs. What you can do in
your map is just `emit("maven-build-profile", null)`. This will give
you what you need to filter the docs.
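
Here is a minimal sketch of such a map function. It assumes your docs are tagged with a `type` field set to "maven-build-profile"; that field name is a guess, so adapt the test to whatever identifies these docs in your database. The `emit` stub and the sample docs are hypothetical and only there so the snippet runs outside CouchDB; in a design document you would keep just the `map` function.

```javascript
// Stand-in for CouchDB's emit(); collects rows so the sketch runs in plain JavaScript.
var rows = [];
var currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// The map function itself.
// ASSUMPTION: docs carry a `type` field; the real docs may be identified differently.
function map(doc) {
  if (doc.type === "maven-build-profile") {
    // Emitting null as the value means the doc body is not copied into the view.
    emit("maven-build-profile", null);
  }
}

// Hypothetical sample documents.
var docs = [
  { _id: "a", type: "maven-build-profile", property1: 1 },
  { _id: "b", type: "something-else" },
  { _id: "c", type: "maven-build-profile", property1: 2 }
];
docs.forEach(function (doc) {
  currentId = doc._id;
  map(doc);
});
```

Since the value is `null`, each row holds only the key and the doc id, so the view itself should stay small; the doc bodies are fetched at query time.
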
The next step is to fiddle with the list function. You just have to
put your processing in this list function:

```
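function(head, req) {
  var row;
  // getRow() returns null once every row has been read
  while (row = getRow()) {
    // row.doc is only present when the query uses include_docs=true
    var doc = row.doc;

    // send() expects a string, so serialize the object and add the newline ourselves
    send(JSON.stringify({"property1": doc.property1, "property2": doc.property2}) + "\n");
  }
}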
```

This simple list function will give you properties 1 and 2 for each of
the docs that are processed. Two words of caution: you will need to
add `include_docs=true` to your query string, so that getRow()
returns each row _with_ its doc. You will also need to emit the
newlines and commas yourself, because send() doesn't add them: it just
writes its argument to the output without further processing.
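
To make the point about newlines and commas concrete, here is a rough sketch of a list function that writes a well-formed JSON array. The `getRow` and `send` stubs and the sample rows are hypothetical stand-ins so the snippet runs outside CouchDB; in a design document you would keep only the body of `list`.

```javascript
// Stand-ins for CouchDB's getRow() and send(), so the sketch runs in plain JavaScript.
var pending = [
  { id: "a", key: "maven-build-profile", value: null,
    doc: { _id: "a", property1: 1, property2: "x" } },
  { id: "c", key: "maven-build-profile", value: null,
    doc: { _id: "c", property1: 2, property2: "y" } }
];
var output = "";
function getRow() { return pending.length ? pending.shift() : null; }
function send(chunk) { output += chunk; }

// The list function: writes a JSON array, one object per line.
function list(head, req) {
  var row;
  var first = true;
  send("[\n");
  while ((row = getRow())) {
    var doc = row.doc; // requires include_docs=true on the request
    // Put the separator before every element except the first one.
    send((first ? "" : ",\n") + JSON.stringify({
      property1: doc.property1,
      property2: doc.property2
    }));
    first = false;
  }
  send("\n]\n");
}

list({}, {});
```

Writing the comma before each element rather than after it avoids a trailing comma, so the output stays parseable even when the view returns no rows.
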

This kind of workflow should be flexible enough to give you some
interesting results, even though it could use a lot of CPU for each
request.

Note: you never need to emit the doc._id; it is always included in
every pair you emit (it is part of the row you get at query time).
If you want to sort by id, emit `null` as the key: the rows will
still be sorted by id by default (yes, CouchDB is awesome =]).
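
As a rough illustration of that ordering, the sketch below takes rows shaped like view rows whose key is `null` and sorts them by key, then by id. The ids are made up, and plain string comparison is only a simplified stand-in for CouchDB's real collation.

```javascript
// Rows as a view might return them when the map does emit(null, null):
// each row still carries the id of the doc that produced it.
var rows = [
  { id: "build-003", key: null, value: null },
  { id: "build-001", key: null, value: null },
  { id: "build-002", key: null, value: null }
];

// Simplified ordering: by key first, then by id when the keys are equal.
// ASSUMPTION: plain string comparison stands in for CouchDB's collation rules.
function compareRows(a, b) {
  var ka = JSON.stringify(a.key);
  var kb = JSON.stringify(b.key);
  if (ka !== kb) { return ka < kb ? -1 : 1; }
  if (a.id === b.id) { return 0; }
  return a.id < b.id ? -1 : 1;
}

var sortedIds = rows.slice().sort(compareRows).map(function (r) { return r.id; });
```
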

-- 
Matthieu RAKOTOJAONA
