Re: filtering on timestamp + aggregation on another field

Kinley Dorji Mon, 14 Mar 2011 21:30:14 -0700

Aroj,

I think you will always have choices on how to implement it, with the
final decision resting on CouchDB efficiencies (choice of what should
be keyed and what should be included in values, as Nils has noted) and
what your reporting needs are. Here is another option:


At the map level: emit(doc.timestamp, {country: doc.country,
city: doc.city, clinic: doc.clinic, beds: doc.beds})

At the reduce level, write your js code to suit your aggregation requirements.
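
As a rough sketch of what such a reduce could look like, assuming the
aggregation you want is a simple total of beds and that each map value
carries a numeric beds field (the name reduceBeds and the sample values
are made up for illustration; in a design document the function would be
anonymous):

```javascript
// Sketch of a CouchDB-style reduce that totals beds.
// Assumption: each map value is an object with a numeric `beds` field.
function reduceBeds(keys, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    // On rereduce, values are totals returned by earlier reduce calls,
    // not the original map values.
    total += rereduce ? values[i] : values[i].beds;
  }
  return total;
}

// Local check outside CouchDB, using made-up sample values.
var sample = [
  { country: "India", city: "Pune", clinic: "A", beds: 4 },
  { country: "India", city: "Mumbai", clinic: "B", beds: 6 }
];
var firstPass = reduceBeds(null, sample, false);
var combined = reduceBeds(null, [firstPass, 5], true);
console.log(firstPass, combined);
```

The rereduce branch matters because CouchDB may call the reduce again on
its own earlier results, so the function has to accept both shapes.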

At the view query level, in addition to the startkey and endkey for the
timestamp, add the parameter group_level=1, 2, 3, etc.

If you select group_level 1 your aggregation would be for country, 2
for city and 3 at the clinic level.
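
One thing to keep in mind is that group_level groups on the leading
elements of an array key. The sketch below is only a local simulation of
that behaviour, not CouchDB code, and it assumes the location fields sit
in the key as [country, city, clinic] with the bed count as the value
(groupByLevel and the sample rows are hypothetical):

```javascript
// Local simulation of how group_level collapses rows; not CouchDB code.
// Assumptions: rows are already sorted by key, keys are arrays of the form
// [country, city, clinic], and the reduce is a plain sum of the values.
function groupByLevel(rows, level) {
  var groups = [];
  rows.forEach(function (row) {
    // Keep only the first `level` elements of the key.
    var prefix = row.key.slice(0, level);
    var last = groups[groups.length - 1];
    if (last && JSON.stringify(last.key) === JSON.stringify(prefix)) {
      last.value += row.value; // same group: accumulate
    } else {
      groups.push({ key: prefix, value: row.value }); // start a new group
    }
  });
  return groups;
}

// Made-up sample rows, in sorted key order.
var rows = [
  { key: ["India", "Mumbai", "Clinic B"], value: 6 },
  { key: ["India", "Pune", "Clinic A"], value: 4 },
  { key: ["India", "Pune", "Clinic C"], value: 3 }
];
var byCountry = groupByLevel(rows, 1);
var byCity = groupByLevel(rows, 2);
console.log(JSON.stringify(byCountry));
console.log(JSON.stringify(byCity));
```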

This option gives you choices at query time, but whether it is
suitable for you again depends on the specifics of your requirements.

Similarly, you are presently representing the timestamp as a date. If
you were to make a compound key for the timestamp, i.e. [year, month,
day, hour], you could use the same view to query by year and month
(which probably will not be useful from the viewpoint of bed
availability) or by hour, which might be useful (assuming updates on
bed availability are dynamic/real time).
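
A minimal sketch of building such a compound key, assuming the timestamp
is stored as an ISO 8601 string in UTC (your documents may use a
different format, and timestampKey is a made-up helper name):

```javascript
// Sketch: turn a timestamp into a [year, month, day, hour] array key.
// Assumption: the timestamp is an ISO 8601 string in UTC.
function timestampKey(isoString) {
  var d = new Date(isoString);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    d.getUTCDate(),
    d.getUTCHours()
  ];
}

var key = timestampKey("2011-03-14T21:30:14Z");
console.log(JSON.stringify(key));
```

Inside a map function this would be used along the lines of
emit(timestampKey(doc.timestamp), doc.beds), and group_level=1, 2, 3 or
4 would then aggregate per year, month, day or hour.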

My two pennies.

On Tue, Mar 15, 2011 at 12:28 AM, Nils Breunese <[email protected]> wrote:
> This looks fine to me. To keep the index storage to a minimum I wouldn't 
> store the doc as the value in the view, but only the absolute minimum you 
> need. Hint: the value can even be null (which doesn't take a lot of space to 
> store!) and you can use ?include_docs=true to retrieve the documents with the 
> view query. We use this for almost all of our views, so storage mostly 
> goes to the documents themselves and the views add little overhead.
>
> Nils.
> ________________________________________
> From: Aroj George [[email protected]]
> Sent: Monday, 14 March 2011 19:13
> To: [email protected]; Kinley Dorji
> Subject: Re: filtering on timestamp + aggregation on another field
>
> Thanks for the below.
>
> Another option we came up with is as below,
>
> map:
> for each level in the location hierarchy:
>     emit([level,timestamp],doc)
>
> which will produce something like the below,
>
> *for given documents:*
> { timestamp : 01/01/2011, location : [India, Maharashtra, Pune], other_attrs }
> { timestamp : 01/02/2011, location : [India, Maharashtra, Mumbai], other_attrs }
>
> *Map output:*
> 1. [India,01/01/2011], doc
> 2. [Maharashtra,01/01/2011], doc
> 3. [Pune,01/01/2011], doc
> 4. [India,01/02/2011], doc
> 5. [Maharashtra,01/02/2011], doc
> 6. [Mumbai,01/02/2011], doc
>
> Now we can have a query like,
> startkey=[India,01/01/2011] & endkey=[India,01/03/2011] & group_level=1
>
> which should give me the documents grouped on India but filtered on
> timestamp.
>
> The question is, is this a good solution? One concern is that the number of
> records in the view is now number_of_levels * num_documents,
> i.e. in this case 2 documents * 3 levels = 6 records in the view.
>
> Will couch performance suffer with this approach?
>
> Rgds,
> Aroj
