Muji, what happens if you have several hundred transactions per day, from a variety of countries, over several years? View processing is then going to be very slow. We are looking for a near real-time solution.
On Tue, Apr 16, 2013 at 5:42 PM, muji <[email protected]> wrote:

> I believe you need to query with startkey and endkey as complex keys
> (assuming YYYY-MM-DD):
>
> startkey=[startyear,startmonth,startday]
> endkey=[endyear,endmonth,endday,{}]
>
> Then you can extract the countries from the key returned with each row (it
> will be the last element in the array). You will also need to set the group
> view parameter (group_level=4?) for distinct values.
>
> Then you should not need to write a custom reduce function.
>
> The startkey and endkey must be proper JSON (and URL) encoded values.
>
> My understanding is that is the correct approach.
>
> Cheers!
>
>
> On 16 April 2013 05:46, Andrey Kuprianov <[email protected]> wrote:
>
> > Nope, I need distinct values over a period of time. Not per day.
> >
> >
> > On Tue, Apr 16, 2013 at 11:30 AM, Keith Gable <[email protected]> wrote:
> >
> > > It gives you distinct countries per day. Is that not what you want? With
> > > reduce, it should be really fast once the view is built.
> > > On Apr 15, 2013 9:05 PM, "Andrey Kuprianov" <[email protected]> wrote:
> > >
> > > > @Keith your method will not give me distinct countries and even with
> > > > reduce and after being fed to list function it's still slow
> > > >
> > > >
> > > > On Tue, Apr 16, 2013 at 2:27 AM, Wendall Cada <[email protected]> wrote:
> > > >
> > > > > I agree with this approach. I do something similar using _sum:
> > > > >
> > > > > emit([doc.country_name, toDay(doc.timestamp)], 1);
> > > > >
> > > > > The toDay() method is basically a floor of the day value. Since I don't
> > > > > store ts in UTC (Because of an idiotic error some years back) I also do a
> > > > > tz offset to correct the day value in my toDay() method.
> > > > >
> > > > > Using reduce is by far the fastest method for this. I don't see any issue
> > > > > with getting this to scale.
> > > > >
> > > > > Overall, I think I rather prefer the method Keith shows, as it would
> > > > > depend on the values returned in the date object versus other possibly
> > > > > inaccurate means using math.
> > > > >
> > > > > Wendall
> > > > >
> > > > >
> > > > > On 04/15/2013 07:18 AM, Keith Gable wrote:
> > > > >
> > > > >> Output keys like so:
> > > > >>
> > > > >> [2010, 7, 10, "Australia"]
> > > > >>
> > > > >> Reduce function would be _count.
> > > > >>
> > > > >> startkey=[year,month,day,null]
> > > > >> endkey=[year,month,day,{}]
> > > > >>
> > > > >> ---
> > > > >> Keith Gable
> > > > >> A+, Network+, and Storage+ Certified Professional
> > > > >> Apple Certified Technical Coordinator
> > > > >> Mobile Application Developer / Web Developer
> > > > >>
> > > > >>
> > > > >> On Sun, Apr 14, 2013 at 8:37 PM, Andrey Kuprianov <[email protected]> wrote:
> > > > >>
> > > > >>> Hi guys,
> > > > >>>
> > > > >>> Just for the sake of a debate. Here's the question. There are
> > > > >>> transactions. Among all other attributes there's timestamp (when
> > > > >>> transaction was made; in seconds) and a country name (from where the
> > > > >>> transaction was made). So, for instance,
> > > > >>>
> > > > >>> {
> > > > >>> . . . .
> > > > >>> "timestamp": 1332806400,
> > > > >>> "country_name": "Australia",
> > > > >>> . . . .
> > > > >>> }
> > > > >>>
> > > > >>> Question is: how does one get unique / distinct country names in between
> > > > >>> dates? For example, give me all country names in between 10-Jul-2010 and
> > > > >>> 21-Jan-2013.
> > > > >>>
> > > > >>> My solution was to write a custom reduce function and set
> > > > >>> reduce_limit=false, so that I can enumerate all countries without hitting
> > > > >>> the overflow exception. It works great! However, such solutions are
> > > > >>> frowned upon by everyone around.
> > > > >>> Has anyone a better idea on how to tackle this
> > > > >>> efficiently?
> > > > >>>
> > > > >>> Andrey
> > > > >>>
> > > > >>>
> > > > >
> >
>
> --
> mischa (aka muji).
>
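
For reference, the approach quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the key layout is the [year, month, day, country] form Keith suggested, the reduce is the built-in _count, and the names dateCountryKey, rangeQuery and distinctCountries are hypothetical helpers, not part of CouchDB. Note that group_level=4 returns one row per day and country, so collapsing the rows to distinct countries over the whole range still has to happen on the client (or in a list function).

```javascript
// Sketch only: helper names are hypothetical, key layout as suggested in the thread.

// Build the view key from a transaction document (timestamp is in seconds, read as UTC).
function dateCountryKey(doc) {
  var d = new Date(doc.timestamp * 1000);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(), doc.country_name];
}

// Body of the CouchDB map function; pair it with the built-in _count reduce.
// In a real design document the helper above would have to be inlined here.
function map(doc) {
  if (doc.timestamp && doc.country_name) {
    emit(dateCountryKey(doc), null);
  }
}

// Query-string parameters for an inclusive date range.
// start and end are [year, month, day]; {} sorts after any country name.
function rangeQuery(start, end) {
  return "group_level=4" +
    "&startkey=" + encodeURIComponent(JSON.stringify(start)) +
    "&endkey=" + encodeURIComponent(JSON.stringify(end.concat([{}])));
}

// Collapse the grouped rows (one per day and country) into distinct country names.
function distinctCountries(rows) {
  var seen = {};
  rows.forEach(function (row) {
    seen[row.key[3]] = true;
  });
  return Object.keys(seen).sort();
}
```

For the example range in the original question, rangeQuery([2010, 7, 10], [2013, 1, 21]) yields the parameters to append to the view URL, and distinctCountries(response.rows) reduces the result to the country list. The number of rows returned still grows with days times countries, which is the scaling concern raised at the top of this message.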
