On Mar 8, 2010, at 12:28 AM, Gregory Tappero wrote:

> Thanks,
>
> I got the wanted result with
> http://friendpaste.com/6sYxT4cNJ9IjpWiW9qgCut
>
> benoitc came to my rescue.
>
This will be a problem with large databases. When the number of unique users
is large, the group=false query would return a very large object with all the
usernames in it. Except it won't, because it will raise a
reduce_overflow_error.

Your problem is interesting. You might learn from reading this paper:

http://labs.google.com/papers/sawzall.html

It gives a survey of the available algorithms which can work in constant
space even over large databases.

Chris

> Greg
>
>
> On Mon, Mar 8, 2010 at 9:07 AM, Paweł Stawicki wrote:
>> Hmm... I'm just thinking now, don't know if it works, but maybe try
>> something like this:
>> If you can get number of documents per day per username, first try to make
>> this number always one if keys is [date, username]:
>> Reduce:
>> if (keys.length == 2) {
>>   return 1;
>> } else if (keys.length == 1) { //date only, return number of usernames
>>   return values.length;
>> }
>>
>> The risk is that some usernames will count twice, but maybe try it.
>>
>> Best regards
>> --
>> Paweł Stawicki
>> http://pawelstawicki.blogspot.com
>> http://szczecin.jug.pl
>>
>>
>> On Mon, Mar 8, 2010 at 08:03, Gregory Tappero wrote:
>>
>>> My number of keys is 4 (year, month, day, username), so returning the
>>> number of keys in reduce does not seem to give me the output I am
>>> looking for. Unless I misunderstood something.
>>>
>>> Thank you for helping,
>>>
>>> Greg
>>>
>>> On Mon, Mar 8, 2010 at 12:28 AM, Randall Leeds wrote:
>>>> I'm not an expert on this, but I think you need to create your own
>>>> reduce function and output the number of keys rather than the sum of
>>>> the values.
>>>>
>>>> On Sun, Mar 7, 2010 at 15:15, Gregory Tappero wrote:
>>>>> Thank you Pawel,
>>>>>
>>>>> If I try to follow your way it gives me the count of docs in a given
>>>>> day for each username; what I would like is the count of unique
>>>>> usernames for a given day.
>>>>>
>>>>> function(doc) {
>>>>>   if (doc.doc_type == "EdoPing" && doc.em_type == 0) {
>>>>>     var date = new Date().setRFC3339(doc.created_at);
>>>>>     emit([date.getFullYear(), parseInt(date.getMonth()) + 1,
>>>>>           date.getDate(), doc.em_uname], 1);
>>>>>   }
>>>>> }
>>>>>
>>>>> Reduce:
>>>>> _count
>>>>>
>>>>> =================
>>>>> I get:
>>>>>
>>>>> [2010, 3, 3, "student1"] 5
>>>>> [2010, 3, 4, "student1"] 18
>>>>> [2010, 3, 5, "eong"] 77
>>>>> [2010, 3, 6, "bkante"] 71
>>>>> [2010, 3, 6, "jfrancillette"] 72
>>>>> [2010, 3, 6, "mlouviers"] 12
>>>>> [2010, 3, 7, "student1"] 4
>>>>>
>>>>> I would like to extract the following:
>>>>>
>>>>> [2010, 3, 3] 1
>>>>> [2010, 3, 4] 1
>>>>> [2010, 3, 5] 1
>>>>> [2010, 3, 6] 3
>>>>> [2010, 3, 7] 1
>>>>>
>>>>>
>>>>> If I do a group_level=3 it sums the values.
>>>>>
>>>>> {"key":[2010,3,3],"value":5},
>>>>> {"key":[2010,3,4],"value":18},
>>>>> {"key":[2010,3,5],"value":77},
>>>>> {"key":[2010,3,6],"value":155},
>>>>> {"key":[2010,3,7],"value":4}
>>>>>
>>>>> How can I count the unique usernames emitted per day?
>>>>>
>>>>>
>>>>> On Sun, Mar 7, 2010 at 10:02 PM, Paweł Stawicki wrote:
>>>>>> Just emit all documents with em_type = 0 in map function, with [date,
>>>>>> em_uname] as key. Then count in reduce.
>>>>>>
>>>>>> Map:
>>>>>> function(doc) {
>>>>>>   if (doc.em_type == 0) {
>>>>>>     //If you only want to count, you can emit anything (e.g. 1) instead of
>>>>>>     //doc here.
>>>>>>     emit([date, doc.em_uname], doc); //date: the day taken from doc.created_at
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Reduce:
>>>>>> function(keys, values, rereduce) {
>>>>>>   if (!rereduce) {
>>>>>>     return values.length; //count of values
>>>>>>   } else {
>>>>>>     return sum(values); //sum of values, using the view server's sum()
>>>>>>   }
>>>>>>
>>>>>>   //If you emit 1 instead of doc, then the count of values ==
>>>>>>   //the sum of values
>>>>>> }
>>>>>>
>>>>>> Then you can handle everything by grouping:
>>>>>> http://yourserver:5984/yourdb/_view/yourview?group_level=2
>>>>>> or group=true
>>>>>>
>>>>>> Regards
>>>>>> --
>>>>>> Paweł Stawicki
>>>>>> http://pawelstawicki.blogspot.com
>>>>>> http://szczecin.jug.pl
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 6, 2010 at 16:26, Gregory Tappero wrote:
>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> I have the following EdoPing type of documents:
>>>>>>>
>>>>>>> {
>>>>>>>   "_id": "22add509c1e7bc286832edc5bfe99ce5",
>>>>>>>   "_rev": "1-49663ab8778f445e481143120d0d7086",
>>>>>>>   "doc_type": "EdoPing",
>>>>>>>   "em_uname": "student1",
>>>>>>>   "em_gid": 1,
>>>>>>>   "created_at": "2010-03-03T14:18:19Z",
>>>>>>>   "em_ip": "92.154.70.148",
>>>>>>>   "em_type": 0,
>>>>>>>   "room_url": "z2fudcvcrfa3reaydatre",
>>>>>>>   "room_users": [
>>>>>>>     "tutorsbox"
>>>>>>>   ]
>>>>>>> }
>>>>>>>
>>>>>>> I would like to count all unique em_uname of em_type 0 on a given date.
>>>>>>>
>>>>>>> For now I used this map/reduce:
>>>>>>> http://friendpaste.com/5xUUQ26bbl9d5KRB8eojwe
>>>>>>>
>>>>>>> Date.prototype.setRFC3339 = function(dString){
>>>>>>>   var regexp = /(\d\d\d\d)(-)?(\d\d)(-)?(\d\d)(T)?(\d\d)(:)?(\d\d)(:)?(\d\d)(\.\d+)?(Z|([+-])(\d\d)(:)?(\d\d))/;
>>>>>>>
>>>>>>>   if (dString.toString().match(new RegExp(regexp))) {
>>>>>>>     var d = dString.match(new RegExp(regexp));
>>>>>>>     var offset = 0;
>>>>>>>
>>>>>>>     this.setUTCDate(1);
>>>>>>>     this.setUTCFullYear(parseInt(d[1],10));
>>>>>>>     this.setUTCMonth(parseInt(d[3],10) - 1);
>>>>>>>     this.setUTCDate(parseInt(d[5],10));
>>>>>>>     this.setUTCHours(parseInt(d[7],10));
>>>>>>>     this.setUTCMinutes(parseInt(d[9],10));
>>>>>>>     this.setUTCSeconds(parseInt(d[11],10));
>>>>>>>     if (d[12])
>>>>>>>       this.setUTCMilliseconds(parseFloat(d[12]) * 1000);
>>>>>>>     else
>>>>>>>       this.setUTCMilliseconds(0);
>>>>>>>     if (d[13] != 'Z') {
>>>>>>>       offset = (d[15] * 60) + parseInt(d[17],10);
>>>>>>>       offset *= ((d[14] == '-') ? -1 : 1);
>>>>>>>       this.setTime(this.getTime() - offset * 60 * 1000);
>>>>>>>     }
>>>>>>>   } else {
>>>>>>>     this.setTime(Date.parse(dString));
>>>>>>>   }
>>>>>>>   return this;
>>>>>>> };
>>>>>>>
>>>>>>> var seenKeys = new Array();
>>>>>>>
>>>>>>> function(doc) {
>>>>>>>   if (doc.doc_type=="EdoPing" && doc.em_type==0) {
>>>>>>>     date = new Date().setRFC3339(doc.created_at);
>>>>>>>     var key = doc.em_uname + String(doc.created_at).substring(0,10);
>>>>>>>     if (seenKeys[key] == undefined ) {
>>>>>>>       seenKeys[key] = 1;
>>>>>>>       emit([date.getFullYear(), parseInt(date.getMonth())+1,
>>>>>>>             date.getDate() ] , 1);
>>>>>>>     }
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> It works when saved for the first time, but as soon as new EdoPings
>>>>>>> get added it starts emitting rows it has already seen (same key),
>>>>>>> creating faulty count results.
>>>>>>>
>>>>>>> Is it OK to have seenKeys outside of the doc function()?
>>>>>>> What other way could I use to get the same results?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Greg
>>>>>>>
>>>>>
>>>>> --
>>>>> Greg Tappero
>>>>> CTO co founder Edoboard
>>>>> http://www.edoboard.com
>>>>> +33 0645764425
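
A footnote to the question in this thread: with the view keyed on [year, month, day, em_uname] and the _count reduce, a group=true query returns one row per distinct user per day, so the number of unique users on a day is simply the number of rows sharing that [year, month, day] prefix. The sketch below does that counting on the client. It is only an illustration under those assumptions: uniqueUsersPerDay is a hypothetical helper name (not part of CouchDB), the sample rows are copied from the output quoted above, and this is not necessarily what the friendpaste solution does.

```javascript
// Hypothetical client-side helper (not a CouchDB API): given the rows of a
// group=true query whose keys are [year, month, day, username], return one
// {key: [year, month, day], value: uniqueUserCount} entry per day.
function uniqueUsersPerDay(rows) {
  var byDay = {};   // "y-m-d" -> result entry for that day
  var order = [];   // remembers the order in which days first appear
  rows.forEach(function (row) {
    var day = row.key.slice(0, 3);
    var id = day.join("-");
    if (!Object.prototype.hasOwnProperty.call(byDay, id)) {
      byDay[id] = { key: day, value: 0 };
      order.push(id);
    }
    // Each grouped row stands for exactly one distinct username on that day;
    // row.value (the number of pings) is deliberately ignored.
    byDay[id].value += 1;
  });
  return order.map(function (id) { return byDay[id]; });
}

// Sample rows taken from the grouped output quoted in the thread.
var rows = [
  { key: [2010, 3, 3, "student1"], value: 5 },
  { key: [2010, 3, 4, "student1"], value: 18 },
  { key: [2010, 3, 5, "eong"], value: 77 },
  { key: [2010, 3, 6, "bkante"], value: 71 },
  { key: [2010, 3, 6, "jfrancillette"], value: 72 },
  { key: [2010, 3, 6, "mlouviers"], value: 12 },
  { key: [2010, 3, 7, "student1"], value: 4 }
];

var perDay = uniqueUsersPerDay(rows);
perDay.forEach(function (entry) {
  console.log(JSON.stringify(entry.key) + " " + entry.value);
});
```

Because the counting happens on the client, the response still grows with the number of (day, user) pairs, so this does not address the constant-space concern raised in the reply at the top of the thread.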
