It wouldn't consume too much space as long as you're regularly compacting your database.
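For reference, compaction is triggered with a POST to the database's `_compact` resource, and it discards the bodies of old revisions. Here is a minimal sketch in Python; the base URL and the database name `analytics` are assumptions for illustration, and a secured server may also need admin credentials:

```python
import json
import urllib.request


def build_compact_request(base_url: str, db: str) -> urllib.request.Request:
    """Build the POST request that asks CouchDB to compact one database."""
    url = f"{base_url.rstrip('/')}/{db}/_compact"
    return urllib.request.Request(
        url,
        data=b"",
        # CouchDB expects a JSON content type on this endpoint.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compact(base_url: str, db: str) -> dict:
    """Send the compaction request and return the decoded JSON reply."""
    with urllib.request.urlopen(build_compact_request(base_url, db)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical server and database; run this from cron, e.g. nightly.
    print(compact("http://localhost:5984", "analytics"))
```

Compaction runs in the background on the server, so the call returns right away.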
As much as I love CouchDB, and this is a CouchDB users mailing list, I tried to do something similar and found MongoDB better suited because of its support for partial updates. I based the project on some of the work from http://hummingbirdstats.com/.

--Scott

On Mon, Sep 12, 2011 at 8:34 AM, [email protected] <[email protected]> wrote:

> Hi everyone,
>
> Considering that I've bypassed the problem of cross-domain communication
> using proxy/iframes...
>
> I want to store counters in a document, incremented on each page view.
> CouchDB will create a complete revision of this document for just 1 counter
> update.
>
> Wouldn't this consume too much space?
> Considering that I have 1M hits in a day, I might be looking at 1M revisions
> to the document in a day.
>
> Any thoughts on this...
>
> Thanks!
> --
> Mayank
> http://adomado.com
>
> On Fri, Jun 3, 2011 at 12:45 PM, Stefan Matheis <[email protected]> wrote:
>
> > What about proxying couch.foo.com through foo.com/couch? maybe not the
> > complete service, at least one "special" url which triggers the write
> > on couch?
> >
> > Regards
> > Stefan
> >
> > On Fri, Jun 3, 2011 at 8:56 AM, [email protected] <[email protected]> wrote:
> > > Hi everyone,
> > >
> > > I think I had a fundamental flaw in my assumption - realized this
> > > yesterday...
> > > If the couchdb analytics server is hosted on couch.foo.com (foo.com being
> > > the main site) - I would never be able to make write requests via client
> > > side javascript as cross-domain policy would be a barrier.
> > >
> > > I thought about this - and came across a potential solution...
> > > What if, I host an html page as an attachment in couchdb & whenever I have
> > > to make a write call, include this html in an iframe & pass on the
> > > parameters in the query string of iframe URL.
> > > The iframe will have javascript which understands the incoming query string
> > > params & takes action (creates POST/PUT to couchdb).
> > >
> > > There would be no cross-domain barriers as the html page is being served
> > > right out of couchdb itself - where ever its hosted (couch.foo.com)
> > >
> > > This might not be a performance hit - as etags will help in client-side
> > > caching of the html page.
> > > --
> > > Mayank
> > > http://adomado.com
> > >
> > > On Thu, Jun 2, 2011 at 8:34 PM, [email protected] <[email protected]> wrote:
> > >
> > >> Its 700 req/min :)
> > >> --
> > >> Mayank
> > >> http://adomado.com
> > >>
> > >> On Thu, Jun 2, 2011 at 7:10 PM, Jan Lehnardt <[email protected]> wrote:
> > >>
> > >>> On 2 Jun 2011, at 13:28, [email protected] wrote:
> > >>>
> > >>> > Forgot to mention...
> > >>> > All of these 700 req/sec are write requests (data logging) & no data
> > >>> > crunching.
> > >>> >> Our current inhouse analytics solution (built on Rails, Mysql) gets
> > >>> >> about 700 req/min on an average day...
> > >>>
> > >>> min or sec? :)
> > >>>
> > >>> Cheers
> > >>> Jan
> > >>> --
> > >>>
> > >>> >> --
> > >>> >> Mayank
> > >>> >> http://adomado.com
> > >>> >>
> > >>> >> On Thu, Jun 2, 2011 at 3:16 PM, Gabor Ratky <[email protected]> wrote:
> > >>> >>> Take a look at update handlers [1]. It is a more lightweight way to
> > >>> >>> create / update your visitor documents, without having to GET the
> > >>> >>> document, modify and PUT back the whole thing. It also simplifies
> > >>> >>> dealing with document revisions as my understanding is that you
> > >>> >>> should not be running into conflicts.
> > >>> >>>
> > >>> >>> I wouldn't expect any problem handling the concurrent traffic and
> > >>> >>> tracking the users, but the view indexer will take some time with the
> > >>> >>> processing itself. You can always replicate the database (or parts of
> > >>> >>> it using a replication filter) to another CouchDB instance and
> > >>> >>> perform the crunching there.
> > >>> >>>
> > >>> >>> It's fairly vague how much updates / writes your 2k-5k traffic would
> > >>> >>> cause. How many requests/sec on your site? How many property updates
> > >>> >>> that causes?
> > >>> >>>
> > >>> >>> Btw, CouchDB users, is there any way to perform bulk updates using
> > >>> >>> update handlers, similar to _bulk_docs?
> > >>> >>>
> > >>> >>> Gabor
> > >>> >>>
> > >>> >>> [1] http://wiki.apache.org/couchdb/Document_Update_Handlers
> > >>> >>>
> > >>> >>> On Thursday, June 2, 2011 at 11:34 AM, [email protected] wrote:
> > >>> >>>
> > >>> >>>> Hi everyone,
> > >>> >>>>
> > >>> >>>> I came across couchdb a couple of weeks back & got really excited by
> > >>> >>>> the fundamental change it brings by simply taking the app-server out
> > >>> >>>> of the picture.
> > >>> >>>> Must say, kudos to the dev team!
> > >>> >>>>
> > >>> >>>> I am planning to write a quick analytics solution for my website -
> > >>> >>>> something on the lines of Google analytics - which will measure
> > >>> >>>> certain properties of the visitors hitting our site.
> > >>> >>>>
> > >>> >>>> Since this is my first attempt at a JSON style document store, I
> > >>> >>>> thought I'll share the architecture & see if I can make it better (or
> > >>> >>>> correct my mistakes before I do them) :-)
> > >>> >>>>
> > >>> >>>> - For each unique visitor, create a document with his session_id as
> > >>> >>>>   the doc.id
> > >>> >>>> - For each property i need to track about this visitor, I create a
> > >>> >>>>   key-value pair in the doc created for this visitor
> > >>> >>>> - If visitor is a returning user, use the session_id to re-open his
> > >>> >>>>   doc & keep on modifying the properties
> > >>> >>>> - At end of each calculation time period (say 1 hour or 24 hours), I
> > >>> >>>>   run a cron job which fires the map-reduce jobs by requesting the
> > >>> >>>>   views over curl/http.
> > >>> >>>>
> > >>> >>>> A couple of questions based on above architecture...
> > >>> >>>> We see concurrent traffic ranging from 2k users to 5k users.
> > >>> >>>> - Would a couchdb instance running on a good machine (say High CPU
> > >>> >>>>   EC2, medium instance) work well with simultaneous writes happening...
> > >>> >>>>   (visitors browsing, properties changing or getting created)
> > >>> >>>> - With a couple of million documents, would I be able to process my
> > >>> >>>>   views without causing any significant impact to write performance?
> > >>> >>>>
> > >>> >>>> I think my questions might be biased by the fact that I come from a
> > >>> >>>> MySQL/Rails background... :-)
> > >>> >>>>
> > >>> >>>> Let me know how you guys think about this.
> > >>> >>>>
> > >>> >>>> Thanks in advance,
> > >>> >>>> --
> > >>> >>>> Mayank
> > >>> >>>> http://adomado.com
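
The update-handler suggestion quoted above can be applied to the page-view counter roughly as follows. This is only a sketch under assumptions: the design document name `stats`, the handler name `increment`, the `count` field, and the database name are all made up for illustration. The handler itself is JavaScript stored in the design document; the Python side only builds the documents and requests.

```python
import urllib.parse
import urllib.request

# JavaScript update function, kept as a string inside the design document.
# It creates the counter document on first use and bumps it otherwise.
INCREMENT_JS = """
function (doc, req) {
  if (!doc) {
    doc = { _id: req.id, count: 0 };
  }
  doc.count = (doc.count || 0) + 1;
  return [doc, String(doc.count)];
}
""".strip()


def design_doc() -> dict:
    """Design document (hypothetical name 'stats') holding the handler."""
    return {"_id": "_design/stats", "updates": {"increment": INCREMENT_JS}}


def increment_request(base_url: str, db: str, doc_id: str) -> urllib.request.Request:
    """Build the PUT that runs the handler against one counter document."""
    quoted = urllib.parse.quote(doc_id, safe="")
    url = f"{base_url.rstrip('/')}/{db}/_design/stats/_update/increment/{quoted}"
    return urllib.request.Request(url, data=b"", method="PUT")


if __name__ == "__main__":
    # Assumes the design document has already been saved to the database.
    req = increment_request("http://localhost:5984", "analytics", "home page")
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())
```

Note that the handler only saves the GET/PUT round trip. Each increment still writes a new revision of the whole document, so the compaction advice at the top of this message still applies.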
