Forgot to mention... All of these 700 req/sec are write requests (data
logging) & no data crunching.

--
Mayank
http://adomado.com
On Thu, Jun 2, 2011 at 4:57 PM, [email protected] <[email protected]> wrote:
> Hi Gabor,
>
> Thanks for pointing towards the update handlers. Will have a look at them.
>
> About traffic...
> Our current in-house analytics solution (built on Rails, MySQL) gets
> about 700 req/min on an average day...
>
> --
> Mayank
> http://adomado.com
>
> On Thu, Jun 2, 2011 at 3:16 PM, Gabor Ratky <[email protected]> wrote:
>> Take a look at update handlers [1]. They are a more lightweight way to
>> create / update your visitor documents, without having to GET the
>> document, modify it and PUT back the whole thing. It also simplifies
>> dealing with document revisions, as my understanding is that you should
>> not run into conflicts.
>>
>> I wouldn't expect any problem handling the concurrent traffic and
>> tracking the users, but the view indexer will take some time with the
>> processing itself. You can always replicate the database (or parts of
>> it, using a replication filter) to another CouchDB instance and perform
>> the crunching there.
>>
>> It's fairly vague how many updates / writes your 2k-5k traffic would
>> cause. How many requests/sec does your site see? How many property
>> updates does that cause?
>>
>> Btw, CouchDB users, is there any way to perform bulk updates using
>> update handlers, similar to _bulk_docs?
>>
>> Gabor
>>
>> [1] http://wiki.apache.org/couchdb/Document_Update_Handlers
>>
>> On Thursday, June 2, 2011 at 11:34 AM, [email protected] wrote:
>>
>>> Hi everyone,
>>>
>>> I came across CouchDB a couple of weeks back & got really excited by
>>> the fundamental change it brings by simply taking the app server out
>>> of the picture.
>>> Must say, kudos to the dev team!
>>>
>>> I am planning to write a quick analytics solution for my website -
>>> something along the lines of Google Analytics - which will measure
>>> certain properties of the visitors hitting our site.
>>>
>>> Since this is my first attempt at a JSON-style document store, I
>>> thought I'd share the architecture & see if I can make it better (or
>>> correct my mistakes before I make them) :-)
>>>
>>> - For each unique visitor, create a document with his session_id as
>>> the doc.id
>>> - For each property I need to track about this visitor, create a
>>> key-value pair in the doc created for this visitor
>>> - If the visitor is a returning user, use the session_id to re-open
>>> his doc & keep modifying the properties
>>> - At the end of each calculation time period (say 1 hour or 24
>>> hours), run a cron job which fires the map-reduce jobs by requesting
>>> the views over curl/http
>>>
>>> A couple of questions based on the above architecture...
>>> We see concurrent traffic ranging from 2k users to 5k users.
>>> - Would a CouchDB instance running on a good machine (say a High-CPU
>>> EC2 medium instance) work well with simultaneous writes happening
>>> (visitors browsing, properties changing or getting created)?
>>> - With a couple of million documents, would I be able to process my
>>> views without causing any significant impact to write performance?
>>>
>>> I think my questions might be biased by the fact that I come from a
>>> MySQL/Rails background... :-)
>>>
>>> Let me know what you guys think about this.
>>>
>>> Thanks in advance,
>>> --
>>> Mayank
>>> http://adomado.com
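The update-handler approach Gabor points to [1] could look roughly like the sketch below for this visitor-tracking use case. The design doc name (`_design/tracking`), handler name (`visitor`), and fields are hypothetical, not from the thread; CouchDB calls the function as `handler(doc, req)` and expects a `[docToSave, responseBody]` pair back.

```javascript
// Sketch of a CouchDB update handler; would live in a design doc
// (e.g. _design/tracking, under "updates"). Names are hypothetical.
var visitor = function (doc, req) {
  // req.body carries the visitor properties to merge, sent as JSON
  var props = JSON.parse(req.body);
  if (!doc) {
    // First hit from this session: create the document, using the
    // session_id from the request URL as its _id
    doc = { _id: req.id };
  }
  for (var key in props) {
    doc[key] = props[key]; // create / overwrite tracked properties
  }
  // CouchDB expects [document to save, HTTP response body]
  return [doc, JSON.stringify({ ok: true })];
};
```

Each tracking hit then becomes a single request like `POST /analytics/_design/tracking/_update/visitor/<session_id>` — no GET-modify-PUT round trip, which is exactly the saving Gabor describes.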

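For the hourly cron step Mayank describes, the map/reduce side could be as small as one view that emits a row per tracked property, paired with CouchDB's built-in `_count` reduce. A sketch, with hypothetical field names; `emit` is normally provided by CouchDB and is stubbed here only so the snippet runs standalone:

```javascript
// CouchDB provides emit() inside the view server; stub it here
// so the sketch is self-contained and runnable.
var rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Map function for a view, e.g. _design/tracking/_view/by_property
var map = function (doc) {
  for (var key in doc) {
    if (key.charAt(0) !== "_") { // skip _id / _rev metadata
      emit([key, doc[key]], 1);  // one row per (property, value) pair
    }
  }
};
```

The cron job would then request something like `curl 'http://localhost:5984/analytics/_design/tracking/_view/by_property?group=true'`; with a `_count` reduce, each grouped row comes back as a `[property, value]` key and the number of visitor documents carrying it.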