On 2 Jun 2011, at 13:28, [email protected] wrote:

> Forgot to mention...
> All of these 700 req/sec are write requests (data logging) & no data
> crunching.
> Our current in-house analytics solution (built on Rails, MySQL) gets
> about 700 req/min on an average day...
min or sec? :)

Cheers
Jan

>> --
>> Mayank
>> http://adomado.com
>>
>> On Thu, Jun 2, 2011 at 3:16 PM, Gabor Ratky <[email protected]> wrote:
>>> Take a look at update handlers [1]. They are a more lightweight way to
>>> create / update your visitor documents, without having to GET the
>>> document, modify it and PUT back the whole thing. It also simplifies
>>> dealing with document revisions, as my understanding is that you
>>> should not run into conflicts.
>>>
>>> I wouldn't expect any problem handling the concurrent traffic and
>>> tracking the users, but the view indexer will take some time with the
>>> processing itself. You can always replicate the database (or parts of
>>> it, using a replication filter) to another CouchDB instance and
>>> perform the crunching there.
>>>
>>> It's fairly vague how many updates / writes your 2k-5k concurrent
>>> users would cause. How many requests/sec does your site see? How many
>>> property updates does that cause?
>>>
>>> Btw, CouchDB users, is there any way to perform bulk updates using
>>> update handlers, similar to _bulk_docs?
>>>
>>> Gabor
>>>
>>> [1] http://wiki.apache.org/couchdb/Document_Update_Handlers
>>>
>>> On Thursday, June 2, 2011 at 11:34 AM, [email protected] wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I came across CouchDB a couple of weeks back & got really excited by
>>>> the fundamental change it brings by simply taking the app server out
>>>> of the picture.
>>>> Must say, kudos to the dev team!
>>>>
>>>> I am planning to write a quick analytics solution for my website -
>>>> something along the lines of Google Analytics - which will measure
>>>> certain properties of the visitors hitting our site.
>>>>
>>>> Since this is my first attempt at a JSON-style document store, I
>>>> thought I'd share the architecture & see if I can make it better (or
>>>> correct my mistakes before I make them) :-)
>>>>
>>>> - For each unique visitor, create a document with his session_id as
>>>>   the doc.id
>>>> - For each property I need to track about this visitor, I create a
>>>>   key-value pair in the doc created for this visitor
>>>> - If the visitor is a returning user, use the session_id to re-open
>>>>   his doc & keep modifying the properties
>>>> - At the end of each calculation period (say 1 hour or 24 hours), I
>>>>   run a cron job which fires the map-reduce jobs by requesting the
>>>>   views over curl/HTTP.
>>>>
>>>> A couple of questions based on the above architecture...
>>>> We see concurrent traffic ranging from 2k users to 5k users.
>>>> - Would a CouchDB instance running on a good machine (say a High-CPU
>>>>   EC2 medium instance) work well with simultaneous writes happening...
>>>>   (visitors browsing, properties changing or getting created)
>>>> - With a couple of million documents, would I be able to process my
>>>>   views without causing any significant impact on write performance?
>>>>
>>>> I think my questions might be biased by the fact that I come from a
>>>> MySQL/Rails background... :-)
>>>>
>>>> Let me know what you guys think about this.
>>>>
>>>> Thanks in advance,
>>>> --
>>>> Mayank
>>>> http://adomado.com
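[Editor's note] Gabor's suggestion of update handlers can be made concrete with a minimal sketch. An update handler is a function in a design document that receives the existing document (or `null`) plus the request, and returns the document to save along with an HTTP response. The handler name `track` and the form-field scheme below are assumptions for illustration, not something from the thread:

```javascript
// Sketch of a CouchDB document update handler (see the wiki link [1]).
// It would live in a design doc under "updates" and be invoked as:
//   POST /db/_design/analytics/_update/track/<session_id>
// The name "track" and the form-encoded property fields are invented
// for this example.
function track(doc, req) {
  // req.id is the document id from the URL; req.form holds the
  // URL-encoded request body as key-value pairs.
  if (!doc) {
    // First visit: create a fresh document keyed by the session id.
    doc = { _id: req.id, created_at: new Date().toISOString() };
  }
  // Merge every submitted property into the visitor document, so the
  // client never has to GET/modify/PUT the whole thing.
  var form = req.form || {};
  for (var key in form) {
    doc[key] = form[key];
  }
  // Return the doc to be saved plus the HTTP response body.
  return [doc, "ok\n"];
}
```

Because the server fetches the current revision itself, the client avoids the read-modify-write round trip Gabor describes.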

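[Editor's note] The hourly cron step Mayank describes would query a map/reduce view. Here is a sketch of one, assuming each visitor document carries a `browser` property (an invented example field). In CouchDB the map function sits in a design doc under "views" and the reduce would normally be the built-in `_sum`; `emit` is stubbed out here so the logic can be exercised standalone:

```javascript
// Stand-in for CouchDB's emit(): collect rows so the map function can
// be run outside the database for this sketch.
var rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Map: one row per visitor document, keyed by the property we want to
// aggregate. The "browser" field is a made-up example property.
function map(doc) {
  if (doc.browser) {
    emit(doc.browser, 1);
  }
}

// Reduce: equivalent of CouchDB's built-in "_sum", boiling the emitted
// 1s down to a count per key.
function reduce(keys, values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}
```

The cron job would then hit something like `GET /db/_design/analytics/_view/by_browser?group=true` (the design-doc and view names here are hypothetical) to get visitor counts per browser without touching the write path.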