On 2 Jun 2011, at 13:28, [email protected] wrote:

> Forgot to mention...
> All of these 700 req/sec are write requests (data logging) & no data 
> crunching.
> Our current in-house analytics solution (built on Rails, MySQL) gets
> about 700 req/min on an average day...

min or sec? :)

Cheers
Jan
-- 


>> 
>> --
>> Mayank
>> http://adomado.com
>> 
>> 
>> 
>> 
>> On Thu, Jun 2, 2011 at 3:16 PM, Gabor Ratky <[email protected]> wrote:
>>> Take a look at update handlers [1]. They are a more lightweight way to 
>>> create or update your visitor documents without having to GET the document, 
>>> modify it, and PUT the whole thing back. They also simplify dealing with 
>>> document revisions, as my understanding is that you should not run into 
>>> conflicts.
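As a concrete illustration of the update-handler approach, here is a minimal sketch (the handler name `track`, the field names, and the request shape are assumptions for illustration, not from the thread). In CouchDB the function is stored as a string under `updates` in a design document; written out as plain JavaScript it looks like:

```javascript
// Hypothetical update handler "track": merges the POSTed JSON properties
// into the visitor document, creating the document if it does not exist.
// CouchDB would invoke it via PUT /db/_design/stats/_update/track/<session_id>.
var updates = {
  track: function (doc, req) {
    var props = JSON.parse(req.body); // properties sent by the tracker
    if (!doc) {
      // No document yet for this session_id: create one.
      doc = { _id: req.id };
    }
    for (var key in props) {
      doc[key] = props[key]; // set or overwrite each tracked property
    }
    // Update handlers return [docToSave, responseBody].
    return [doc, 'ok'];
  }
};
```

The first element of the returned pair is what CouchDB writes to the database; the second becomes the HTTP response body, so the tracking client never needs to fetch the document or know its current `_rev`.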
>>> 
>>> I wouldn't expect any problems handling the concurrent traffic and tracking 
>>> the users, but the view indexer will take some time with the processing 
>>> itself. You can always replicate the database (or parts of it, using a 
>>> replication filter) to another CouchDB instance and perform the crunching 
>>> there.
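The replicate-and-crunch setup can be sketched as a filtered replication request (the database names, target host, and filter name below are made up for illustration); POSTing this JSON body to `/_replicate` on the source instance copies only the documents the filter passes:

```javascript
// Hypothetical filtered-replication request body. The filter refers to a
// function stored under "filters" in the _design/stats design document
// of the source database.
var replicationRequest = {
  source: 'analytics',
  target: 'http://crunch-host:5984/analytics_crunch', // second instance (assumed host)
  filter: 'stats/visitors_only', // replicate only visitor docs
  continuous: true               // keep the crunching copy up to date
};
```

The second instance then builds its views without competing with the write traffic on the primary.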
>>> 
>>> It's fairly vague how many updates / writes your 2k-5k concurrent traffic 
>>> would cause. How many requests/sec does your site see? How many property 
>>> updates does that cause?
>>> 
>>> Btw, CouchDB users, is there any way to perform bulk updates using update 
>>> handlers, similar to _bulk_docs?
>>> 
>>> Gabor
>>> 
>>> [1] http://wiki.apache.org/couchdb/Document_Update_Handlers
>>> 
>>> On Thursday, June 2, 2011 at 11:34 AM, [email protected] wrote:
>>> 
>>>> Hi everyone,
>>>> 
>>>> I came across CouchDB a couple of weeks back & got really excited by
>>>> the fundamental change it brings by simply taking the app server out
>>>> of the picture.
>>>> I must say, kudos to the dev team!
>>>> 
>>>> I am planning to write a quick analytics solution for my website -
>>>> something on the lines of Google analytics - which will measure
>>>> certain properties of the visitors hitting our site.
>>>> 
>>>> Since this is my first attempt at a JSON-style document store, I
>>>> thought I'd share the architecture & see if I can make it better (or
>>>> correct my mistakes before I make them) :-)
>>>> 
>>>> - For each unique visitor, create a document with his session_id as the 
>>>> doc._id
>>>> - For each property I need to track about this visitor, create a
>>>> key-value pair in the doc created for this visitor
>>>> - If the visitor is a returning user, use the session_id to re-open his
>>>> doc & keep modifying the properties
>>>> - At the end of each calculation time period (say 1 hour or 24 hours),
>>>> run a cron job which fires the map-reduce jobs by requesting the views
>>>> over curl/HTTP.
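A view for that hourly cron step might look like the following sketch (the view name `by_browser` and the `browser` field are assumptions); `emit` and `sum` are built-ins CouchDB provides to view functions:

```javascript
// Hypothetical view "by_browser" in _design/stats: counts visitor
// documents per browser. The cron job would query it with
// GET /analytics/_design/stats/_view/by_browser?group=true
var byBrowser = {
  map: function (doc) {
    if (doc.browser) {
      emit(doc.browser, 1); // one row per visitor document
    }
  },
  reduce: function (keys, values, rereduce) {
    return sum(values); // total visitors per browser key
  }
};
```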
>>>> 
>>>> A couple of questions based on above architecture...
>>>> We see concurrent traffic ranging from 2k users to 5k users.
>>>> - Would a CouchDB instance running on a good machine (say a High-CPU
>>>> EC2 medium instance) work well with simultaneous writes happening...
>>>> (visitors browsing, properties changing or getting created)?
>>>> - With a couple of million documents, would I be able to process my
>>>> views without causing any significant impact on write performance?
>>>> 
>>>> I think my questions might be biased by the fact that I come from a
>>>> MySQL/Rails background... :-)
>>>> 
>>>> Let me know what you guys think about this.
>>>> 
>>>> Thanks in advance,
>>>> --
>>>> Mayank
>>>> http://adomado.com
>>> 
>>> 
>> 
