I made an app that does exactly what you described, although it was a "for-fun" hack just to showcase couch to my buddies (it was called tweetmesexy, so you can imagine how much fun it actually was)... What I ended up doing was:

1) Each new comment is a new doc that references the tweet id
2) Use view collation to get the tweet(s) and comments via a single http call (http://wiki.apache.org/couchdb/View_collation)
3) Run a script (via cron or whatever) to move the comments to the tweet (and delete the comments) when the tweet is no longer "hot". This is not required, but in our case it allowed us to do some nifty analytics thanks to couch's incremental map/reduce
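
To make steps 1 and 2 concrete, here is a rough sketch in plain Python rather than a real CouchDB view (real views are JavaScript map functions queried over HTTP). The field names (`type`, `tweet_id`, `created_at`) and both helper functions are made up for illustration. The idea is that the tweet emits the key [id, 0] and each comment emits [id, 1, created_at], so sorting by key puts the tweet first, followed by its comments in time order.

```python
# Toy stand-in for a CouchDB view; field names and helpers are hypothetical.

def map_doc(doc):
    """Yield (key, doc) rows the way a view's map function would emit them."""
    if doc["type"] == "tweet":
        yield (doc["_id"], 0), doc
    elif doc["type"] == "comment":
        yield (doc["tweet_id"], 1, doc["created_at"]), doc


def query_tweet_with_comments(docs, tweet_id):
    """Return the tweet followed by its comments, ordered by view key."""
    rows = [row for doc in docs for row in map_doc(doc)]
    rows.sort(key=lambda row: row[0])  # view rows are kept sorted by key
    return [doc for key, doc in rows if key[0] == tweet_id]


docs = [
    {"_id": "c2", "type": "comment", "tweet_id": "t1",
     "created_at": "2009-12-28T22:11:00Z", "text": "me too"},
    {"_id": "t1", "type": "tweet", "text": "hello couch"},
    {"_id": "c1", "type": "comment", "tweet_id": "t1",
     "created_at": "2009-12-28T22:10:00Z", "text": "nice"},
    {"_id": "t2", "type": "tweet", "text": "unrelated"},
]

result = query_tweet_with_comments(docs, "t1")
print([d["_id"] for d in result])
```

Against a real view, the filter on the first key element would be a key range in the query string instead, along the lines of startkey=["t1"]&endkey=["t1",{}], as the View_collation page describes.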

As for whether couch is a good fit for "update-heavy" applications, I think an RDBMS has advantages in a true "update" scenario (like 'update stats set counter=counter+1'). But remember, you are only using the word "update" because couch's awesomeness allows you to even consider storing the comments inline with the doc. Technically you can do the same with an SQL database, using a serialized blob, and you'd have the same conflict issues (without built-in revision love).
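
The conflict issue with inline comments can be sketched with a toy in-memory store. This is not a real CouchDB client: `ToyStore` and `Conflict` are hypothetical names, and the integer `_rev` is a simplification of couch's real revision strings. What it mimics is the revision check, where a write carrying a stale `_rev` is rejected (couch answers with a 409 Conflict) and the writer has to re-read and retry.

```python
import copy


class Conflict(Exception):
    """Stands in for the 409 Conflict returned on a stale _rev."""


class ToyStore:
    """Hypothetical in-memory store that only mimics the revision check."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise Conflict(doc["_id"])
        stored = copy.deepcopy(doc)
        stored["_rev"] = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


store = ToyStore()
store.put({"_id": "t1", "text": "hello couch", "comments": []})

# Two commenters read the same revision of the tweet doc.
a = store.get("t1")
b = store.get("t1")
a["comments"].append("nice")
b["comments"].append("me too")

store.put(a)  # first write goes through
try:
    store.put(b)  # stale _rev: rejected, b would have to re-read and retry
    second_write_conflicted = False
except Conflict:
    second_write_conflicted = True

# One doc per comment never touches an existing revision, so no conflict.
store.put({"_id": "c1", "tweet_id": "t1", "text": "nice"})
store.put({"_id": "c2", "tweet_id": "t1", "text": "me too"})
print(second_write_conflicted)
```

With hundreds of commenters hitting the same tweet doc, most inline writes would land in the retry branch, which is why the doc-per-comment layout is attractive while the tweet is "hot".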

So assuming I'm correct that the structure of your data will be similar if using a SQL database or couch, you would be well served with couch:

1) You can archive the comments inline, as I mentioned above, and run cool map/reduce on the tweet and comments together
2) Simple master-master, allowing you to scale writes to your heart's content
3) With SQL you'll need multiple queries (or go the ugly join route) to get the comments and the tweet, vs a single http call
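
For point 1, the archive script might look roughly like the following. It works on plain dicts only; a real script would fetch and save these docs through couch's HTTP API. The function name and field names are assumptions for illustration. The `_deleted` stubs are the form couch accepts for deleting docs in a bulk update.

```python
# Hypothetical archive step over plain dicts; no HTTP calls are made here.

def archive_comments(tweet, comments):
    """Return (updated_tweet, deletions) folding the comments into the tweet."""
    ordered = sorted(comments, key=lambda c: c["created_at"])
    updated = dict(tweet)
    updated["comments"] = list(tweet.get("comments", [])) + [
        {"text": c["text"], "created_at": c["created_at"]} for c in ordered
    ]
    # Deletion stubs: same _id and _rev, with _deleted set.
    deletions = [
        {"_id": c["_id"], "_rev": c["_rev"], "_deleted": True} for c in ordered
    ]
    return updated, deletions


tweet = {"_id": "t1", "_rev": "1-aaa", "type": "tweet", "text": "hello couch"}
comments = [
    {"_id": "c2", "_rev": "1-ccc", "tweet_id": "t1",
     "created_at": "2009-12-28T22:11:00Z", "text": "me too"},
    {"_id": "c1", "_rev": "1-bbb", "tweet_id": "t1",
     "created_at": "2009-12-28T22:10:00Z", "text": "nice"},
]

updated, deletions = archive_comments(tweet, comments)
print([c["text"] for c in updated["comments"]])
```

Since this only runs once the tweet is no longer "hot", the single write to the tweet doc is unlikely to race with new comments.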

Bottom line: finding yourself structuring your data like you would in an SQL database does not negate the other advantages of couch.

Troy

On Dec 28, 2009, at 10:09 PM, Sean Clark Hess wrote:

Our system will have comments related to live data - imagine people
commenting on tweets right after they are written.

I'm having trouble deciding how to model it. It makes a lot of sense to make one document containing all the comments for each data segment, but we could
theoretically have hundreds of users commenting on the same segment at
once.

Would data consistency become a nightmare? With an RDBMS you would have a
comments table, and insert a new row for each comment - preventing
conflicts. I could do the same thing with couch, by adding a separate
document for each comment, but it seems to violate a fundamental principle
of couch.

Is Couch DB a bad fit for an update-heavy system? Updates will only be heavy
within the first minute or so after the data is released, then it will
switch to a very read-heavy system.

Thanks for your help
