Re: The Blog

Paul Davis Mon, 09 Feb 2009 23:19:39 -0800

On Tue, Feb 10, 2009 at 1:34 AM, Mister Donut <[email protected]> wrote:
>> On the opposite end of the spectrum, we have extremely large RDBMS
>> installs on huge iron. IIRC, I read an article that the 37signals crew
>> just bought a 32 GiB machine to scale up Basecamp.
>> the whole system would require many man hours of systems engineering
>> or a huge rewrite of the base application logic.
>
> Yeah but CouchDB doesn't magically solve that problem, does it?
> RDBMS + Memcached goes a very long way.
>


Of course CouchDB doesn't 'magically solve that problem'.

> Also, Basecamp seems to be "easy" to partition, "like" Flickr, (mind
> you, "easy"!), because most accounts are "self-contained". There is a
> project, a few users. They don't overlap. Of course, once you detach
> users from their projects, ... or allow users to comment on
> everything, that's where it gets hard? The problems start when
> everything relates to everything. (see:)
>

The blog post I read was saying exactly the opposite (at least about Basecamp):

http://www.37signals.com/svn/posts/1509-mr-moore-gets-to-punt-on-sharding

>> Another thought that just occurred to me. Another way of describing
>> the difference is that in CouchDB the data is important. In an RDBMS,
>> it's the relations that are important (or the focus at least).
>
> That is a very interesting point. I tend to agree after these emails.
> Also, most example applications of CouchDB that users have presented
> in this very thread seem to be about data, and not about relations.
> The sync to S3, the message queue with included aggregating
> reporting... Whereas a typical web application (wiki, blog),
> everything is about relations. Isn't that so? Users in user groups
> with permissions writing posts belonging to categories, having
> comments by other users. I don't really see how you can just throw
> that out of the window?! I mean, exactly why would you use the
> key/value pairs user_x, permission_x, group_x, entry_x, comment_x
> instead of just five tables. I don't understand where the CouchDB
> implementation shines for just exactly that thing. Mind you, this is
> RDBMS thinking (again), and I totally see the reason to use CouchDB
> for those projects outlined here (S3, Queue).
>

My line from the other day is that CouchDB is relational in the same
way the web is relational. It can have references, but those
references can break just like a hyperlink can. Most people don't
spend a huge amount of time worrying about it. Which raises the
question of why some hyperlinks (intrasite) are orders of magnitude
more important than others (intersite).
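
To make the hyperlink analogy concrete, here is a minimal sketch in
plain Python. It does not talk to CouchDB; a dict stands in for the
database, and the document ids, the post_id field and the resolve
helper are all made up for illustration. The point is only that a
reference is an id stored in a document, and following it may find
nothing.

```python
def resolve(db, doc, field):
    """Follow the id stored in doc[field]; return None if the target is missing."""
    target_id = doc.get(field)
    return db.get(target_id)


# A dict standing in for a database of documents keyed by id (hypothetical data).
db = {
    "post-1": {"type": "post", "title": "Hello"},
    "comment-1": {"type": "comment", "post_id": "post-1", "text": "Nice"},
    "comment-2": {"type": "comment", "post_id": "post-9", "text": "Orphan"},
}

found = resolve(db, db["comment-1"], "post_id")    # the referenced post exists
missing = resolve(db, db["comment-2"], "post_id")  # a "broken link"
```

Nothing stops comment-2 from pointing at a post that is not there, so
the application decides what to do with a dead reference, much as a
browser does with a dead link.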

Bottom line: Relax.

> I think there is a bit of a problem of the approach here. Most
> "convinced" CouchDB users seem to want to tell you "Well RDBMS suck,
> CouchDB is much better", "Why?", "Well, because... [Concepts,
> Key/Value, Map/Reduce]". Instead of trying to show you a few hands-on
> approaches of solving a problem *that has not been solved* by RDBMS
> yet.
>

Erm. No. On all counts. If you follow the community for a bit longer
you'll see that most people are generally enthused about the feature
set that CouchDB provides. In general, most people have a fairly
intuitive feeling for what would be better suited to CouchDB than to
an RDBMS. There is some gray area and overlap, which causes a fair
amount of discussion on particulars (as evidenced by this thread), but
I wouldn't say that requires anyone to solve 'a problem *that has not
been solved* by [an] RDBMS'. Such a demand is preposterous.

> I think the message reporting queue with aggregating, an example like
> that, instead of just "The Blog in 5 Minutes" you see everywhere,
> would go a long way into showing what CouchDB is all about.
>

The "Blog in 5 minutes", while somewhat tired, is instantly understood.
Consider it the "Hello, World!" of Web 2.0 if you must.

> It can show how useful Map/Reduce can be (to create aggregate
> reports), and how you can possibly have two message queues that can
> stay in sync.
>
>> Yes, you can. Just not in the way you're used to thinking. A
>> Map/Reduce view is a fixed *mapping* from documents to a sorted
>> key/value space.
>
> Yes, you just said it. *Fixed*. If you have 200 documents, 100 from
> Jan to Nov, and 100 from Nov to Dec, there is no way you can fill them
> into two buckets ("Jan-Nov" and "Nov-Dec"). It would require variable
> conditions.
>

No, I said 'fixed *mapping*'. And I could write you an example in a
few lines of code. But I'm not going to because you've worn my
patience thin.
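
For anyone following along, the idea can be sketched in plain Python.
This is a simulation under stated assumptions, not CouchDB itself:
real views are defined in the view server's language and queried over
HTTP, where startkey and endkey select a key range. The names map_doc,
build_view and count_range, and the date field, are hypothetical. What
it shows is that the mapping from documents to keys is fixed, while
the range you ask for is chosen at query time.

```python
from bisect import bisect_left, bisect_right


def map_doc(doc):
    """Fixed mapping: emit one (key, value) row per document, keyed by date."""
    yield (doc["date"], 1)


def build_view(docs):
    """Apply the map function to every document and sort the rows by key."""
    rows = [row for doc in docs for row in map_doc(doc)]
    rows.sort(key=lambda row: row[0])
    return rows


def count_range(view, startkey, endkey):
    """Sum the values of all rows whose key lies in [startkey, endkey]."""
    keys = [key for key, _ in view]
    lo = bisect_left(keys, startkey)
    hi = bisect_right(keys, endkey)
    return sum(value for _, value in view[lo:hi])


# Seven hypothetical documents dated [year, month, day].
months = [1, 3, 6, 10, 11, 12, 12]
docs = [{"_id": f"doc-{i}", "date": [2008, m, 1]} for i, m in enumerate(months)]

view = build_view(docs)
jan_oct = count_range(view, [2008, 1, 1], [2008, 10, 31])
nov_dec = count_range(view, [2008, 11, 1], [2008, 12, 31])
```

The same sorted rows answer both queries; only the start and end keys
differ, so the two "buckets" need no change to the view definition.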

>> Also, you may count things.
>
> I never said you couldn't. I said you cannot count like += and you
> cannot aggregate counts to get rid of all the documents. Let's say you
> want to count pageviews. Easy, insert a document for every pageview,
> create a "sum-view". But, this will lead to way too many documents?
> Doesn't seem feasible. Of course, CouchDB isn't the tool for that job,
> but I would still like to see some really hands on examples of what
> CouchDB can do.

You'll need to spend more time learning CouchDB before you presume to
know its limitations and the trade-offs of the approaches that would
be most appropriate for a specific use case.
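
As a point of reference, the "sum-view" described in the quoted text
(one document per pageview, summed per page) might look like the
following plain-Python simulation. It is a sketch only: the type and
path fields and the function names are assumptions, and it says
nothing about how CouchDB stores or incrementally updates the result.

```python
from collections import defaultdict


def map_pageview(doc):
    """Emit (path, 1) for every pageview document; ignore anything else."""
    if doc.get("type") == "pageview":
        yield (doc["path"], 1)


def reduce_sum(values):
    """Reduce step: add up the emitted counts."""
    return sum(values)


def grouped_counts(docs):
    """Group the mapped rows by key and reduce each group to a single total."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_pageview(doc):
            groups[key].append(value)
    return {key: reduce_sum(values) for key, values in sorted(groups.items())}


# Hypothetical documents: three pageviews and one unrelated document.
docs = [
    {"type": "pageview", "path": "/"},
    {"type": "pageview", "path": "/about"},
    {"type": "pageview", "path": "/"},
    {"type": "post", "title": "Hello"},
]

counts = grouped_counts(docs)
```

Whether one document per pageview is acceptable depends on volume and
on the trade-offs mentioned above; the sketch only shows the shape of
the view.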

> I think we covered the concepts now.
>

I don't think I've covered the concepts after months of working with CouchDB.

>> Patrick was trying to help and was correct.
>
> No, he is not.
>

You forgot to quote the part about how being rude and wrong does not
make a good impression. Twice in a row is exceedingly grating.

HTH,
Paul Davis
