Hi Jack,
The strength of a product lies in how well it matches what you need from
it. Each product has its own characteristics, which make it more or
less suited to your needs. So, not knowing your needs, I can only theorize.
Here is a list of advantages/disadvantages CouchDB may present
(considering only what you spoke about):
1. Update. Requires the document ID (and the document's current
revision), which may slow down performance.
2. Insert. For pure inserts (no updates), I peaked at ~6k medium-size
documents per second on a "not-so-great" computer (a low-spec machine
dedicated to tests; my personal laptop is better than that one).
3. Scan. Advantage: given Erlang's capabilities and the smart way CouchDB
reads from and writes to a database, a scan can run in parallel with
other requests. Disadvantage: it doesn't give an accurate response if
more documents were added in the meantime, but the view update works
faster after that (if you were referring to views).
4. Latency. Difficult to answer because I don't know which type of
latency you mean. The overall latency depends on the type of
request. If you mean the time from when the request is sent until the
Erlang code gives back a response, I wouldn't go for "the
embarrassing Erlang". Erlang has a nice "habit" of spawning lightweight
processes (Erlang's "threads") with no sweat, and that compensates for
the response time gained by other languages (in Erlang you need to think
in parallel by default), especially in long jobs. That's my personal
opinion.
5. Functionality. CouchDB has a unique replication and document
read/write system (at least from what I know, but I don't know many
things, so I might be wrong :) ). You don't have master-slave
replication, only master-master replication (that makes it unique in my
opinion). That makes it more suitable for complex sharding
where you need extra control over the data insertion process. The
read/write system is based on two-heads document versioning updates. That
greatly reduces the probability of the database being left in an
inconsistent state and lets it recover faster from crashes.
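
To make points 1 and 5 a bit more concrete, here is a toy sketch of why an
update needs both the document ID and the current revision. It is only an
in-memory model written for illustration, not CouchDB's code: the names
ToyStore, Conflict and put, and the exact revision format, are assumptions.

```python
import hashlib
import json


class Conflict(Exception):
    """Raised when a write does not carry the document's current revision."""


class ToyStore:
    """Hypothetical in-memory store that checks revisions on every write."""

    def __init__(self):
        self._docs = {}  # document ID -> (revision string, body dict)

    def _next_rev(self, old_rev, body):
        # Illustrative revision format: "<generation>-<hash of the body>".
        generation = int(old_rev.split("-")[0]) + 1 if old_rev else 1
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return f"{generation}-{hashlib.md5(payload).hexdigest()}"

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # A write is accepted only if the caller names the current revision
        # (None for a document that does not exist yet).
        if rev != current_rev:
            raise Conflict(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = self._next_rev(current_rev, body)
        self._docs[doc_id] = (new_rev, body)
        return new_rev
```

In this model a first put("a", {...}) creates revision 1, a second put that
passes that revision creates revision 2, and a put that passes a stale
revision raises Conflict, so the writer has to look up the document before
updating it. That extra lookup is the cost mentioned in point 1.
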
Another drawback in terms of time is compaction. But, as I said, it
depends on your needs. Compaction is useful only when you want to
save space on your storage device. Otherwise, bank-style bookkeeping
of the full history of the operations made up to that moment has
its own advantage.
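
For reference, compaction is triggered per database through the HTTP API
(POST to /{db}/_compact). A minimal sketch that only builds the request,
assuming a server at http://localhost:5984 and a database called "mydb"
(both placeholders); the helper name build_compact_request is made up, and
a real server will normally also want admin credentials:

```python
import urllib.request


def build_compact_request(base_url, db_name):
    """Build (but do not send) a POST request to a database's _compact endpoint."""
    url = f"{base_url.rstrip('/')}/{db_name}/_compact"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint takes no parameters
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and database name, for illustration only.
req = build_compact_request("http://localhost:5984/", "mydb")
```

Passing req to urllib.request.urlopen would actually send it. Compaction
then runs in the background, so the time cost shows up as extra disk
activity rather than as a blocked request.
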
There are more "advantages/disadvantages" one can speak of when
comparing CouchDB with other NoSQL products. But all these
"advantages/disadvantages" are subject to one's needs at a given time
(some advantages can turn / be turned into disadvantages and vice versa,
even for the same user whose needs are dynamic).
I tried to be as objective as possible, and I hope my feedback helps you
in your decisions. If I have misled you with any wrong information,
please forgive me, I am just a user.
CGS
On 12/12/2011 02:41 AM, jack chrispoo wrote:
Hi all,
I am new to CouchDB. My friends and I have been evaluating several
datastores including Cassandra, HBase, MongoDB, CouchDB in terms of update,
read, insert, scan throughput and latency. In our tests CouchDB performs
worst in all tests. I once read someone saying that because CouchDB is
written in Erlang, throughput and latency are not CouchDB's strength. So
can someone tell me some advantages of CouchDB compared to other
datastores? I did look into views, but it seems that other datastores have
similar functionalities - MongoDB can also execute javascript to generate
results, HBase has filters. So what exactly is CouchDB's strength?
I'll be grateful for any comments, Thanks,
jack