[CODE4LIB] CouchDB and MongoDB (was: Re: O'Reilly books...)

2010-12-14 Thread Luciano Ramalho
On Tue, Dec 14, 2010 at 11:58 AM, Bill Dueber b...@dueber.com wrote:
 Oops. I just found a better overview than I can provide, at
 http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB

I was just about to send that link.

 There are lots of other players in this space, too -- see
 http://nosql-database.org/

It depends on how you define that space. There are lots of players
in the non-relational (AKA NoSQL) space, but in the document-oriented
space I don't know of any current contender other than MongoDB and
CouchDB. Do you?

Comparing Riak, Cassandra and MongoDB is like comparing a golf cart, a
fork lift and a fire engine. They are just too different.

But I'd say MongoDB and CouchDB belong in the same category, though
MongoDB is optimized for performance in a cluster deployed in a single
datacenter, with master-slave replication, while CouchDB is designed
for easy and reliable distributed deployment, with master-master
replication among nodes that are not always online.
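
As a rough illustration of the master-master idea: CouchDB exposes
replication through a POST to its _replicate endpoint, and a two-way
sync is simply one replication in each direction. The sketch below only
builds the requests and does not send them; the host names, the
database name and the helper function names are assumptions made up
for the example.

```python
import json
import urllib.request


def replication_request(server, source, target):
    """Build (but do not send) a POST to CouchDB's _replicate endpoint."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bidirectional(server, db_a, db_b):
    """Two-way sync: one replication request per direction."""
    return [
        replication_request(server, db_a, db_b),
        replication_request(server, db_b, db_a),
    ]


# Hypothetical hosts and database name, for illustration only.
requests_to_send = bidirectional(
    "http://localhost:5984",
    "http://node-a.example.org:5984/articles",
    "http://node-b.example.org:5984/articles",
)
# Whenever both nodes are reachable, each request could be sent with
# urllib.request.urlopen(req).
```

Because each direction is an independent request, a node that was
offline can catch up later by running the same pair again.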

Their conceptual data models are very similar (JSON and BSON), so it's
a snap to migrate data from CouchDB to MongoDB (the opposite may be
more complicated depending on the dataset, because BSON has more
primitive types than JSON).
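
To make the type gap concrete: BSON has native date and binary types,
while plain JSON has neither, so moving documents toward CouchDB means
choosing some encoding for them. A minimal sketch using only the Python
standard library; the tagged-dict convention ("$date", "$binary"), the
function name and the sample document are assumptions for illustration,
not something either database requires.

```python
import base64
import json
from datetime import datetime, timezone


def to_json_safe(value):
    """Encode values with no JSON equivalent as tagged dicts (hypothetical convention)."""
    if isinstance(value, datetime):
        return {"$date": value.isoformat()}
    if isinstance(value, bytes):
        return {"$binary": base64.b64encode(value).decode("ascii")}
    raise TypeError(f"cannot encode {type(value).__name__}")


# A document as it might come out of a BSON store: a date and raw bytes.
doc = {
    "_id": "record-1",
    "title": "Example",
    "created": datetime(2010, 12, 14, 11, 58, tzinfo=timezone.utc),
    "thumbnail": b"\x89PNG",
}

# json.dumps calls to_json_safe only for values it cannot serialize itself.
encoded = json.dumps(doc, default=to_json_safe, sort_keys=True)
```

Going the other way (JSON to BSON) needs no such step, since every JSON
value already has a BSON counterpart.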

Where I work [1] we are doing pilot projects with CouchDB, but we also
envision using CouchDB as the main repository for content creation,
and pushing data to MongoDB for high demand services, if we find out
that CouchDB can't handle the traffic.

[1] http://regional.bvsalud.org/php/index.php?lang=en

-- 
Luciano Ramalho
programador repentista || stand-up programmer
Twitter: @luciano


Re: [CODE4LIB] CouchDB and MongoDB (was: Re: O'Reilly books...)

2010-12-14 Thread Nate Vack
Tongue lodged deeply -- so deeply -- in cheek:

http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-scale#

NSFW if your co-workers don't like to hear computer-generated swears.

:)

Cheers,
-Nate



Re: [CODE4LIB] CouchDB and MongoDB (was: Re: O'Reilly books...)

2010-12-14 Thread Luciano Ramalho
On Tue, Dec 14, 2010 at 3:23 PM, Nate Vack njv...@wisc.edu wrote:
 Tongue lodged deeply -- so deeply -- in cheek:

 http://nosql.mypopescu.com/post/1016320617/mongodb-is-web-scale#

Yeah, that is funny, thanks for the link, Nate.

I bet you did not mean any harm, but I hope the joke does not kill the
conversation we had just started on the other thread.

-- 
Luciano Ramalho
programador repentista || stand-up programmer
Twitter: @luciano


Re: [CODE4LIB] CouchDB and MongoDB (was: Re: O'Reilly books...)

2010-12-14 Thread Luciano Ramalho
On Tue, Dec 14, 2010 at 5:57 PM, Tom Keays tomke...@gmail.com wrote:
 I saw this visualization of where the various nosql databases fit on
 the CAP Theorem triangle. CAP says there are three primary concerns
 you must balance when choosing a data management system: Consistency,
 Availability, and Partition tolerance. Furthermore, you can only pick
 2.

 http://blog.nahurst.com/visual-guide-to-nosql-systems

 According to this, Riak, SimpleDB, Cassandra and CouchDB all sit on
 the AP side, whereas MongoDB and BigTable sit on the CP side. Most
 relational databases sit on the CA side.

Very interesting, Tom, thanks for the link.

It is interesting to note that although CouchDB and MongoDB sit on
different sides of the CAP triangle, their data model, from an
application perspective, is very similar.

But the implementation of the data model is very different, as Bill
Dueber mentioned before: MongoDB does updates in place whenever
possible and aggressively caches writes, both of which increase
update speed but also the risk of a corrupt database in case of a
crash. CouchDB does neither, so updates are much slower, but its data
is always in a consistent state on disk, because it only appends, and
appends are guaranteed to be atomic on POSIX systems.
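
The append-only idea can be sketched with a toy store. This is not
CouchDB's actual file format (CouchDB appends to a B-tree file, not a
line log); the class name, the one-JSON-record-per-line layout and the
crash simulation are all assumptions for illustration. The point is
only that an interrupted write can damage the tail of the file and
nothing before it.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy document log: every update is a new line at the end of the file."""

    def __init__(self, path):
        self.path = path

    def put(self, doc_id, doc):
        record = json.dumps({"id": doc_id, "doc": doc})
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the record to disk before returning

    def get(self, doc_id):
        latest = None
        try:
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    try:
                        record = json.loads(line)
                    except ValueError:
                        break  # torn final record: stop, keep what came before
                    if record["id"] == doc_id:
                        latest = record["doc"]
        except FileNotFoundError:
            return None
        return latest


path = os.path.join(tempfile.mkdtemp(), "toy.log")
store = AppendOnlyStore(path)
store.put("a", {"v": 1})
store.put("a", {"v": 2})

# Simulate a crash in the middle of a third write: a truncated record.
with open(path, "a", encoding="utf-8") as f:
    f.write('{"id": "a", "doc": {"v"')

current = store.get("a")  # the last complete revision is still readable
```

The cost is visible too: every put forces a sync to disk and old
revisions pile up in the file, which is the slower-but-safer trade-off
described above.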

-- 
Luciano Ramalho
programador repentista || stand-up programmer
Twitter: @luciano


[CODE4LIB] CouchDB

2009-03-17 Thread phil cryer
Hey all, I just started experimenting with CouchDB the other day, and
it's pretty cool.  With the amount of data the Biodiversity Heritage
Library (BHL) is carrying, this may be an option for the future.  Does
anyone have any experience with it, or any pointers to a good howto,
or basic setup/usage case?  I appreciate that it's a different
approach to an age-old problem, and I can see it working hand in hand
with things like Hadoop (HDFS), Lucene/Solr, etc.

P


Re: [CODE4LIB] CouchDB

2009-03-17 Thread Mark Jordan
Hi Phil,

I'm starting to play with CouchDB myself, mainly as a way of learning about 
schemaless databases. Have you seen the book that is being written, 
http://books.couchdb.org/relax/ ? So far it's got a pretty good set of 
up-and-running instructions and some basic howtos.

Mark



Re: [CODE4LIB] CouchDB

2009-03-17 Thread John Beppu
On Tue, Mar 17, 2009 at 7:22 AM, phil cryer p...@cryer.us wrote:

 Hey all, I just started experimenting with CouchDB the other day, and
 it's pretty cool.  With the amount of data the Biodiversity Heritage
 Library (BHL) is carrying, this may be an option for the future.  Does
 anyone have any experience with it, or any pointers to a good howto,
 or basic setup/usage case?  I appreciate that it's a different
 approach to an age-old problem, and I can see it working hand in hand
 with things like Hadoop (HDFS), Lucene/Solr, etc.


For full-text search, some experimental work has been done.

There's hypercouch which brings Hyper Estraier and CouchDB together:

http://github.com/davisp/hypercouch/tree/master

There's also couchdb-lucene which uses Lucene for full text search:

http://github.com/rnewson/couchdb-lucene/tree/master

People are still exploring this uncharted land (so to speak).  Querying is
accomplished by hooking up an external service to CouchDB.

http://wiki.apache.org/couchdb/ExternalProcesses

This is basically a process that stays resident, reads requests on STDIN and
sends responses on STDOUT as the wiki page I linked to above describes.
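
A minimal sketch of such a resident process, assuming the line-based
protocol described on that wiki page: one JSON request per line on
STDIN, one JSON response per line on STDOUT, with "code" and "json"
fields in the response. The handler, its "terms" field and the
simulated request are made up for the example; a real search service
would query its index there.

```python
import io
import json


def handle(request):
    """Hypothetical handler: echo the query parameters back as the result."""
    return {"code": 200, "json": {"terms": request.get("query", {})}}


def serve(stdin, stdout):
    """Read one JSON request per line, write one JSON response per line."""
    for line in stdin:
        if not line.strip():
            continue
        stdout.write(json.dumps(handle(json.loads(line))) + "\n")
        stdout.flush()  # the caller waits for the reply, so never leave it buffered


# Simulated exchange; a deployed script would pass sys.stdin and sys.stdout.
fake_in = io.StringIO('{"query": {"q": "botany"}}\n')
fake_out = io.StringIO()
serve(fake_in, fake_out)
```

Passing the streams in as arguments keeps the loop testable without a
running CouchDB.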

As for indexing, I think the smart way to do it is to follow
couchdb-lucene's example and set up an update_notification script.
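
A rough sketch of what such a script might look like, assuming CouchDB
writes one JSON line per change to the script's STDIN in the form
{"type": "updated", "db": "name"}. That message shape is my
recollection of the update notification format and should be checked
against the CouchDB docs; the function name and the sample database
names are made up.

```python
import io
import json


def databases_to_reindex(stdin):
    """Yield the name of each database reported as updated."""
    for line in stdin:
        try:
            event = json.loads(line)
        except ValueError:
            continue  # ignore anything that is not a JSON line
        if isinstance(event, dict) and event.get("type") == "updated":
            yield event["db"]


# Simulated notifications; a deployed script would read sys.stdin instead
# and hand each database name to the indexer.
fake_in = io.StringIO(
    '{"type": "updated", "db": "bhl_pages"}\n'
    '{"type": "deleted", "db": "scratch"}\n'
    'not json\n'
    '{"type": "updated", "db": "bhl_titles"}\n'
)
pending = list(databases_to_reindex(fake_in))
```

Here pending ends up holding the two databases that were reported as
updated, in the order the notifications arrived.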

--beppu