The thing is, the NoSQL stuff is pretty much just a key-value store. There's generally no way to "query" the store; you can simply look up a document by ID.

If this meets the needs of your application (all you need is a key-value store, and not any kind of query), then it's definitely going to be a lot less overhead than an actual SQL rdbms, and simpler to manage, with advantages for scalability and replication. The reason it's simpler and more performant is, well, because it's _simpler_: you don't actually have querying or joining abilities.
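To make the tradeoff concrete, here's a rough sketch of what "just a key-value store" means. This is a toy in-memory stand-in, not the API of MongoDB, CouchDB, or any real product; the class name, method names, and sample documents are all made up for illustration.

```python
import json


class KeyValueStore:
    """Toy in-memory stand-in for a key-value document store (hypothetical API)."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        # The document is stored as an opaque serialized string;
        # the store never looks inside it.
        self._docs[doc_id] = json.dumps(doc)

    def get(self, doc_id):
        # The one cheap operation: fetch a document by its ID.
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find_by_field(self, field, value):
        # There is no index on anything but the ID, so "querying" on a value
        # means deserializing and checking every single document.
        return [
            doc_id
            for doc_id, raw in self._docs.items()
            if json.loads(raw).get(field) == value
        ]


store = KeyValueStore()
store.put("cit-1", {"title": "Metadata for citations", "year": 2009})
store.put("cit-2", {"title": "Indexing abstracts", "year": 2010})
```

Here `get` is a single lookup, while `find_by_field` has to walk the whole store. With a couple tens of millions of documents, that full scan is the part that hurts.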

But if you are actually going to need querying on values other than ID... SQL rdbms is a pretty standardized, well understood way to do this. There are certainly other ways -- you could combine a "noSQL" key-value store with Solr/Lucene, for instance, which in some cases may get you even better performance and more flexibility than an rdbms solution. But it's (IMO) going to be a bit harder to set up, manage, and use in your favorite development environment, precisely because rdbms is such a time-tested, standardized, mature approach. So, as usual: use the right tool for the job. If all you really need is a key-value store on ID, then a "NoSQL" solution may be the right thing. But if you need actual querying and joining, then personally I'd stick with rdbms unless I had some concrete reason to think a more complicated "nosql"+solr solution was required. Certainly if you are planning on using Solr _anyway_ because your application is a search engine of some type, that would lessen the incremental 'cost' of a nosql+solr solution.
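As a very rough illustration of the "nosql"+index combination, here's a sketch where a plain dict plays the key-value store and a tiny hand-rolled inverted index plays the part Solr/Lucene would play. None of this is Solr's actual API; the function names and sample titles are invented, and a real index does far more (tokenizing, ranking, and so on).

```python
from collections import defaultdict

documents = {}                   # the "NoSQL" side: ID -> document
title_index = defaultdict(set)   # the "search" side: word -> set of IDs


def add_document(doc_id, doc):
    # Every write has to go to BOTH places, or the two drift out of sync.
    documents[doc_id] = doc
    for word in doc["title"].lower().split():
        title_index[word].add(doc_id)


def search_title(word):
    # Ask the index for matching IDs, then fetch each document by ID.
    matching_ids = sorted(title_index.get(word.lower(), set()))
    return [documents[doc_id] for doc_id in matching_ids]


add_document("cit-1", {"title": "Indexing abstracts at scale"})
add_document("cit-2", {"title": "Citation indexing basics"})
```

The lookup itself is easy enough. The extra 'cost' I mean is in `add_document`: you now have two systems to install, keep consistent, and manage, where an rdbms gives you storage and querying in one place.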

[ Note that if all you want is "schemaless" storage, you CAN just stick large chunks of binary or text in an rdbms 'blob' or 'text' column. You won't be able to efficiently search on these -- but you aren't able to efficiently search in a 'nosql' solution either. So you _can_ use an rdbms like a "nosql" solution to store arbitrary data, no problem. If you're using an rdbms, you can have _other_ columns in addition to your blob/text one, that you can populate for select and join. If you _aren't_ going to need those -- then there'd be no reason to do it in an rdbms (even though you could); you would indeed then just want to use a 'nosql' key-value store solution, which will be higher performance. So the conclusion again, I think, is that rdbms is _more powerful_ than nosql, but that power comes with a performance cost. If you don't need it, use nosql. If you do need it -- there's no reason you can't store "structureless" units of data in text/blob in an rdbms too. ]
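Here's a minimal sketch of that blob-plus-columns idea, using Python's built-in sqlite3 module purely because it needs no setup; the same shape would apply to any rdbms. The table name, column names, and sample records are assumptions for illustration only.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE citations (
        id      TEXT PRIMARY KEY,
        journal TEXT,      -- pulled out of the document so we can select on it
        year    INTEGER,   -- likewise
        body    TEXT       -- the whole "schemaless" document, stored as-is
    )
    """
)


def save(doc_id, doc):
    # Store the full document in 'body', and copy the fields we expect
    # to query on into their own columns.
    conn.execute(
        "INSERT OR REPLACE INTO citations (id, journal, year, body) "
        "VALUES (?, ?, ?, ?)",
        (doc_id, doc.get("journal"), doc.get("year"), json.dumps(doc)),
    )


def load(doc_id):
    # Key-value style access: look the document up by ID.
    row = conn.execute(
        "SELECT body FROM citations WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def ids_for_year(year):
    # Real querying, but only on the columns we chose to populate.
    rows = conn.execute(
        "SELECT id FROM citations WHERE year = ? ORDER BY id", (year,)
    ).fetchall()
    return [r[0] for r in rows]


save("cit-1", {"journal": "Example Journal", "year": 2009, "notes": "anything"})
save("cit-2", {"journal": "Example Journal", "year": 2010, "pages": [1, 12]})
save("cit-3", {"journal": "Other Journal", "year": 2010})
```

`load` behaves like the key-value lookup, and `ids_for_year` is the part a pure key-value store can't give you. Anything that lives only inside `body` (like `notes` or `pages` here) still isn't efficiently searchable.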

Peter Schlumpf wrote:
I'd opt for the first response.  I hope NoSQL is not a flash in the pan.  It 
makes eminent sense to me.  SQL is just one way of looking at data.  A level of 
abstraction.  What authority says that SQL is the only or the best way of 
looking at a dataset?  Or the MARC record format for that matter?  They 
certainly weren't inscribed on stone tablets.   These things can become mind 
prisons.  I think it's refreshing that there are those willing to look at 
databases beyond SQL.

Peter Schlumpf
www.avantilibrarysystems.com


-----Original Message-----
From: Thomas Dowling <[email protected]>
Sent: Apr 12, 2010 10:55 AM
To: [email protected]
Subject: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?

So let's say (hypothetically, of course) that a colleague tells you he's
considering a NoSQL database like MongoDB or CouchDB, to store a couple
tens of millions of "documents", where a document is pretty much an
article citation, abstract, and the location of full text (not the full
text itself).  Would your reaction be:

"That's a sensible, forward-looking approach.  Lots of sites are putting
lots of data into these databases and they'll only get better."

"This guy's on the bleeding edge.  Personally, I'd hold off, but it could
work."

"Schedule that 2012 re-migration to Oracle or Postgres now."

"Bwahahahah!!!"

Or something else?



(<http://en.wikipedia.org/wiki/NoSQL> is a good jumping-in point.)


--
Thomas Dowling
[email protected]
