[CODE4LIB] AW: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
Noo!!! NoSQL is terrible for startup projects ;) http://labs.mudynamics.com/2010/04/01/why-nosql-is-bad-for-startups/ Yes, this one is great :) But I think there are some real issues for companies in using databases. The mature RDBMS technology is backed by mathematical theory. This does not hold for NoSQL systems, as far as I know. Maybe there is no need for it there, since NoSQL DBs don't aim to support ACID-style transactions or schemas at all. Having a database schema is crucial for integrating applications, and that is what relational DBs were actually built for. Their main purpose is not driving multi-server web applications that deal with forum users. http://www.mountainman.com.au/software/history/it2.html Having a data store set up quickly, without the need to think about actual data structures, seems a perfect match for agile, feature-driven application development, because changing data structures can be handled in a snap and domain model objects map so easily to documents. An RDBMS forces you to do some detailed analysis of your application domain before actually implementing your data model. Complex relational schemas, once rolled out, are likely to resist change. But there are approaches to this: http://www.informit.com/store/product.aspx?isbn=032150206X Regards! -Ralf
[CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
So let's say (hypothetically, of course) that a colleague tells you he's considering a NoSQL database like MongoDB or CouchDB, to store a couple tens of millions of documents, where a document is pretty much an article citation, abstract, and the location of full text (not the full text itself). Would your reaction be:

  That's a sensible, forward-looking approach. Lots of sites are putting lots of data into these databases and they'll only get better.

  This guy's on the bleeding edge. Personally, I'd hold off, but it could work.

  Schedule that 2012 re-migration to Oracle or Postgres now.

  Bwahahahah!!!

Or something else?

(http://en.wikipedia.org/wiki/NoSQL is a good jumping-in point.)

-- Thomas Dowling tdowl...@ohiolink.edu
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
I personally would vote for: This guy's on the bleeding edge. Personally, I'd hold off, but it could work. However, I attended a webinar on MongoDB in which the representative stated that SourceForge has moved to a NoSQL platform using MongoDB, tested it under 100x the growth and visits they are currently seeing, and had zero issues with scalability. That's pretty impressive. Oh, it also managed to be more efficient than a traditional RDBMS. Brendon Kozlowski Web Administrator Saratoga Springs Public Library 49 Henry Street Saratoga Springs, NY, 12866 [518] 584-7860 x217
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
Depends on the sort of features required, in particular the access patterns, and the hardware it's going to run on. In my experience, NoSQL systems (for example Apache's Cassandra) have extremely good distribution properties over multiple machines, much better than SQL databases. Essentially, it's easier to store a bunch of key/values in a distributed fashion, as you don't need to do joins across tables (there aren't any), and eventually consistent systems (such as Cassandra) don't even need to always be internally consistent between nodes. If many concurrent write accesses are required, then NoSQL can also be a good choice, for the same reasons that make it easy to distribute. And for the same reasons, it can be much faster than SQL systems with the same data, given a data model that fits the access patterns. The flip side is that if later you want to do something that just requires the equivalent of table joins, it has to be done at the application level. This is going to be MUCH MUCH slower and harder than if there were SQL underneath. Rob
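To make the "joins at the application level" point concrete, here is a minimal sketch in Python. It assumes two key/value buckets held as plain dicts; the names `citations`, `journals`, and `join_citations_with_journals` are made up for illustration and do not come from Cassandra or any other store's API. Against a real networked store, each lookup inside the loop would be a separate round trip, which is where the slowness Rob mentions tends to come from.

```python
# Hypothetical data: two key/value "buckets" represented as plain dicts.
citations = {
    "c1": {"title": "On Joins", "journal_id": "j1"},
    "c2": {"title": "Eventual Consistency", "journal_id": "j2"},
    "c3": {"title": "Orphan Article", "journal_id": "j9"},
}
journals = {
    "j1": {"name": "Journal of Data"},
    "j2": {"name": "Distributed Letters"},
}


def join_citations_with_journals(citations, journals):
    """Application-level equivalent of an inner join on journal_id."""
    joined = []
    for cid, citation in sorted(citations.items()):
        # One key lookup per citation; a real store would answer this
        # over the network rather than from local memory.
        journal = journals.get(citation["journal_id"])
        if journal is None:
            continue  # no matching journal: an inner join drops the row
        joined.append((cid, citation["title"], journal["name"]))
    return joined


rows = join_citations_with_journals(citations, journals)
```

The loop is easy to write for one relationship; the cost shows up when several such lookups have to be chained or filtered, which is work a SQL planner would otherwise do.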
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
The advantage of the NoSQL DBs is that they're schema-less, which allows much more flexibility in your data going in. However, it sounds like your schema may be pretty standardized -- I'm not sure you'd get a huge advantage (outside the aforementioned replication functionality). -Ross.
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
I'd opt for the first response. I hope NoSQL is not a flash in the pan. It makes eminent sense to me. SQL is just one way of looking at data. A level of abstraction. What authority says that SQL is the only or the best way of looking at a dataset? Or the MARC record format for that matter? They certainly weren't inscribed on stone tablets. These things can become mind prisons. I think it's refreshing that there are those willing to look at databases beyond SQL. Peter Schlumpf www.avantilibrarysystems.com
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
I'd actually vote for the sensible, forward-looking approach. The BBC (for one) is already using CouchDB in production: http://damienkatz.net/2010/03/bbc_and_couchdb.html That said, NoSQL as a movement is as wide and varied as the RDBMS world, and there are pros and cons to each. I'm personally a proponent of CouchDB because of its RESTful API, JSON storage system, and JavaScript (or Erlang, PHP, Python, Ruby, etc) map/reduce view engine. If your project needs replication at all (whether for scaling, data sharing, etc), I'd take a good hard look at CouchDB, as that's its core distinction among the other NoSQL databases. Hope that helps, Benjamin -- President BigBlueHat P: 864.232.9553 W: http://www.bigbluehat.com/ http://www.linkedin.com/in/benjaminyoung
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
SQL-style JOINs can be done in CouchDB (can't speak for the other NoSQL DBs). In CouchDB, it's called view collation: http://chrischandler.name/couchdb/view-collation-for-join-like-behavior-in-couchdb/ It's a different way of thinking (as there are no tables, and map/reduce goes through every document to generate its output), but it is possible to get interestingly combined data out of the whole database. Later, Benjamin -- President BigBlueHat P: 864.232.9553 W: http://www.bigbluehat.com/ http://www.linkedin.com/in/benjaminyoung On 4/12/10 11:08 AM, Robert Sanderson wrote: The flip side is that if later you want to do something that just requires the equivalent of table joins, it has to be done at the application level. This is going to be MUCH MUCH slower and harder than if there were SQL underneath.
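The idea behind view collation can be sketched outside CouchDB. The following Python is only a simulation, not CouchDB code (real views are normally JavaScript map functions inside a design document, and CouchDB's own collation rules are richer than Python's list ordering). The document shapes and the names `map_doc` and `build_view` are assumptions for illustration. The trick is that the map step emits a composite key, so that sorting the rows places each parent document directly ahead of its children.

```python
# Hypothetical documents: one journal and two citations that point at it.
docs = [
    {"_id": "cit-2", "type": "citation", "journal": "jrnl-1", "title": "B paper"},
    {"_id": "jrnl-1", "type": "journal", "name": "Journal of Data"},
    {"_id": "cit-1", "type": "citation", "journal": "jrnl-1", "title": "A paper"},
]


def map_doc(doc):
    """Emit (key, value) pairs; the second key element orders parent before children."""
    if doc["type"] == "journal":
        yield [doc["_id"], 0], doc["name"]
    elif doc["type"] == "citation":
        yield [doc["journal"], 1], doc["title"]


def build_view(docs):
    """Run the map step over every document, then sort rows by key and document id."""
    rows = [
        (key, doc["_id"], value)
        for doc in docs
        for key, value in map_doc(doc)
    ]
    return sorted(rows, key=lambda row: (row[0], row[1]))


view = build_view(docs)
```

Reading `view` top to bottom gives the journal row followed by its citations, which is the join-like result; a range query over keys starting with one journal id would return just that group.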
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
The thing is, the NoSQL stuff is pretty much just a key-value store. There's generally no way to query the store; instead you can simply look up a document by ID. If this meets the needs of your application, so that all you need is a key-value store and not any kind of query, then it's definitely going to be a lot less overhead than an actual SQL rdbms, and simpler to manage, with advantages for scalability, replication, etc. The reason it's simpler and more performant is, well, because it's _simpler_: you don't actually have querying or joining abilities. But if you are actually going to need querying on values other than ID... SQL rdbms is a pretty standardized, well understood way to do this. There are certainly other ways -- you could combine a noSQL key-value store with Solr/Lucene, for instance, which in some cases may get you even better performance and more flexibility than an rdbms solution. But it's (IMO) going to be a bit harder to set up and manage and use in your favorite development environment, precisely because rdbms is such a time-tested, standardized, mature approach. So, as usual, the right tool for the job. If all you really need is a key-value store on ID, then a NoSQL solution may be the right thing. But if you need actual querying and joining, then personally I'd stick with rdbms unless I had some concrete reason to think a more complicated nosql+solr solution was required. Certainly if you are planning on using Solr _anyway_ because your application is a search engine of some type, that would lessen the incremental 'cost' of a nosql+solr solution. [ Note that if all you want is schemaless storage, you CAN just stick large chunks of binary or text in an rdbms 'blob' or 'text' column. You won't be able to efficiently search on these -- but you aren't able to efficiently search in a 'nosql' solution either. So you _can_ use an rdbms like a nosql solution to store arbitrary data, no problem. If you're using an rdbms, you can have _other_ columns in addition to your blob/text one, that you can populate for select and join. If you _aren't_ going to need those -- then there'd be no reason to do it in an rdbms (even though you could); you would indeed then just want to use a 'nosql' key-value store solution, which will be higher performance. So the conclusion again, I think, is that rdbms is _more powerful_ than nosql, but that power comes with a performance cost. If you don't need it, nosql. If you do need it -- there's no reason you can't store structureless units of data in text/blob in an rdbms too. ]
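The bracketed note above (an opaque text column plus a few ordinary columns for select and join) can be tried with Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the table name `citation`, the choice of `year` as the extracted column, and the helper names `store`, `fetch`, and `ids_for_year` are all invented for the example, and SQLite merely stands in for whatever rdbms would actually be used.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# "body" holds the whole document as opaque text; "year" is an ordinary
# column pulled out of the document so that SQL can filter and join on it.
conn.execute(
    "CREATE TABLE citation (id TEXT PRIMARY KEY, year INTEGER, body TEXT NOT NULL)"
)
conn.execute("CREATE INDEX citation_year ON citation (year)")


def store(conn, doc_id, doc):
    """Save an arbitrary dict; documents without a year get NULL in that column."""
    conn.execute(
        "INSERT INTO citation (id, year, body) VALUES (?, ?, ?)",
        (doc_id, doc.get("year"), json.dumps(doc)),
    )


def fetch(conn, doc_id):
    """Key-value style lookup by id; returns None when the id is unknown."""
    row = conn.execute(
        "SELECT body FROM citation WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def ids_for_year(conn, year):
    """Query on the extracted column, something a bare blob cannot support."""
    cursor = conn.execute(
        "SELECT id FROM citation WHERE year = ? ORDER BY id", (year,)
    )
    return [row[0] for row in cursor]


store(conn, "c1", {"title": "On Joins", "year": 2009, "fulltext_url": "http://example.org/c1"})
store(conn, "c2", {"title": "No Year Given"})
store(conn, "c3", {"title": "Later Work", "year": 2009, "pages": "1-9"})
```

The three stored documents have different fields, yet `fetch` returns each unchanged and `ids_for_year` still works, which is the trade described in the note: schemaless storage where it is wanted, indexed columns where queries are needed.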
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On Mon, 12 Apr 2010, Jonathan Rochkind wrote: So, as usual, the right tool for the job. If all you really need is a key-value store on ID, then a NoSQL solution may be the right thing. But if you need actual querying and joining, then personally I'd stick with rdbms unless I had some concrete reason to think a more complicated nosql+solr solution was required. Certainly if you are planning on using Solr _anyway_ because your application is a search engine of some type, that would lessen the incremental 'cost' of a nosql+solr solution. I'm surprised that I keep hearing so much about NoSQL for key-value stores, and everyone seems to forget the *old* key-value stores, such as directory services (X.500 and LDAP, although that's actually the protocol used to query them, not the storage implementation). Yes, there are things that LDAP doesn't do so well (relationships being one of them), but it supports querying, and you can adjust the matching by attribute (i.e., this one's matched as a number, this one's matched as a string, this one's a case-insensitive string ... I think some implementations have functionality to run the search term through a function for things like soundex, so it might be possible to add hooks for stemming and query expansion, etc.) I think that NoSQL got a lot of press because of Google having used it (and their having a *VERY* large data system -- but not everyone has that large of a system; also, Google did it 10+ years ago -- you can now throw a lot more CPU and RAM at an RDBMS, so the point at which the database becomes a problem isn't the same as it was when Google first came out.) ... So, I think that there are cases where NoSQL is the right solution for the job, and I think there are times when an RDBMS is the right solution ... there are also plenty of times for flat file databases, XML, LDAP, and a slew of other storage standards. -Joe hmm ... now I'm going to have to try to bring back my attempt to put my catalogs into a directory service ... I have a feeling I'm going to run into issues with unit conversions when searching.
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On Mon, Apr 12, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote: The thing is, the NoSQL stuff is pretty much just a key-value store. There's generally no way to query the store, instead you can simply look up a document by ID. Actually, this depends largely on the NoSQL DBMS in question. Some are key value stores (Redis, Tokyo Cabinet, Cassandra), some are document-based (CouchDB, MongoDB), some are graph-based (Neo4J), so I think blanket statements like this are somewhat misleading. CouchDB and MongoDB (for example) have the capacity to index the values within the document - you don't just have to look up things by document ID. -Ross.
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On Mon, Apr 12, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote: The thing is, the NoSQL stuff is pretty much just a key-value store. There's generally no way to query the store, instead you can simply look up a document by ID. Schemaless != no way to query. Key-value stores, like memcache, are just one end of what most consider the nosql spectrum. For instance, I can query my CouchDB instances through the different views I create. I thought this blog post had an interesting take on NoSQL, although this guy, Mike Stonebraker of VoltDB, obviously has a horse in the race. http://cacm.acm.org/blogs/blog-cacm/50678-the-nosql-discussion-has-nothing-to-do-with-sql/fulltext --jay
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
Yeah, I may have gotten it completely wrong. Okay, help this grasshopper (possibly by pointing me to relevant documentation), what's the difference between document-based and key-value store? When I've looked at CouchDB before, despite it describing itself as document based, I haven't been able to tell what the difference is between it and a key value store. It seemed to support storing a document by key, and retrieving it by key. It didn't seem to _do_ anything special with the document other than storing it there (maybe it DOES, but I missed it?). So you can call it a document instead of a value, but I couldn't figure out how that differed from a key-value store. I guess it's that CouchDB _does_ let you build indexes on values other than the key? Wacky, wonder how I missed that when I reviewed it last. Jonathan
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
There are really two reactions in here: one about NoSQL and the other about your colleague. As for NoSQL, I would be on the side that the ecosystem is here to stay, although individual projects may or may not take off/evolve. The best description I've seen of nosql as a whole is choice[1]: not having to shove everything into a similar style of database for every project, and making the database fit the data/use. There's a large number of projects now, each with their own priorities and the trade-offs they've made to reach them. Some care about consistency, for others eventual consistency is good enough, and others go as far as distributed transactions over nodes. Some do lazy writes to disk, others not. How you query your data also varies quite a bit, with sql-like, map/reduce, hadoop, etc. From your brief description it sounds like quite a few projects could fit the bill, including rdbms-types, and which one you want would probably depend on what you think you might do in the future. If you foresee yourself having lots of fields that might only cover certain subsets of the dataset, then couchdb or the like are probably worth looking at. As for the colleague, I guess the question is why? If it is because of trendiness then Bwahahahah!!! might be the best answer. But I'm guessing they've thought about the data and what benefits they would get out of the backend. [1] http://blog.couch.io/post/511008668/nosql-is-about
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On Mon, 12 Apr 2010, Ryan Eby wrote: [trimmed] But I'm guessing they've thought about the data and what benefits they would get out of the backend. Wow. You obviously don't work with the same folks that I do. I've been attached to one project for about 16 months now, while the rest of the team's been together for 4 years ... I've been trying to get a few changes made to better support my user community (basically, all of the people who don't have access to their system, or don't want to spend the 6 months using the system 'to be able to do something almost useful'). About 2-3 months ago, the main project team finally realized that they have *no*idea* what the user community wants or needs. Oh, and they have to go live on April 21st. I'm expecting a major 'wtf?' reaction from the majority of the community. -Joe
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
From my understanding of key/value stores, one can put documents on the other side of the key, but any and all parsing/processing of that value happens outside of the database. In CouchDB, the entire document is query-able from within map/reduce views. After being queried, those keys are indexed for faster future queries. So, in that way, CouchDB jumps over the key/value limitations and becomes a document database. In addition to map/reduce output, there's also a handy _update system that can be used to validate a JSON document prior to its insertion in the database--again, something not possible with key/value storage. You can, though, use CouchDB in a key/value fashion by storing binary data (or HTML, XML, RDF, etc) as attachments or JSON encoded strings (where possible). In that case, you would just be retrieving them by id (or URL), but you could store all kinds of ad hoc metadata about those attachments and use those to query with later. Also, the blog article Ryan Eby just posted is a great (and quick) overview of the varied noSQL ecosystem. In many ways, these systems are as different as they are similar. Hope your (re)search goes well, Benjamin -- President BigBlueHat P: 864.232.9553 W: http://www.bigbluehat.com/ http://www.linkedin.com/in/benjaminyoung On 4/12/10 2:42 PM, Jonathan Rochkind wrote: Okay, help this grasshopper (possibly by pointing me to relevant documentation), what's the difference between document-based and key-value store?
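The difference between "store a value under a key" and "index what is inside the document" can be shown with a small Python sketch. It only imitates the map step of a view; it is not CouchDB or MongoDB code, and the document fields and the names `map_by_author` and `build_index` are assumptions made for the example. A pure key-value store would stop at the `docs` dict; a document store can additionally run a function over each document and keep a lookup structure keyed by whatever that function emits.

```python
from collections import defaultdict

# Hypothetical citation documents, keyed by document id.
docs = {
    "c1": {"title": "On Joins", "authors": ["Lee", "Park"]},
    "c2": {"title": "Views", "authors": ["Park"]},
    "c3": {"title": "No Authors Listed"},
}


def map_by_author(doc):
    """Emit one (author, title) pair per author; documents without authors emit nothing."""
    for author in doc.get("authors", []):
        yield author, doc["title"]


def build_index(docs, map_fn):
    """Apply map_fn to every document and group the emitted rows by key."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for key, value in map_fn(docs[doc_id]):
            index[key].append((doc_id, value))
    return dict(index)


by_author = build_index(docs, map_by_author)
```

After this, `by_author["Park"]` answers "which documents mention Park" without scanning every document again, and "c3", which has no authors field, simply never appears in the index. That tolerance for missing fields is the schemaless part; the index is the queryable part.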
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
Michael Stonebraker *is* the horse, and yet has pointed out that RDBMSs aren't always the hammer you're looking for. Next time you use a B-tree or R-tree (spatial search, anyone?), give him a toast with your favorite beverage. http://cacm.acm.org/blogs/blog-cacm/32212-the-end-of-a-dbms-era-might-be-upon-us/fulltext http://en.wikipedia.org/wiki/Michael_Stonebraker -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jay Luker Sent: Monday, April 12, 2010 10:38 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan? I thought this blog post had an interesting take on NoSQL, although this guy, Mike Stonebraker of VoltDB, obviously has a horse in the race. http://cacm.acm.org/blogs/blog-cacm/50678-the-nosql-discussion-has-nothing-to-do-with-sql/fulltext --jay
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On 04/12/2010 03:26 PM, Ryan Eby wrote: As for the colleague, I guess the question is why?... He's hoping it'll impress the babes. :-) Seriously (and not to draw the conversation to a close), thanks to all for their insights. -- Thomas Dowling tdowl...@ohiolink.edu
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
So let's say (hypothetically, of course) that a colleague tells you he's considering a NoSQL database like MongoDB or CouchDB, to store a couple tens of millions of documents, where a document is pretty much an article citation, abstract, and the location of full text (not the full text itself). Would your reaction be: Noo!!! NoSQL is terrible for startup projects ;) http://labs.mudynamics.com/2010/04/01/why-nosql-is-bad-for-startups/ But seriously, it depends. You know, a lotta ins, lotta outs, lotta what-have-yous. I sort of like MongoDB's characterization of the landscape as tradeoffs between scale performance on the one hand and depth of functionality on the other: http://www.mongodb.org/display/DOCS/Philosophy I suspect we'll continue to see more hybrid systems for some time to come with various data stores handling the pieces they do best.
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On 4/12/10 4:47 PM, Ryan Eby wrote: You could put your logs, marc records broken out by fields or arrays/hashes (types in couchdb) in any of them but the approach each takes would limit you (or empower you) differently. Once there's a good marc2json script (and format) out there, it'd be grand to see marc records dumped into CouchDB to allow them to be replicated between groups of librarians (and even up to OpenLibrary). I'm still up for helping make that possible if anyone's into that. :)
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
Couldn't you do MARC -> MARCXML -> JSON? -Andrew
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
There are at least TWO good marc2json formats, and several open source scripts at least for Bill Dueber's, no? Benjamin Young wrote: Once there's a good marc2json script (and format) out there, it'd be grand to see marc records dumped into CouchDB to allow them to be replicated between groups of librarians (and even up to OpenLibrary).
Re: [CODE4LIB] NoSQL - is this a real thing or a flash in the pan?
On 4/12/10 5:04 PM, Andrew Hankinson wrote: Couldn't you do MARC -> MARCXML -> JSON? -Andrew Certainly, but the hard part is knowing what you want MARC to look like once it's in JSON. XML-to-JSON conversions generally need some love to make the data meaningful on the JSON side (as attributes and such make a 1-to-1 conversion complicated--though there have been attempts at general conversion scripts). Once a JSON output format for MARC is done, then converting from MARCXML to marc.json (or whatever) would be an easy first step.
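As a rough illustration of the MARCXML-to-JSON step discussed above, here is a sketch using only the Python standard library. The JSON layout it produces (keys `leader`, `control`, `data`, and subfields as ordered `[code, value]` pairs) is an assumption made up for this example; it is not Bill Dueber's format or any other published marc2json format. The sample record is also invented. Subfields are kept as an ordered list rather than an object because MARC allows repeated subfield codes, which is one of the places where a naive 1-to-1 XML conversion loses information.

```python
import json
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (the MARC 21 "slim" schema).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# A made-up record, just large enough to exercise each element type.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">12345</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title :</subfield>
    <subfield code="b">a subtitle.</subfield>
  </datafield>
</record>"""


def marcxml_to_dict(xml_text):
    """Convert one MARCXML record into a plain dict (hypothetical layout)."""
    record = ET.fromstring(xml_text)
    return {
        "leader": record.findtext("marc:leader", default="", namespaces=MARC_NS),
        "control": [
            {"tag": field.get("tag"), "value": field.text or ""}
            for field in record.findall("marc:controlfield", MARC_NS)
        ],
        "data": [
            {
                "tag": field.get("tag"),
                "ind1": field.get("ind1", " "),
                "ind2": field.get("ind2", " "),
                # Ordered pairs keep subfield order and repeated codes intact.
                "subfields": [
                    [sub.get("code"), sub.text or ""]
                    for sub in field.findall("marc:subfield", MARC_NS)
                ],
            }
            for field in record.findall("marc:datafield", MARC_NS)
        ],
    }


as_json = json.dumps(marcxml_to_dict(SAMPLE), indent=2)
```

The resulting `as_json` string could be stored as a document in CouchDB or a similar store; whether this layout is convenient for views and queries is exactly the design question raised in the message above.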