Thanks to all who responded to this. I went with EPrints, using the
Debian/Ubuntu package pointed out by Thomas and others, and it seems
to be working OK.
On 8 April 2010 16:25, Thomas Krichel kric...@openlib.org wrote:
Mike Taylor writes
I was surprised to find that there seems to be no
Dear code4lib-ers,
last week (Wednesday afternoon) we held the first
code4lib.hu workshop in Debrecen, at the University Library.
The purpose of the meeting was for library developers
and power users of library information systems to meet and talk
to each other, in order that in the future
So let's say (hypothetically, of course) that a colleague tells you he's
considering a NoSQL database like MongoDB or CouchDB, to store a couple
tens of millions of documents, where a document is pretty much an
article citation, abstract, and the location of full text (not the full
text itself).
I personally would vote for: This guy's on the bleeding edge. Personally, I'd
hold off, but it could
work. However, I attended a webinar on MongoDB and apparently the
representative stated that SourceForge has moved to a NoSQL platform using
MongoDB and tested their load with 100x growth and
Depends on the sort of features required, in particular the access
patterns, and the hardware it's going to run on.
In my experience, NoSQL systems (for example Apache's Cassandra) have
extremely good distribution properties over multiple machines, much
better than SQL databases. Essentially,
Please excuse cross-postings.
--
Associate Vice President for Library and Information Services at
Wheaton College in Norton, MA
Located between Boston and Providence, Wheaton College is a four-year,
private liberal arts college with 1,550 students. The
Senior Programmer Analyst
Office of Digital Assets and Infrastructure, Yale University
New Haven, CT
( http://tinyurl.com/yyn7dgz )
ODAI is charged with developing a digital information management
strategy for Yale and building digital collections and technical
infrastructure in a coordinated
The advantage of the NoSQL DBs is that they're schema-less which
allows much more flexibility in your data going in.
However, it sounds like your schema may be pretty standardized -- I'm
not sure of a huge advantage (outside the aforementioned replication
functionality) you'd get.
-Ross.
On
I'd opt for the first response. I hope NoSQL is not a flash in the pan. It
makes eminent sense to me. SQL is just one way of looking at data. A level of
abstraction. What authority says that SQL is the only or the best way of
looking at a dataset? Or the MARC record format for that matter?
I'd actually vote for the sensible, forward-looking approach. The BBC
(for one) is already using CouchDB in production:
http://damienkatz.net/2010/03/bbc_and_couchdb.html
That said, NoSQL as a movement is as wide and varied as the RDBMS
world, and there are pros and cons to each. I'm
SQL-style JOINs can be done in CouchDB (can't speak for the other NoSQL
DBs).
In CouchDB, it's called view collation:
http://chrischandler.name/couchdb/view-collation-for-join-like-behavior-in-couchdb/
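For anyone who hasn't seen the idea, here is a rough sketch of it in plain Python rather than in CouchDB itself. It only simulates what a view does: a map function emits compound keys, the rows are kept sorted by key, and a key-range query returns a parent document followed by its children. The documents, field names, and helper functions are all made up for illustration, and real CouchDB collation has more rules than a simple list sort.

```python
# Simulation of CouchDB-style view collation in plain Python.
# Every document, field name, and helper here is hypothetical.

docs = [
    {"_id": "j1", "type": "journal", "title": "Journal A"},
    {"_id": "a1", "type": "article", "journal_id": "j1", "title": "First article"},
    {"_id": "a2", "type": "article", "journal_id": "j1", "title": "Second article"},
    {"_id": "j2", "type": "journal", "title": "Journal B"},
    {"_id": "a3", "type": "article", "journal_id": "j2", "title": "Third article"},
]


def map_doc(doc):
    """Emit (key, value) rows, roughly as a view's map function would."""
    if doc["type"] == "journal":
        # 0 sorts the journal ahead of its articles.
        yield [doc["_id"], 0], doc["title"]
    elif doc["type"] == "article":
        # 1 sorts each article directly after its journal.
        yield [doc["journal_id"], 1], doc["title"]


def build_view(documents):
    """Run the map function over every document and sort rows by key."""
    rows = [row for doc in documents for row in map_doc(doc)]
    return sorted(rows, key=lambda row: row[0])


def query_range(view, startkey, endkey):
    """Return the rows whose key lies between startkey and endkey."""
    return [row for row in view if startkey <= row[0] <= endkey]


view = build_view(docs)

# One range query returns the journal and then its articles: the "join".
joined = query_range(view, ["j1", 0], ["j1", 1])
for key, value in joined:
    print(key, value)
```

The point is that nothing is joined at query time. The related rows simply sit next to each other in the sorted view, so one range read fetches them together.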
It's a different way of thinking (as there are no tables, and map/reduce
goes through
The thing is, the NoSQL stuff is pretty much just a key-value store.
There's generally no way to query the store, instead you can simply
look up a document by ID.
If this meets the needs of your application, all you need is a key-value
store, and not any kind of query, then it's definitely
On Mon, 12 Apr 2010, Jonathan Rochkind wrote:
So, as usual, the right tool for the job. If all you really need is a
key-value store on ID, then a NoSQL solution may be the right thing. But
if you need actual querying and joining, then personally I'd stick with
rdbms unless I had some
On Mon, Apr 12, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
The thing is, the NoSQL stuff is pretty much just a key-value store.
There's generally no way to query the store, instead you can simply look
up a document by ID.
Actually, this depends largely on the NoSQL DBMS in
On Mon, Apr 12, 2010 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
The thing is, the NoSQL stuff is pretty much just a key-value store.
There's generally no way to query the store, instead you can simply look
up a document by ID.
Schemaless != no way to query.
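A toy illustration of the distinction, in plain Python and not tied to any particular product: the first store can only hand back an opaque value for a key, while the second holds documents with no fixed schema and can still be filtered on whatever fields happen to be present. The records and the find() helper are hypothetical.

```python
# Two toy stores contrasting lookup by key with querying on fields.
# All records and helper names are hypothetical.

# Key-value store: the value is an opaque blob as far as the store is concerned.
kv_store = {
    "cit1": b'{"title": "First citation", "year": 2009}',
    "cit2": b'{"title": "Second citation"}',
}

# Document store: no fixed schema, but the store can see inside each document.
doc_store = [
    {"_id": "cit1", "title": "First citation", "year": 2009},
    {"_id": "cit2", "title": "Second citation"},  # no "year" field at all
    {"_id": "cit3", "title": "Third citation", "year": 2010, "doi": "10.0000/example"},
]


def kv_get(key):
    """The only operation a pure key-value store offers: fetch by key."""
    return kv_store.get(key)


def find(documents, **criteria):
    """Return documents whose fields match every criterion.

    Documents lacking a field simply do not match; nothing breaks
    because the documents have different shapes.
    """
    return [
        doc for doc in documents
        if all(doc.get(field) == wanted for field, wanted in criteria.items())
    ]


recent = find(doc_store, year=2010)
print([doc["_id"] for doc in recent])
```

With the key-value store, the same question ("which citations are from 2010?") would mean fetching and parsing every value in application code.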
Key-value stores,
Yeah, I may have gotten it completely wrong.
Okay, help this grasshopper (possibly by pointing me to relevant
documentation), what's the difference between document-based and
key-value store? When I've looked at CouchDB before, despite it
describing itself as document based, I haven't been
On Mon, Apr 12, 2010 at 10:55 AM, Thomas Dowling tdowl...@ohiolink.edu wrote:
So let's say (hypothetically, of course) that a colleague tells you he's
considering a NoSQL database like MongoDB or CouchDB, to store a couple
tens of millions of documents, where a document is pretty much an
On Mon, 12 Apr 2010, Ryan Eby wrote:
[trimmed]
But I'm
guessing they've thought about the data and what benefits they would
get out of the backend.
Wow. You obviously don't work with the same folks that I do.
I've been attached to one project for about 16 months now, while the rest
of
From my understanding of key/value stores, one can put documents on the
other side of the key, but any and all parsing/processing of that value
happens outside of the database. In CouchDB, the entire document is
queryable from within map/reduce views. After being queried, those
keys are
Michael Stonebraker *is* the horse, and yet has pointed out that RDBMSs
aren't always the hammer you're looking for. Next time you use a B-tree or
R-tree (spatial search, anyone?), give him a toast with your favorite beverage.
On 04/12/2010 03:26 PM, Ryan Eby wrote:
As for the colleague, I guess the question is why?...
He's hoping it'll impress the babes. :-)
Seriously (and not to draw the conversation to a close), thanks to all for
their insights.
--
Thomas Dowling
tdowl...@ohiolink.edu
So let's say (hypothetically, of course) that a colleague tells you he's
considering a NoSQL database like MongoDB or CouchDB, to store a couple
tens of millions of documents, where a document is pretty much an
article citation, abstract, and the location of full text (not the full
text
On 4/12/10 4:47 PM, Ryan Eby wrote:
You could put your logs, marc records broken out by fields or
arrays/hashes (types in couchdb) in any of them but the approach each
takes would limit you (or empower you) differently.
Once there's a good marc2json script (and format) out there, it'd be
Couldn't you do MARC -> MARCXML -> JSON?
-Andrew
On 2010-04-12, at 5:00 PM, Benjamin Young wrote:
On 4/12/10 4:47 PM, Ryan Eby wrote:
You could put your logs, marc records broken out by fields or
arrays/hashes (types in couchdb) in any of them but the approach each
takes would limit you (or
There are at least TWO good marc2json formats, and several open source
scripts at least for Bill Dueber's, no?
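To make the design question concrete, here is a small Python sketch of one possible JSON layout. It is an assumption made up for illustration, not either of the formats mentioned above, and the sample record is invented. Fields and subfields are kept as ordered lists because MARC tags and subfield codes can repeat and their order matters.

```python
import json

# Hypothetical in-memory record: each field is (tag, indicators, content).
# Control fields have no indicators and carry a plain string.
record = {
    "leader": "00000nam a2200000 a 4500",
    "fields": [
        ("001", None, "example-0001"),
        ("245", ("1", "0"), [("a", "An example title /"), ("c", "A. N. Author.")]),
        ("650", (" ", "0"), [("a", "Libraries"), ("x", "Automation.")]),
    ],
}


def field_to_json(tag, indicators, content):
    """Convert one field to a dict that json.dumps can serialise."""
    if indicators is None:
        # Control field: just a tag and a value.
        return {"tag": tag, "value": content}
    return {
        "tag": tag,
        "ind1": indicators[0],
        "ind2": indicators[1],
        # A list, not a dict, so repeated subfield codes and order survive.
        "subfields": [{"code": code, "value": value} for code, value in content],
    }


def record_to_json(rec):
    """Serialise the whole record using the illustrative layout above."""
    return json.dumps(
        {
            "leader": rec["leader"],
            "fields": [field_to_json(*field) for field in rec["fields"]],
        },
        indent=2,
    )


as_json = record_to_json(record)
print(as_json)
```

A layout keyed by tag would be shorter, but it makes repeated fields and field order awkward to represent.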
Benjamin Young wrote:
On 4/12/10 4:47 PM, Ryan Eby wrote:
You could put your logs, marc records broken out by fields or
arrays/hashes (types in couchdb) in any of them but the
On 4/12/10 5:04 PM, Andrew Hankinson wrote:
Couldn't you do MARC -> MARCXML -> JSON?
-Andrew
Certainly, but the hard part is knowing what you want MARC to look like
once it's in JSON. XML 2 JSON conversions generally need some love to
make the data meaningful on the JSON side (as