Thanks very much, Chris, I greatly appreciate your insight. I'll keep you informed on how things work out.
---Peter

On Tue, Nov 25, 2008 at 1:53 PM, Chris Anderson <[EMAIL PROTECTED]> wrote:
> On Mon, Nov 24, 2008 at 9:24 AM, Peter Herndon <[EMAIL PROTECTED]> wrote:
>
>>
>> Anyway, that's my current use case, and my next use case. I know that
>> CouchDB isn't finished yet, and hasn't been optimized yet, but does
>> anyone have any opinions on whether CouchDB would be a reasonable fit
>> for managing the metadata associated with each object?
>
> I think CouchDB is pretty much designed with this use case in mind. If
> you were lucky enough to convince the organization to switch from XML
> to JSON, the software would pretty much write itself. And CouchDB does
> a fairly decent job of dealing with XML, as well (using SpiderMonkey's
> E4X engine) so that's not even required.
>
>> And, likewise,
>> would CouchDB be a reasonable fit for managing the binary datastreams?
>> Would it be practical to store the datastreams in CouchDB itself, and
>> up to what size limit/throughput limit?
>
> CouchDB's attachment support is pretty much designed for this use case
> (attachments can be multi-GB files, and aren't sent to view servers).
> From your description, it sounds like you are maxing out IO at the
> network level, so it's hard to say how CouchDB would interact with
> such a stream, without seeing it in action. However, CouchDB's
> replication and distribution capabilities should make managing
> multi-site projects as simple as one can hope for. If you shard
> projects as databases, then you can use replication to make them
> available on the local network for the various sites, which should
> make it easier to avoid load bottlenecks at a central repository.
>
>> Would it be better to store
>> the datastreams externally and use CouchDB to manage the metadata and
>> access control?
>
> It's not clear - obviously importing TBs of data from a filesystem to
> CouchDB will take time and expense, even if CouchDB handles it
> swimmingly. The nice thing about the schemaless documents is that you
> can be flexible going forward, maybe referencing some assets via URIs
> and storing others as attachments.
>
>> Also, looking down the road, are there plans for
>> CouchDB's development that would improve its fitness for this purpose
>> in the future?
>>
>
> Your project sounds like a good fit for CouchDB. Of course, you are
> talking about working on the high end of the performance / scalability
> curve, and CouchDB is relatively new, so you'll have to be comfortable
> as a trail-blazer (not that you'd be the only one, but with a new
> technology, you'll be in a smaller crowd than if you used something
> that's been around longer.)
>
> I think the biggest positive reason to use CouchDB for your project is
> the ease of federation / distribution / offline work. Once you've
> built the business-rules and document format around your project and
> CouchDB, booting up other instances of the project for more media
> collections should be straightforward. Because the documents will be
> more self-contained than what you'd have with a SQL store, for
> instance, you could build something amenable to merging multiple
> repositories, or splitting off just a portion of a repository for a
> particular purpose. This flexibility seems like a big win, as it would
> allow you to respond to things like datacenter-level bottlenecks with
> changes that users will understand, such as moving just the necessary
> sub-collections to a more local server.
>
> Good luck and keep us up to date with your progress.
>
> Chris
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>