Have you thought about a two-part solution? You can use CouchDB as the front end to store the metadata (making it searchable in lots of interesting ways) with a separate data store behind it. Along with the metadata, each CouchDB document would hold a URI that points to the actual file somewhere else. You can even mix and match back ends, including straight HTTP or FTP servers as well as Subversion or Git. (We started implementing this idea to store various kinds of genomics/genetics/transcriptomics data before I left M.D. Anderson a couple of years ago. We got far enough to know that it is at least somewhat more than just theoretically possible. It never got finished, however, since after I left there was no one to push hard for it.)
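
To make the idea concrete, here is a minimal sketch in Python of what such a metadata document might look like. It is not the code we wrote back then. The field names ("uri", "backend"), the helper names, the list of supported schemes, and the example host names are all assumptions for illustration. The only CouchDB-specific part is that a document is stored with an HTTP PUT to /{db}/{docid}.

```python
import json
import urllib.request
from urllib.parse import quote, urlparse

# Hypothetical set of back-end schemes, chosen only for this illustration.
SUPPORTED_SCHEMES = {"http", "https", "ftp"}


def make_file_doc(doc_id, uri, **metadata):
    """Build a CouchDB document that describes a file stored elsewhere.

    The document carries searchable metadata plus a URI pointing at the
    real file; the file contents never enter CouchDB.
    """
    scheme = urlparse(uri).scheme
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported back end: {scheme!r}")
    doc = dict(metadata)
    # "uri" and "backend" are assumed field names, not a CouchDB convention.
    doc.update({"_id": doc_id, "uri": uri, "backend": scheme})
    return doc


def put_request(couch_url, db, doc):
    """Prepare (but do not send) the PUT request that stores the document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        f"{couch_url}/{db}/{quote(doc['_id'], safe='')}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


doc = make_file_doc(
    "sample-001",
    "ftp://files.example.org/runs/sample-001.bam",  # placeholder location
    kind="rna-seq",
    size_bytes=123456,
)
req = put_request("http://localhost:5984", "files", doc)
print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen(req) would send it to a running CouchDB. Views over fields such as "kind" or "backend" then give you the searching, while the client follows "uri" to fetch the file from whichever back end holds it.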
  Kevin

On 6/21/2016 4:47 PM, Brad Rhoads wrote:
I'll second that. It didn't work out well for us. It's probably OK for
small, plain text documents. But it didn't work too well with large media
files.

---------------------------
www.maf.org/rhoads
www.ontherhoads.org

On Tue, Jun 21, 2016 at 2:29 PM, Alexander Harm <[email protected]> wrote:

Hello Etay,

npm did that at one point and they have a couple of articles in their blog
that might be of interest to you:


http://blog.npmjs.org/post/71267056460/fastly-manta-loggly-and-couchdb-attachments
http://blog.npmjs.org/post/75707294465/new-npm-registry-architecture

They experienced problems with storing a lot of attachments in CouchDB and
moved to another solution. Also note this post by Nolan Lawson, point 4:


https://pouchdb.com/2014/06/17/12-pro-tips-for-better-code-with-pouchdb.html

I especially love the quote from Laurie Voss:

"One of the big things that everybody who's spent a lot of time with
databases knows is that you should never put your binaries in the database.
It's a terrible idea. It always goes wrong. I have never met a database in
15 years of which it is not true, and it's definitely not true of CouchDB.
You are taking this thing which is meant to sort and organize data, and
you're giving it binary data, which it can neither sort nor organize. It
can't do anything with that data, other than get really fat."

My advice: DON’T.

Regards, Alexander

On 21. Jun. 2016, at 21:44, Etay Haun <[email protected]> wrote:

Hi,
Thanks for your answers to my last post. It was very helpful.

We are developing a distributed file system solution and we would like to
base our solution on CouchDB.
We would like to use CouchDB to store the files as attachments (each
document will include the file and the file meta-data).
We have a few data centers that store *different* file systems, although
some of the documents are replicated to other data centers.
We have a few questions regarding possible technical issues.
As mentioned, part of our possible solution involves using attachments to
store the actual files in CouchDB.
1. We couldn't find any information regarding suggested attachment size.
2. Is there an issue with storing large attachments? (up to 2GB per file,
although most files will be much smaller: a few KB or MB)
3. We need to replicate some documents between CouchDB instances, including
the attachments. Is this okay?
4. Does CouchDB also store revisions of attachments?
5. If so, how can we determine the required storage space for an instance,
assuming we know what the entire system's size will be?
Our biggest instance will include 20TB of attachments.
6. Are there any possible issues with running the instances on Windows
2012 servers?
Thank you in advance.

