Hi,

Yes, I agree we should have a framework like that. Folks should be able to choose S3 or COS (IBM Cloud Object Storage), etc.
I am personally on the hook for the implementation for CouchDB and for IBM Cloudant and expect them to be different, so the framework, IMO, is a given.

B.

> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
>
> Thanks for getting this started, Bob!
>
> In fear of derailing this right off the bat, is there a potential 4) approach
> where on the CouchDB side there is a way to specify “attachment backends”,
> one of which could be 2), but others could be “node local file storage”*,
> others could be S3-API compatible, etc?
>
> * a bunch of heavy handwaving about how to ensure consistency and fault
> tolerance here.
>
> * * *
>
> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3
> first.
>
> * * *
>
> From 1-3, I think 2 is the most pragmatic in terms of keeping desirable
> functionality while limiting it so it can be useful in practice.
>
> I feel strongly about not dropping attachment support. While not ideal in all
> cases, it is an extremely useful and reasonably popular feature.
>
> Best
> Jan
> —
>
>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org> wrote:
>>
>> Hi All,
>>
>> We've not yet discussed attachments in terms of the FoundationDB work, so
>> here's where we do that.
>>
>> Today, CouchDB allows you to store large binary values as a series of much
>> smaller chunks. These "attachments" cannot be indexed; they can only be sent
>> and received (you can fetch the whole thing or you can fetch arbitrary
>> subsets of them).
>>
>> On the FDB side, we have a few constraints: a transaction cannot be more
>> than 10 MB and cannot take more than 5 seconds.
>>
>> Given that, there are a few paths to attachment support going forward:
>>
>> 1) Drop native attachment support.
>>
>> I suspect this is not going to be a popular approach, but it's worth hearing
>> a range of views. Instead of direct attachment support, a user could store
>> the URL to the large binary content and simply fetch that URL directly.
>>
>> 2) Write attachments into FDB, but with limits.
>>
>> The next simplest option is to write the attachments into FDB as a series of
>> key/value entries, where the key is {database_name, doc_id, attachment_name,
>> 0..N} and the value is a short byte array (say, 16 KB, to match the current
>> chunk size). The 0..N is just a counter such that we can do an FDB range get
>> / iterator to retrieve the attachment. An embellishment would restore the
>> HTTP Range header options, if we still want that (disclaimer: I implemented
>> the Range thing many years ago; I'm happy to drop support if no one really
>> cares for it in 2019).
>>
>> This would be subject to the 10 MB and 5 second limits, which is less than
>> you _can_ do today with attachments but not, in my opinion, any less than
>> people actually do (with some notable outliers like npm in the past).
>>
>> 3) Full functionality.
>>
>> This would be the same as today: attachments of arbitrary size (up to the
>> disk capacity of the FDB cluster). It would require some extra cleverness to
>> work across multiple transactions, and in such a way that an aborted upload
>> doesn't leave partially uploaded data in FDB forever. I have not sat down
>> and designed this yet, hence I would very much like to hear from the
>> community as to which of these paths are sufficient.
>>
>> --
>> Robert Samuel Newson
>> rnew...@apache.org
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
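
[Editor's note: for concreteness, the chunked key/value layout Bob describes in option 2, including the Range-header embellishment, could be sketched roughly as below. This is an illustration only, not the actual implementation: a plain Python dict stands in for the FoundationDB keyspace, and the names `write_attachment`, `read_attachment`, and `read_range` are hypothetical.]

```python
# Sketch of option 2: store an attachment as a series of small key/value
# entries keyed by (database_name, doc_id, attachment_name, chunk_counter),
# then reassemble it with an ordered range scan over the counter suffix.
# A sorted in-memory dict stands in for FDB; a real implementation would
# use FoundationDB transactions and tuple-encoded keys.

CHUNK_SIZE = 16 * 1024  # 16 KB values, matching CouchDB's current chunking


def write_attachment(kv, db_name, doc_id, att_name, data):
    """Split `data` into CHUNK_SIZE pieces under keys (db, doc, att, 0..N)."""
    for offset in range(0, len(data), CHUNK_SIZE):
        key = (db_name, doc_id, att_name, offset // CHUNK_SIZE)
        kv[key] = data[offset:offset + CHUNK_SIZE]


def read_attachment(kv, db_name, doc_id, att_name):
    """Reassemble the whole attachment via a range scan over the prefix."""
    prefix = (db_name, doc_id, att_name)
    chunks = [v for k, v in sorted(kv.items()) if k[:3] == prefix]
    return b"".join(chunks)


def read_range(kv, db_name, doc_id, att_name, first_byte, last_byte):
    """Serve an HTTP Range request by fetching only the covering chunks."""
    first_chunk = first_byte // CHUNK_SIZE
    last_chunk = last_byte // CHUNK_SIZE
    prefix = (db_name, doc_id, att_name)
    data = b"".join(v for k, v in sorted(kv.items())
                    if k[:3] == prefix and first_chunk <= k[3] <= last_chunk)
    base = first_chunk * CHUNK_SIZE
    return data[first_byte - base:last_byte - base + 1]


# Example: a 40 KB attachment becomes three chunks (16 + 16 + 8 KB).
store = {}
blob = b"x" * (40 * 1024)
write_attachment(store, "db1", "doc1", "photo.bin", blob)
assert read_attachment(store, "db1", "doc1", "photo.bin") == blob
assert len(store) == 3
```

Each write or read here would still have to fit in one FDB transaction, so the 10 MB / 5 second limits Bob mentions apply to the attachment as a whole under this scheme; option 3 is what lifts that by spanning multiple transactions.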