I would like to see a basic “native” attachment provider with the limitations 
described in 2), as well as an “object store” provider targeting the S3 API. I 
think the consistency considerations are tractable if you're comfortable with 
attachments being orphaned in the object store when a transaction fails.
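
Roughly the write ordering I have in mind, as a sketch (boto3 and the Python 
FDB bindings are just for illustration, and the bucket name and key scheme are 
made up):

    import uuid
    import boto3
    import fdb

    fdb.api_version(600)
    db = fdb.open()
    s3 = boto3.client('s3')

    def put_attachment(db_name, doc_id, att_name, data):
        # Upload the blob under a unique key *before* touching FDB.
        obj_key = "%s/%s/%s/%s" % (db_name, doc_id, att_name, uuid.uuid4())
        s3.put_object(Bucket='couch-attachments', Key=obj_key, Body=data)

        # Record the object key as attachment metadata alongside the doc
        # update. If this transaction never commits, the object uploaded
        # above is orphaned in the store; a background sweep could reap
        # keys that no document references.
        @fdb.transactional
        def commit(tr):
            tr[fdb.tuple.pack((db_name, doc_id, att_name))] = obj_key.encode()
        commit(db)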

I had not considered the “just write them on the file system” provider but 
that’s probably partly my cloud-native blinders. I think the main question 
there is redundancy; I would argue against trying to do any sort of replication 
across local disks. Users who happen to have an NFS-style mount point 
accessible to all the CouchDB nodes could use this option reliably, though.

We should calculate a safe maximum attachment size for the native provider; as 
I understand it, the FDB transaction size limit counts both keys and values, so 
our effective attachment size limit will be somewhat smaller than 10 MB.
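
Back-of-the-envelope, assuming the 16K chunk scheme from 2) (the key overhead 
here is a guess; the real number depends on how the chunk keys get packed):

    CHUNK = 16 * 1024             # value bytes per chunk, per Bob's proposal
    KEY_OVERHEAD = 200            # guess at packed key size, in bytes
    TXN_LIMIT = 10 * 1000 * 1000  # FDB transaction size limit

    chunks = TXN_LIMIT // (CHUNK + KEY_OVERHEAD)   # ~600 chunks
    print(chunks * CHUNK / (1024.0 * 1024))        # ~9.4 MiB per transaction

And the document update itself shares that transaction, so the practical cap 
would be a bit lower still.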

Adam

> On Feb 28, 2019, at 6:21 AM, Robert Newson <rnew...@apache.org> wrote:
> 
> Hi,
> 
> Yes, I agree we should have a framework like that. Folks should be able to 
> choose S3 or COS (IBM), etc. 
> 
> I am personally on the hook for the implementation for CouchDB and for IBM 
> Cloudant and expect them to be different, so the framework, IMO, is a given. 
> 
> B. 
> 
>> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
>> 
>> Thanks for getting this started, Bob!
>> 
>> At the risk of derailing this right off the bat: is there a potential 4) 
>> approach where, on the CouchDB side, there is a way to specify “attachment 
>> backends”, one of which could be 2), while others could be “node local file 
>> storage”* or S3-API compatible stores, etc.?
>> 
>> *a bunch of heavy handwaving about how to ensure consistency and fault 
>> tolerance here.
>> 
>> * * *
>> 
>> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3 
>> first.
>> 
>> 
>> * * *
>> 
>> Of 1-3, I think 2 is the most pragmatic: it keeps the desirable 
>> functionality while limiting it enough to remain useful in practice.
>> 
>> I feel strongly about not dropping attachment support. While not ideal in 
>> all cases, it is an extremely useful and reasonably popular feature.
>> 
>> Best
>> Jan
>> —
>> 
>>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org> wrote:
>>> 
>>> Hi All,
>>> 
>>> We've not yet discussed attachments in terms of the FoundationDB work, so 
>>> here's where we do that.
>>> 
>>> Today, CouchDB allows you to store large binary values, held internally as 
>>> a series of much smaller chunks. These "attachments" cannot be indexed; 
>>> they can only be sent and received (you can fetch the whole thing or 
>>> arbitrary ranges of it).
>>> 
>>> On the FDB side, we have a few constraints. A transaction cannot be more 
>>> than 10MB and cannot take more than 5 seconds.
>>> 
>>> Given that, there are a few paths to attachment support going forward:
>>> 
>>> 1) Drop native attachment support. 
>>> 
>>> I suspect this is not going to be a popular approach but it's worth hearing 
>>> a range of views. Instead of direct attachment support, a user could store 
>>> the URL to the large binary content and could simply fetch that URL 
>>> directly.
>>> 
>>> 2) Write attachments into FDB but with limits.
>>> 
>>> The next simplest approach is to write the attachments into FDB as a series 
>>> of key/value entries, where the key is {database_name, doc_id, 
>>> attachment_name, 0..N} and the value is a short byte array (say, 16K, to 
>>> match the current chunk size). The 0..N is just a counter so that we can do 
>>> an fdb range get / iterator to retrieve the attachment. An embellishment 
>>> would restore HTTP Range header support, if we still wanted that 
>>> (disclaimer: I implemented the Range thing many years ago; I'm happy to 
>>> drop support if no one really cares for it in 2019).
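>>> 
>>> A rough sketch of that write/read path (the Python bindings are just for 
>>> illustration; the key layout and 16K chunk size are as described above):
>>> 
>>>     import fdb
>>> 
>>>     fdb.api_version(600)
>>>     db = fdb.open()
>>> 
>>>     CHUNK = 16 * 1024  # the "short byte array" per entry
>>> 
>>>     @fdb.transactional
>>>     def write_attachment(tr, db_name, doc_id, att_name, data):
>>>         # One key/value pair per chunk; the trailing counter lets us
>>>         # range-read the chunks back in order.
>>>         for i in range(0, len(data), CHUNK):
>>>             key = fdb.tuple.pack((db_name, doc_id, att_name, i // CHUNK))
>>>             tr[key] = data[i:i + CHUNK]
>>> 
>>>     @fdb.transactional
>>>     def read_attachment(tr, db_name, doc_id, att_name):
>>>         prefix = fdb.tuple.pack((db_name, doc_id, att_name))
>>>         return b''.join(kv.value for kv in tr.get_range_startswith(prefix))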
>>> 
>>> This would be subject to the 10 MB and 5 second limits, which is less than 
>>> you _can_ do today with attachments but not, in my opinion, any less than 
>>> people actually do (with some notable outliers like npm in the past).
>>> 
>>> 3) Full functionality
>>> 
>>> This would be the same as today: attachments of arbitrary size (up to the 
>>> disk capacity of the fdb cluster). It would require some extra cleverness 
>>> to work across multiple transactions, and in such a way that an aborted 
>>> upload doesn't leave partially uploaded data in fdb forever. I have not sat 
>>> down and designed this yet, hence I would very much like to hear from the 
>>> community as to which of these paths is sufficient.
>>> 
>>> -- 
>>> Robert Samuel Newson
>>> rnew...@apache.org
>> 
>> -- 
>> Professional Support for Apache CouchDB:
>> https://neighbourhood.ie/couchdb-support/
> 
