[ 
https://issues.apache.org/jira/browse/COUCHDB-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037277#comment-13037277
 ] 

Randall Leeds commented on COUCHDB-687:
---------------------------------------

+1 as well for having the digest_type in the metadata.

I'd also like to suggest that we include the encoding/compression in the 
metadata for an attachment. This matters for deterministic revision 
generation, since the md5 of the attachment as stored on disk is used when 
generating revisions.

For purity of API and transparency of revision generation, I think it's 
unacceptable for CouchDB to compute a revision ID from the digest of an 
attachment that was compressed server-side. A client needs to be able to 
expect that uploading an attachment to identical documents in two different 
couches, using the same encoding, results in the same revision. If a client 
wants to pre-compress an attachment before uploading it, and informs couch of 
the compression using headers, it should be fine for couch to calculate a 
digest over the compressed version (and use that in the revision generation), 
as long as the compression/encoding format on which the digest is based is 
exposed as well.
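
To make the concern concrete, here is a minimal Python sketch. It is not 
CouchDB code: the two zlib compression levels merely stand in for two 
couches whose server-side compression settings differ, and every name in it 
is hypothetical. It shows that a digest of the bytes as provided is the same 
everywhere, while a digest of the compressed bytes is not.

```python
import hashlib
import zlib


def md5_hex(data: bytes) -> str:
    """Return the hex md5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# The attachment body exactly as the client uploads it (illustrative data).
body = b"function hello() { return 'world'; }\n" * 100

# Assumption: two servers compress the same upload with different settings.
stored_a = zlib.compress(body, 1)  # hypothetical couch A
stored_b = zlib.compress(body, 9)  # hypothetical couch B

# Digest of the data as provided by the client: identical on both couches.
digest_as_provided_a = md5_hex(body)
digest_as_provided_b = md5_hex(body)

# Digest of the server-compressed bytes: depends on each server's settings.
digest_on_disk_a = md5_hex(stored_a)
digest_on_disk_b = md5_hex(stored_b)

print(digest_as_provided_a == digest_as_provided_b)  # True
print(digest_on_disk_a == digest_on_disk_b)          # False
```

Both compressed copies decompress to the same body, so a revision derived 
from the on-disk digest would differ between the two couches even though the 
client sent identical data.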

tl;dr -- CouchDB needs to be transparent in how it's creating revision 
identifiers. It should NEVER use a digest generated *after* server-side 
compression to calculate a revision hash. It MUST calculate the revision from 
the data _as provided_ by the client. These are the considerations I have in 
mind as I approach this patch.

> Add md5 hash to _attachments properties for documents
> -----------------------------------------------------
>
>                 Key: COUCHDB-687
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-687
>             Project: CouchDB
>          Issue Type: Improvement
>         Environment: CouchDB
>            Reporter: mikeal
>            Assignee: Filipe Manana
>         Attachments: couchdb-md5-in-attachment-COUCHDB-687-v2.patch, 
> couchdb-md5-in-attachment-COUCHDB-687-v3.patch, 
> couchdb-md5-in-attachment-COUCHDB-687.patch, md5.patch
>
>
> The current attachment information looks like this:
> GET /dbname/docid
> "_attachments": {
>       "jquery-1.4.1.min.js": {
>           "content_type": "text/javascript",
>           "revpos": 138,
>           "length": 70844,
>           "stub": true
>       }
> }
> If a client wanted to sync local files as attachments with a document it 
> would not currently be able to do so without keeping a local store of the 
> revpos. If this information included an md5 hash of the attachment clients 
> could compare it against a hash of the local file to see if they match.
> -Mikeal
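
The client-side comparison described in the issue could look roughly like 
the Python sketch below. The helper names are hypothetical, and it assumes 
the attachment metadata exposes the md5 as a plain hex string; the actual 
field name and encoding are up to the patch.

```python
import hashlib


def local_md5_hex(path, chunk_size=65536):
    """Hash a local file in chunks so large attachments need not fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_upload(path, remote_md5_hex):
    """Return True when the local file differs from the stored attachment.

    remote_md5_hex is assumed to be the hex md5 taken from the attachment
    metadata (hypothetical format).
    """
    return local_md5_hex(path) != remote_md5_hex.lower()
```

This comparison is only meaningful if the exposed digest covers the same 
bytes the client has locally, i.e. the attachment before any server-side 
compression.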

