_rev values used to be UUIDs and were made deterministic to improve replication performance. I can see that there's a theoretical issue where replication could be inhibited, though I question how practical it is given the internal details of _rev calculation.
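To make the shape of that calculation concrete, here is a rough Python sketch. It is not CouchDB's actual Erlang code: the function name next_rev, the JSON serialization and the order in which the inputs are fed to the hash are my own assumptions for illustration. The point is only that the hash covers the previous revision, the document body and the attachment bytes, and that the number at the front is bumped on every update.

```python
import hashlib
import json


def next_rev(prev_rev, body, attachments=None):
    """Toy deterministic revision id of the form "<N>-<md5 hex>".

    Hypothetical helper for illustration only; CouchDB hashes an
    Erlang term, not JSON, so real _rev values will differ.
    """
    attachments = attachments or {}
    # The number at the front is incremented on each update.
    prev_gen = int(prev_rev.split("-", 1)[0]) if prev_rev else 0

    h = hashlib.md5()
    # The previous revision feeds into the new hash...
    h.update((prev_rev or "").encode("utf-8"))
    # ...together with the document contents...
    h.update(json.dumps(body, sort_keys=True).encode("utf-8"))
    # ...and all the bytes of all attachments.
    for name in sorted(attachments):
        h.update(name.encode("utf-8"))
        h.update(attachments[name])
    return "%d-%s" % (prev_gen + 1, h.hexdigest())


rev1 = next_rev(None, {"name": "Doc"})
rev2 = next_rev(rev1, {"name": "Doc", "note": "changed"})
```

Two machines that apply the same update to the same previous revision get the same rev2, which is what helps replication. An attacker who wants a colliding _rev has to match the hash and the leading number, with the previous revision mixed into the input.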
Remember that the _rev value is derived from the contents of the document, all the bytes of all attachments, and values from previous revisions. Stock MD5 preimage attacks are of a much simpler form (finding a Y such that MD5(Y)=X for some desired X). Also, you would have to arrange for the same number of updates, since the number at the front is incremented on each successful update.

For switching from MD5 to SHA-1, I say no. If we switch, let's use something contemporary like SHA-256. Better yet, let's wait for the winner of the SHA-3 competition.

B.

On 15 November 2011 07:57, Jason Smith <[email protected]> wrote:
> On Tue, Nov 15, 2011 at 7:34 AM, Alex Besogonov
> <[email protected]> wrote:
>>>> Now I make a change to 'Doc' at machine A. This creates a new revid
>>>> with new md5 hash.
>>>> A malicious software somehow learns about this update and creates
>>>> another document
>>>> on machine B, contriving it so to make the resulting hash to be the
>>>> same as on machine A.
>>> Before going any further, you must show why we care about the contents
>>> of machine B.
>>> Why would I log in to machine B if I do not trust B's owner? Why would
>>> I clone your Git repository if I do not know you?
>> The problem is, MD5 hash depends on _untrusted_ data that external
>> processes might put into the database.
>>
>> For example, imagine that machines A and B use CouchDB to store
>> certificates.
>
> I ask again.
>
> --
> Iris Couch
>
