[ 
https://issues.apache.org/jira/browse/OAK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381740#comment-16381740
 ] 

Thomas Mueller commented on OAK-5272:
-------------------------------------

To make it easier to migrate to other hashing algorithms, and to allow 
identifiers that are not content hashes, I think it makes sense to further 
extend the API (maybe just the internal API), for example as follows:

{noformat}
/**
 * For this binary, returns the map of all known content hashes
 * and CRC codes, together with the hash algorithm used.
 * This can save an application from having to call getStream()
 * and calculate the CRC / content hash itself.
 * The returned map can be empty if the implementation would have to
 * calculate the values.
 * If not empty, then the map contains one entry for each CRC / content hash
 * already calculated.
 * The value is always hex-encoded (lowercase) without spaces.
 * For example, it could return a "CRC32" (if known), "SHA-1" (if known),
 * "SHA-256", and so on.
 */
Map<String, String> getKnownContentHashes();
{noformat}
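
To illustrate the value format described in the Javadoc (lowercase hex, no 
spaces, one entry per algorithm), here is a rough sketch. The class 
KnownHashesSketch and its methods are made up for this example and are not 
part of Oak; a real implementation would return only values it already has, 
rather than computing them from the content as done here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Oak: it only demonstrates the proposed
// key / value format of the map.
final class KnownHashesSketch {

    // Lowercase hex without spaces, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }

    // Builds a map in the proposed format: algorithm name -> hex value.
    static Map<String, String> hashesOf(byte[] content) {
        Map<String, String> result = new LinkedHashMap<>();
        CRC32 crc = new CRC32();
        crc.update(content);
        // CRC32 is 32 bits wide, so pad to 8 hex digits.
        result.put("CRC32", String.format("%08x", crc.getValue()));
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            result.put("SHA-256", toHex(sha256.digest(content)));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform; treat as a bug.
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

An implementation that only knows, say, the SHA-256 would simply leave the 
"CRC32" entry out.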

For example, Amazon S3 seems to calculate the MD5 and provide it as the ETag. 
While MD5 isn't secure, it can be used in the same way as the CRC32, to say 
whether two binaries are different for sure, or possibly the same.
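
The "different for sure, or possibly the same" check could look roughly like 
the following. This is only a sketch: HashCompareSketch, Result and compare() 
are hypothetical names, and the two maps are assumed to be what 
getKnownContentHashes() would return for each binary.

```java
import java.util.Map;

// Hypothetical helper, not part of Oak.
final class HashCompareSketch {

    enum Result { DIFFERENT, POSSIBLY_SAME, UNKNOWN }

    // Compares two maps of algorithm name -> lowercase hex value.
    static Result compare(Map<String, String> a, Map<String, String> b) {
        boolean shared = false;
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other == null) {
                continue; // algorithm not known for the second binary
            }
            if (!other.equals(e.getValue())) {
                // Same algorithm, different value: the content differs.
                return Result.DIFFERENT;
            }
            shared = true;
        }
        // Matching values only mean "possibly the same"; with no algorithm
        // in common, nothing can be said without reading the streams.
        return shared ? Result.POSSIBLY_SAME : Result.UNKNOWN;
    }
}
```

With a weak code such as CRC32 or MD5, POSSIBLY_SAME would still need a 
stronger hash or a byte-by-byte comparison if certainty is required.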

> Expose BlobStore API to provide information whether blob id is content hashed
> -----------------------------------------------------------------------------
>
>                 Key: OAK-5272
>                 URL: https://issues.apache.org/jira/browse/OAK-5272
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: blob
>            Reporter: Amit Jain
>            Priority: Major
>             Fix For: 1.10
>
>
> As per discussion in OAK-5253 it's better to have some information from the 
> BlobStore(s) whether the blob id can be solely relied upon for comparison.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
