[ https://issues.apache.org/jira/browse/OAK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355158#comment-16355158 ]
Thomas Mueller commented on OAK-5272:
-------------------------------------
[~amitjain] what about the case where one blob has a SHA-1 content hash, and
the other has a SHA-256 content hash?
The content hash is different, but the content could still be the same.
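To make the concern concrete, here is a minimal sketch using only the JDK's MessageDigest. The class and method names (HashLengthDemo, hexDigest) are made up for illustration and are not part of Oak. It hashes the same bytes with both algorithms and shows that the resulting ids differ in value and in length:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not Oak code.
class HashLengthDemo {

    /** Hex-encodes the digest of data under the given JCA algorithm name. */
    public static String hexDigest(String algorithm, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 and SHA-256 are required on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "same content".getBytes(StandardCharsets.UTF_8);
        String sha1 = hexDigest("SHA-1", content);
        String sha256 = hexDigest("SHA-256", content);
        System.out.println(sha1.length());       // 40 hex characters (160 bits)
        System.out.println(sha256.length());     // 64 hex characters (256 bits)
        System.out.println(sha1.equals(sha256)); // false, although the content is identical
    }
}
```

So comparing ids of different length can never prove that the content differs; at best the result is "unknown".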
> currently the BlobStore(s) are not aware of the Blob object.
Are blobs aware of the blob store? If yes, what about adding a method "compare
content" to the blob, something like this:
{noformat}
public enum Equality { EQUALS, DIFFERENT, UNKNOWN };

public Equality compareContent(Blob other) {
    if (this == other) {
        return Equality.EQUALS;
    } else if (other == null) {
        return Equality.DIFFERENT;
    }
    if (length() != other.length()) {
        return Equality.DIFFERENT;
    }
    if (!blobStore.hasContentAdressableBlobIds()) {
        return Equality.UNKNOWN;
    }
    // TODO is strict type check needed, or is "instanceof" sufficient?
    if (other.getClass() == getClass()) {
        BlobStoreBlob otherBlob = (BlobStoreBlob) other;
        if (!otherBlob.blobStore.hasContentAdressableBlobIds()) {
            return Equality.UNKNOWN;
        }
        // TODO maybe blobId contains the length? in this case, truncate that part
        if (otherBlob.blobId.length() != blobId.length()) {
            return Equality.UNKNOWN;
        }
        return blobId.equals(otherBlob.blobId) ? Equality.EQUALS : Equality.DIFFERENT;
    }
    return Equality.UNKNOWN;
}
{noformat}
I know, many cases...
Your method is still needed, but we would need to extend the description a bit:
{noformat}
/**
 * Returns true if blob ids are generated from the content hash.
 * Content hashes of the same length can be used for equality checks
 * (content hashes of different length are generated with different
 * algorithms).
 *
 * @return true if blobs are content addressable
 */
boolean hasContentAdressableBlobIds();
{noformat}
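For illustration, a caller could treat the three-valued result as a cheap hint and fall back to reading the bytes only when it is UNKNOWN. This is a rough sketch under assumptions: CompareFallbackDemo, contentEquals and sameBytes are hypothetical names, the enum is redeclared so the example is self-contained, and plain InputStreams stand in for whatever stream the blob exposes:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not Oak code.
class CompareFallbackDemo {

    // Redeclared here only to keep the sketch self-contained.
    public enum Equality { EQUALS, DIFFERENT, UNKNOWN }

    /** Byte-by-byte comparison; a real implementation would read in buffers. */
    public static boolean sameBytes(InputStream a, InputStream b) throws IOException {
        int x;
        do {
            x = a.read();
            if (x != b.read()) {
                return false;
            }
        } while (x != -1);
        return true;
    }

    /** Uses the cheap hint when it is conclusive, otherwise compares the streams. */
    public static boolean contentEquals(Equality hint, InputStream a, InputStream b) {
        switch (hint) {
            case EQUALS:
                return true;
            case DIFFERENT:
                return false;
            default:
                try {
                    return sameBytes(a, b);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
        }
    }

    public static void main(String[] args) {
        InputStream a = new ByteArrayInputStream(new byte[] {1, 2, 3});
        InputStream b = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(contentEquals(Equality.UNKNOWN, a, b));
    }
}
```

The streams are only read in the UNKNOWN case, so the expensive path is avoided whenever both ids are content hashes of the same length.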
> Expose BlobStore API to provide information whether blob id is content hashed
> -----------------------------------------------------------------------------
>
> Key: OAK-5272
> URL: https://issues.apache.org/jira/browse/OAK-5272
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: blob
> Reporter: Amit Jain
> Priority: Major
>
> As per discussion in OAK-5253 it's better to have some information from the
> BlobStore(s) whether the blob id can be solely relied upon for comparison.