Chetan Mehrotra commented on OAK-4810:

bq. I think default for writing (if not configured explicitly) could still be 

The change can be made at any time and should not affect other parts much, so 
the default value can simply be switched to SHA-256.

Once a binary has been added with any digest method, we do not need the method 
details when reading, because reads are based purely on the id. Still, it would 
be good to encode the algorithm in the id that is passed back to the NodeStore.
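
As a rough illustration of encoding the algorithm in the id, here is a minimal 
sketch. The class name DigestId, the "sha1"/"sha256" prefixes and the ":" 
separator are all hypothetical and are not Oak's actual identifier format. 
Treating an id without a prefix as SHA-1 is likewise an assumption about how 
existing ids might be handled.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Oak: builds and parses ids of the
// assumed form "<prefix>:<hex digest>".
class DigestId {

    // Map a JCA algorithm name to the (assumed) id prefix.
    static String prefixFor(String algorithm) {
        switch (algorithm) {
            case "SHA-1": return "sha1";
            case "SHA-256": return "sha256";
            default:
                throw new IllegalArgumentException("Unsupported algorithm: " + algorithm);
        }
    }

    // Lower-case hex encoding of a byte array.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Digest the content and prepend the algorithm prefix.
    static String createId(byte[] content, String algorithm) {
        String prefix = prefixFor(algorithm);
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            return prefix + ":" + toHex(md.digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recover the algorithm from an id. Ids without a prefix are assumed
    // to be legacy SHA-1 ids (an assumption made for this sketch).
    static String algorithmOf(String id) {
        int sep = id.indexOf(':');
        if (sep < 0) {
            return "SHA-1";
        }
        String prefix = id.substring(0, sep);
        switch (prefix) {
            case "sha1": return "SHA-1";
            case "sha256": return "SHA-256";
            default:
                throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
    }
}
```

With something like this, a read never needs the configured write algorithm: 
the id carries it, and the writer's default can change without touching 
readers.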

> FileDataStore: support SHA-2
> ----------------------------
>                 Key: OAK-4810
>                 URL: https://issues.apache.org/jira/browse/OAK-4810
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob
>            Reporter: Thomas Mueller
> The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We 
> should support other algorithms as well (mainly SHA-256).
> Migration should be painless (no long downtime). I think default for writing 
> (if not configured explicitly) could still be SHA-1. But when reading, 
> SHA-256 should also be supported (depending on the identifier). That way, the 
> new Oak version for all repositories (in a cluster + shared datastore) can be 
> installed "slowly".
> After all repositories are running with the new Oak version, the 
> configuration for SHA-256 can be enabled. That way, SHA-256 is used for new 
> binaries. Both SHA-1 and SHA-256 are supported for reading.
> One potential downside is that deduplication would suffer a bit if a new blob 
> with the same content is added again, as the digest-based match would fail. 
> That can be mitigated by computing both types of digest if the need arises. 
> The downsides are some additional file operations and CPU, and slower 
> migration to SHA-256.
> Some other open questions: 
> * While we are at it, it might make sense to additionally support SHA-3 and 
> other algorithms (make it configurable). But the length of the identifier 
> alone might then not be enough information to know what algorithm is used, so 
> maybe add a prefix.
> * The number of subdirectory levels: should we keep it as is, or should we 
> reduce it (for example one level less).
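
The two open questions above can be sketched in a few lines. This is only an 
illustration: the class IdLayout and its methods are hypothetical, and the 
layout (each directory level named after two hex characters of the identifier) 
is an assumption for the example rather than a statement about the current 
implementation. The length lookup relies only on digest sizes: SHA-1 yields 40 
hex characters and SHA-256 yields 64, but SHA3-256 also yields 64, which is 
why length alone stops being enough once more algorithms are allowed.

```java
// Hypothetical helper, not part of Oak.
class IdLayout {

    // Guess the algorithm from the hex identifier length alone.
    // 64 characters is ambiguous as soon as SHA3-256 is also supported,
    // since it produces a digest of the same size as SHA-256.
    static String algorithmForLength(String hexId) {
        switch (hexId.length()) {
            case 40: return "SHA-1";
            case 64: return "SHA-256";
            default:
                throw new IllegalArgumentException(
                        "Unexpected identifier length: " + hexId.length());
        }
    }

    // Build a relative path with a configurable number of directory
    // levels, each taken from two hex characters of the identifier
    // (assumed layout for this sketch).
    static String relativePath(String hexId, int levels) {
        if (levels < 0 || hexId.length() < levels * 2) {
            throw new IllegalArgumentException("Invalid level count: " + levels);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            sb.append(hexId, i * 2, i * 2 + 2).append('/');
        }
        return sb.append(hexId).toString();
    }
}
```

Keeping the level count a parameter means that "one level less" is a 
configuration change in such a sketch, although existing files would still 
have to be found under the old layout.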

