Thomas Mueller created OAK-4810:
-----------------------------------

             Summary: FileDataStore: support SHA-2
                 Key: OAK-4810
                 URL: https://issues.apache.org/jira/browse/OAK-4810
             Project: Jackrabbit Oak
          Issue Type: New Feature
            Reporter: Thomas Mueller


The FileDataStore currently uses SHA-1, but that algorithm is deprecated. We 
should support other algorithms as well (mainly SHA-256).

Migration should be painless (no long downtime). I think the default for 
writing (if not configured explicitly) could still be SHA-1. But when reading, 
SHA-256 should also be supported (selected based on the identifier). That way, 
the new Oak version can be rolled out gradually to all repositories (in a 
cluster with a shared datastore).

After all repositories are running with the new Oak version, the configuration 
for SHA-256 can be enabled. That way, SHA-256 is used for new binaries. Both 
SHA-1 and SHA-256 are supported for reading.
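
A minimal sketch of how reading could pick the algorithm from the identifier, 
assuming identifiers are plain lowercase hex digests (40 characters for SHA-1, 
64 for SHA-256). The class and method names are hypothetical and are not part 
of the existing FileDataStore API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Oak: chooses the digest algorithm for
// reading from the length of a hex-encoded identifier.
final class DigestByLength {

    // SHA-1 yields 20 bytes (40 hex chars), SHA-256 yields 32 bytes (64 hex chars).
    static String algorithmFor(String hexId) {
        switch (hexId.length()) {
            case 40: return "SHA-1";
            case 64: return "SHA-256";
            default: throw new IllegalArgumentException(
                    "Unrecognized identifier length: " + hexId.length());
        }
    }

    // Computes the hex identifier of the content with the given algorithm.
    // For writing, the algorithm would come from configuration (default "SHA-1").
    static String identifierFor(byte[] content, String algorithm) {
        try {
            return toHex(MessageDigest.getInstance(algorithm).digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Verifies content against an identifier of either supported kind.
    static boolean matches(String hexId, byte[] content) {
        return identifierFor(content, algorithmFor(hexId)).equals(hexId);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }
}
```

With this approach an old node that only writes SHA-1 and a new node that 
writes SHA-256 can share one datastore, as long as every node that reads has 
already been upgraded.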

One potential downside is that deduplication would suffer a bit: if a blob 
whose content is already stored under a SHA-1 identifier is added again, the 
digest-based match would fail, because the new identifier is SHA-256. That can 
be mitigated by computing both digests if the need arises. The downsides of 
doing so are some additional file operations and CPU, and a slower migration to 
SHA-256.
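
A rough sketch of the two-digest mitigation, assuming both digests are computed 
in a single pass over the content and that an existing SHA-1 record is reused 
when present. `DualDigest`, `chooseIdentifier` and the `exists` lookup are 
hypothetical names standing in for whatever the datastore would provide.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Predicate;

// Hypothetical helper, not part of Oak: computes SHA-1 and SHA-256 together.
final class DualDigest {
    final String sha1;
    final String sha256;

    private DualDigest(String sha1, String sha256) {
        this.sha1 = sha1;
        this.sha256 = sha256;
    }

    // Reads the stream once and feeds every chunk to both digests.
    static DualDigest of(InputStream in) throws IOException {
        MessageDigest d1 = newDigest("SHA-1");
        MessageDigest d256 = newDigest("SHA-256");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            d1.update(buf, 0, n);
            d256.update(buf, 0, n);
        }
        return new DualDigest(toHex(d1.digest()), toHex(d256.digest()));
    }

    static DualDigest of(byte[] content) {
        try {
            return of(new ByteArrayInputStream(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reuses the SHA-1 identifier if that record already exists (keeps
    // deduplication working); otherwise the new record gets the SHA-256 one.
    static String chooseIdentifier(byte[] content, Predicate<String> exists) {
        DualDigest d = of(content);
        return exists.test(d.sha1) ? d.sha1 : d.sha256;
    }

    private static MessageDigest newDigest(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }
}
```

The extra existence check per write is the "additional file operations" cost, 
and reusing SHA-1 records is what slows the migration: content that keeps being 
re-added never moves to a SHA-256 identifier.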

Some other open questions: 

* While we are at it, it might make sense to additionally support SHA-3 and 
other algorithms (make it configurable). But the length of the identifier alone 
might then not be enough information to know which algorithm is used (SHA-256 
and SHA3-256, for example, both produce 256-bit digests), so maybe add a prefix.

* The number of subdirectory levels: should we keep it as is, or should we 
reduce it (for example, one level less)?
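
Both open questions can be illustrated with a small sketch. The prefix format 
below (a short tag, then "-", then the hex digest) and the tag names are purely 
hypothetical, as is the assumption that each subdirectory level is named after 
two hex characters of the digest. Identifiers without a prefix are treated as 
legacy SHA-1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Oak: parses "<tag>-<hex>" identifiers and
// derives a relative file path with a configurable number of directory levels.
final class PrefixedId {
    // Assumed tag names, mapped to standard JCA algorithm names.
    // "SHA3-256" is available in the JDK since Java 9.
    private static final Map<String, String> TAGS = new HashMap<>();
    static {
        TAGS.put("sha256", "SHA-256");
        TAGS.put("sha3", "SHA3-256");
    }

    final String algorithm;
    final String hex;

    private PrefixedId(String algorithm, String hex) {
        this.algorithm = algorithm;
        this.hex = hex;
    }

    static PrefixedId parse(String id) {
        int sep = id.indexOf('-');
        if (sep < 0) {
            // No prefix: an identifier written before the change.
            return new PrefixedId("SHA-1", id);
        }
        String algorithm = TAGS.get(id.substring(0, sep));
        if (algorithm == null) {
            throw new IllegalArgumentException("Unknown prefix in: " + id);
        }
        return new PrefixedId(algorithm, id.substring(sep + 1));
    }

    // Builds e.g. "ab/cd/ef/abcdef..." for levels = 3, using two hex
    // characters of the digest per directory level.
    static String relativePath(String hex, int levels) {
        if (levels < 0 || hex.length() < levels * 2) {
            throw new IllegalArgumentException("Invalid level count: " + levels);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            sb.append(hex, i * 2, i * 2 + 2).append('/');
        }
        return sb.append(hex).toString();
    }
}
```

Keeping the prefix out of the directory names means the hash bits still spread 
files evenly across subdirectories, whichever algorithm produced them.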




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
