[
https://issues.apache.org/jira/browse/JAMES-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399583#comment-17399583
]
Benoit Tellier commented on JAMES-3150:
---------------------------------------
As described in https://github.com/apache/james-project/pull/594, we propose a
deduplication method that is simpler to implement, targeting mid-size deployments.
The steps would be:
- 1. Provide a way for BlobStores to list blobIds in a bucket
- 2. Provide an interface to list blob references, and implement it for each
entity referencing blobs
- 3. Provide a GenerationAware BlobId factory - generation solves concurrency
issues and can be generated without synchronisation
- 4. Write a simple algorithm leveraging bloom filters performing the GC.
Limits and tweaks of this approach are defined in the ADR.
The doubts I have regarding the previous (unfinished) development efforts are
also listed there. Please note, however, that I believe the two approaches can
co-exist, and that one could choose the algorithm she wishes to apply,
should a benevolent person decide to finish the iteration-based approach that
had been started.
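
To make steps 1 to 4 above more concrete, here is a minimal Python sketch of
how they could fit together. It is not the James implementation (James is
written in Java) and it is not taken from the ADR: the names
GenerationAwareBlobId, GENERATION_SECONDS, current_generation, BloomFilter and
find_garbage, the 30-day generation length, and the rule of skipping the
current and previous generations are all assumptions made for illustration.

```python
import hashlib
import math
from dataclasses import dataclass

# Hypothetical generation length; a real deployment would make this configurable.
GENERATION_SECONDS = 30 * 24 * 3600


@dataclass(frozen=True)
class GenerationAwareBlobId:
    """Blob id tagged with the time bucket (generation) it was written in."""
    generation: int
    digest: str

    def key(self) -> str:
        return f"{self.generation}_{self.digest}"


def current_generation(now_seconds: float) -> int:
    # Derived from the clock alone, so writers need no synchronisation.
    return int(now_seconds // GENERATION_SECONDS)


class BloomFilter:
    """Small Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        n = max(1, expected_items)
        # Standard sizing formulas for the bit count m and hash count k.
        self.size = max(8, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / n * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def find_garbage(stored_blob_ids, referenced_blob_ids, now_generation,
                 expected_references=1000, false_positive_rate=0.01):
    """Return stored blob ids that are old enough and not referenced.

    stored_blob_ids     -- step 1: ids listed from the bucket
    referenced_blob_ids -- step 2: ids listed from every referencing entity
    """
    bloom = BloomFilter(expected_references, false_positive_rate)
    for ref in referenced_blob_ids:
        bloom.add(ref.key())

    garbage = []
    for blob in stored_blob_ids:
        # Assumption: blobs of the current and previous generation are skipped,
        # since a writer may have stored them without persisting the reference yet.
        if blob.generation >= now_generation - 1:
            continue
        # A false positive only keeps an orphan blob until a later run;
        # a referenced blob is never reported, as there are no false negatives.
        if not bloom.might_contain(blob.key()):
            garbage.append(blob)
    return garbage
```

Calling find_garbage with the ids listed from the bucket, the ids listed from
the referencing entities and current_generation(time.time()) would return the
candidates for deletion. The Bloom filter keeps memory use bounded; the price
is that a small fraction of orphan blobs survives each run.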
> Implement Garbage Collection for blobs
> --------------------------------------
>
> Key: JAMES-3150
> URL: https://issues.apache.org/jira/browse/JAMES-3150
> Project: James Server
> Issue Type: Improvement
> Components: Blob
> Affects Versions: 3.3.0
> Reporter: Gautier DI FOLCO
> Priority: Major
>
> With the blob store deduplication, dropping a blob in a distributed
> environment is impossible if we want to keep an acceptable concurrency level.
> A Garbage Collector should be created in order to drop old blobs.