[ https://issues.apache.org/jira/browse/JAMES-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315415#comment-17315415 ]
Benoit Tellier commented on JAMES-3544:
---------------------------------------

> There is an API proposal for blob deletion:
> https://tools.ietf.org/html/draft-gondwana-jmap-blob-01

I am watching this draft; once it is a bit more formal, it will definitely be
nice to have.

However, I do believe it is not enough. Pushing such a privacy concern to the
client seems a weak way to enforce it, hence I did not mention it here.

> JMAP uploaded blobs are never deleted
> -------------------------------------
>
>                 Key: JAMES-3544
>                 URL: https://issues.apache.org/jira/browse/JAMES-3544
>             Project: James Server
>          Issue Type: Sub-task
>          Components: Blob, JMAP
>   Affects Versions: 3.6.0
>            Reporter: Benoit Tellier
>            Assignee: Antoine Duprat
>            Priority: Major
>
> This is a concern for both privacy and cost control (as one needs to pay for
> storage).
> JMAP provides no method to delete uploaded blobs (maybe I could propose
> something at the IETF).
> https://jmap.io/spec-core.html#uploading-binary-data suggests that the server
> might decide to delete the data.
> {code:java}
> Under rare circumstances, the server may have deleted the blob before the
> client uses it;
> the client should keep a reference to the local file so it can upload it
> again in such a situation.
> {code}
> *Root cause of the issue*
> We rely on the AttachmentManager for uploads, which is inherited from JMAP
> draft.
> AttachmentManager uses the following fallback right mechanism:
>  - First, see if the user accessing the content holds a message referencing
> that attachment.
>  - If not, check whether that user uploaded the attachment.
> AttachmentManager holds some data referenced by user messages, thus automatic
> deletion without a clear separation of concepts looks scary...
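To make the fallback right mechanism above concrete, here is a minimal sketch of the two-step check. It is not the real AttachmentManager code: the class `AttachmentRightCheck`, its two in-memory maps and the `canAccess` method are hypothetical names used only for illustration. It shows why blobs reachable through a message and blobs reachable only through the uploader path live side by side, which is what makes blind automatic deletion risky.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the two-step right check.
// Names do not mirror the real AttachmentManager / AttachmentMapper API.
final class AttachmentRightCheck {
    // attachment id -> users holding a message that references it
    private final Map<String, Set<String>> messageHolders;
    // attachment id -> users who uploaded it
    private final Map<String, Set<String>> uploaders;

    AttachmentRightCheck(Map<String, Set<String>> messageHolders,
                         Map<String, Set<String>> uploaders) {
        this.messageHolders = messageHolders;
        this.uploaders = uploaders;
    }

    boolean canAccess(String user, String attachmentId) {
        // Step 1: the user holds a message referencing the attachment.
        if (messageHolders.getOrDefault(attachmentId, Collections.emptySet()).contains(user)) {
            return true;
        }
        // Step 2 (fallback): the user uploaded the attachment.
        return uploaders.getOrDefault(attachmentId, Collections.emptySet()).contains(user);
    }

    public static void main(String[] args) {
        AttachmentRightCheck check = new AttachmentRightCheck(
            Collections.singletonMap("att-1", Collections.singleton("alice")),
            Collections.singletonMap("att-2", Collections.singleton("bob")));
        System.out.println(check.canAccess("alice", "att-1"));
        System.out.println(check.canAccess("bob", "att-1"));
    }
}
```

In this model, deleting "att-2" only affects an upload, while deleting "att-1" would break a stored message; a single store cannot tell the two apart without extra bookkeeping.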
> *How:*
> We should deprecate the following AttachmentMapper methods (and the
> underlying storage code), and simplify the AttachmentManager code accordingly:
> {code:java}
> public interface AttachmentMapper extends Mapper {
>     // Both methods below are to be deprecated
>     Publisher<AttachmentMetadata> storeAttachmentForOwner(ContentType contentType,
>         InputStream attachmentContent, Username owner);
>
>     Collection<Username> getOwners(AttachmentId attachmentId) throws MailboxException;
> }
> {code}
> We should write an UploadedContentRepository, holding only the content, the
> content-type, the owner and the size of the data. The upload date can be
> useful too, even if not requested by JMAP APIs. It would be backed by the
> BlobStore (and thus ObjectStorage); we will also need a metadata system on
> top of it (Cassandra).
> Data expiry would be achieved via bucket deletion: all data uploaded in a
> given month is held in one bucket, and at month+2 the bucket can be dropped,
> in order to ensure no data younger than a month is deleted. We can likely
> accept dangling metadata as no critical data is held there (user, size,
> content type). If needed, a scan could come and clean up expired metadata,
> but it might be expensive to run.
> A webAdmin endpoint would trigger the cleanup, and we would rely on an
> external scheduler to call it.
> We follow a similar design for the DeletedMessageVault
> (https://issues.apache.org/jira/browse/JAMES-2811).
> I bet my team could be working on this topic, but we do not have a plan for
> this just yet.
> *Impact*
>  - Blobs uploaded before this proposed change will be accessible via the
> AttachmentManager uploader right path (before its deletion), and
> inaccessible after.
>  - Cleanup of blob content uploaded before this change gets applied is a
> non-goal of my proposal. A separate batch could be used, reading Cassandra
> data and deleting uploaded blobs. A task could maybe even be exposed for such
> needs...
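The monthly bucket expiry described under *How* can be sketched as below. This is only an illustration under stated assumptions: the class `UploadBucketPolicy`, the `uploads-` bucket name prefix and the method names are hypothetical and not part of James. The clock is injected so that a test can fix the current date.

```java
import java.time.Clock;
import java.time.YearMonth;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: one bucket per upload month, droppable from month+2 onward.
final class UploadBucketPolicy {
    private static final String PREFIX = "uploads-"; // assumed naming scheme

    private final Clock clock;

    UploadBucketPolicy(Clock clock) {
        this.clock = clock;
    }

    YearMonth currentMonth() {
        // Uses the time zone carried by the injected clock.
        return YearMonth.now(clock);
    }

    String bucketFor(YearMonth month) {
        // YearMonth.toString() yields the ISO form, for example "2021-04".
        return PREFIX + month;
    }

    // A bucket holding uploads of month M may be dropped once the current
    // month is M+2 or later, so every blob in it is at least one month old.
    boolean isExpired(YearMonth bucketMonth) {
        return !currentMonth().isBefore(bucketMonth.plusMonths(2));
    }

    List<String> expiredBuckets(List<YearMonth> existingMonths) {
        return existingMonths.stream()
            .filter(this::isExpired)
            .map(this::bucketFor)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        UploadBucketPolicy policy = new UploadBucketPolicy(Clock.systemUTC());
        System.out.println("Uploads currently go to " + policy.bucketFor(policy.currentMonth()));
    }
}
```

With a clock fixed in April 2021, the February bucket is reported as expired while the March and April buckets are kept. A cleanup task triggered through webAdmin could call something like `expiredBuckets` and drop each returned bucket.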
> *Definition of done*
> Demonstrate data expiry in an integration test, playing with a mocked clock
> injected via Guice.
> Documentation needs to be written so that admins do not forget to schedule
> the cleanup task.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)