Benoit Tellier created JAMES-3148:
-------------------------------------

             Summary: Cassandra mailbox deletion cleanup
                 Key: JAMES-3148
                 URL: https://issues.apache.org/jira/browse/JAMES-3148
             Project: James Server
          Issue Type: New Feature
          Components: cassandra, mailbox
    Affects Versions: 3.5.0
            Reporter: Benoit Tellier
             Fix For: 3.6.0


Cassandra is used within the distributed James product to hold message and 
mailbox metadata.

Cassandra holds the following tables:
 - mailboxPathV2 + mailbox allow retrieving mailbox information
 - acl + UserMailboxACL hold denormalized information
 - messageIdTable & imapUidTable allow retrieving mailbox context information
 - messageV2 table holds message metadata
 - attachmentV2 holds attachments for messages
 - References to these attachments are contained within the attachmentOwner and 
attachmentMessageId tables
 
Currently, deletion only removes the first level of metadata. Lower-level 
metadata becomes unreachable but is not removed. The data looks deleted but 
references are actually still present.

Concretely:
 - Upon mailbox deletion, only mailboxPathV2 & mailbox content is deleted. 
messageIdTable, imapUidTable, messageV2, attachmentV2 & attachmentMessageId 
metadata is left undeleted.
 - Upon mailbox deletion, acl + UserMailboxACL is not deleted.
 - Upon message deletion, only messageIdTable & imapUidTable content is 
deleted. messageV2, attachmentV2 & attachmentMessageId metadata is left 
undeleted.

This jeopardizes efforts to regain disk space and privacy, for example through 
blobStore garbage collection.

We need to clean up Cassandra metadata. It can be retrieved from dangling 
metadata after the delete operation has been carried out. We need to delete 
the lower levels first so that upon failures undeleted metadata can still be 
reached.
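
The "lower levels first" ordering can be illustrated with a minimal sketch. 
It is not James code: the class, its fields and the failOnAttachment test 
hook are hypothetical, and plain in-memory maps stand in for the Cassandra 
tables. It only shows that a parent row is removed after its children, so a 
failed run leaves the remaining children reachable for a retry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the Cassandra tables; not James code.
class CleanupOrderSketch {
    // mailboxId -> messageIds (plays the role of messageIdTable / imapUidTable)
    final Map<String, Set<String>> mailboxToMessages = new HashMap<>();
    // messageId -> attachmentIds (plays the role of messageV2)
    final Map<String, Set<String>> messageToAttachments = new HashMap<>();
    // attachmentIds (plays the role of attachmentV2)
    final Set<String> attachments = new HashSet<>();
    // Test hook: deleting this attachment fails once, then succeeds.
    String failOnAttachment = null;

    void cleanupMailbox(String mailboxId) {
        Set<String> messageIds =
            mailboxToMessages.getOrDefault(mailboxId, new HashSet<>());
        for (String messageId : new ArrayList<>(messageIds)) {
            cleanupMessage(messageId);     // lower level first
            messageIds.remove(messageId);  // then the reference to it
        }
        // The top-level entry goes last, once nothing hangs below it.
        mailboxToMessages.remove(mailboxId);
    }

    void cleanupMessage(String messageId) {
        Set<String> attachmentIds =
            messageToAttachments.getOrDefault(messageId, new HashSet<>());
        for (String attachmentId : new ArrayList<>(attachmentIds)) {
            if (attachmentId.equals(failOnAttachment)) {
                failOnAttachment = null;
                throw new IllegalStateException("simulated failure");
            }
            attachments.remove(attachmentId);
            attachmentIds.remove(attachmentId);
        }
        messageToAttachments.remove(messageId);
    }
}
```

If cleanupMailbox fails part-way, the mailbox entry and the references to 
the undeleted messages and attachments are still present, so calling it 
again finishes the job.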

This cleanup is not needed for strict correctness from a MailboxManager point 
of view, thus it could be carried out asynchronously, via mailbox listeners, 
so that it can be retried.

Mailbox listener failures lead to the eventBus retrying their execution, so 
we need to ensure that the deletion is idempotent. This might have 
consequences on the blobStore garbage collection design.
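
A minimal sketch of why idempotence matters under retries, under stated 
assumptions: deliverWithRetries is a hypothetical stand-in for eventBus 
redelivery, not the James API, and a Set stands in for a Cassandra table. 
Removing an absent row is a no-op, so replaying the same event after a 
failure leaves the same final state.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical listener and retry loop; names are illustrative only.
class IdempotentListenerSketch {
    final Set<String> messageRows = new HashSet<>();
    // Test hook: number of deliveries that fail after doing their work.
    int failuresLeft = 0;

    // Idempotent handler: deleting an already deleted row changes nothing.
    void onMessageDeleted(String messageId) {
        messageRows.remove(messageId);
        if (failuresLeft > 0) {
            failuresLeft--;
            throw new IllegalStateException("simulated failure after the delete");
        }
    }

    // Redelivers the same event until the listener succeeds.
    // Returns the number of attempts used, or -1 if all attempts failed.
    static <T> int deliverWithRetries(Consumer<T> listener, T event, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                listener.accept(event);
                return attempt;
            } catch (RuntimeException e) {
                // Swallow and redeliver the same event.
            }
        }
        return -1;
    }
}
```

A handler that, for instance, decremented a reference counter on each 
delivery would not be safe to replay this way, which is one reason the 
retry behaviour may constrain the blobStore garbage collection design.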



