chibenwa commented on code in PR #2894:
URL: https://github.com/apache/james-project/pull/2894#discussion_r2659913494

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated

Review Comment:
   What is the deleted message vault?

   Referencing https://github.com/apache/james-project/blob/master/src/adr/0075-deleted-message-vault.md seems a good way forward.

   (It is referenced below but let's make it obvious early with a link)

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.

Review Comment:
   I would rather speak of "purge" than GC as the underlying mechanism is based on expiration and not on external references removal.

   Sorry for being picky.

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.
+

Review Comment:
   ```suggestion
   # Alternatives

   Specific James implementations could overload an unchanged deleted message vault and provide their own; however, we believe the problem and complexity of operating atop multiple buckets is detrimental for others in the community, for minimal to no gains.
   ```

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:

Review Comment:
   ```suggestion
   Using a single bucket for storing deleted messages instead. The objects in the single bucket would be following this name pattern:
   ```

   Please use a neutral tone. I do not know what `!` is used for, but it has no place in such a formal document. Let's keep this for coffee machine discussions ;-)

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.

Review Comment:
   Interestingly, there's also an access control point here.

   James no longer requires the right to create buckets at runtime when the Deleted Message Vault is enabled.

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.

Review Comment:
   We need to mention the need for a migration.

   We need to detail how this migration is to be done.

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.

Review Comment:
   Please mention that some providers, e.g. OVH, put limits on the number of buckets per account.

##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!

Review Comment:
   ```suggestion
   - easier to maintain, only one bucket
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
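
To make the naming change discussed in this review concrete, here is a minimal Java sketch, not taken from the PR, of how an old per-month bucket name could map onto the new object-name prefix, and how a purge could decide that a month has expired. The class and method names (`VaultNaming`, `parseOldBucket`, `objectName`, `isExpired`) are hypothetical, zero-padded months are assumed for both patterns, and the expiry rule (a month is purgeable once its end is older than the retention period) is an assumption about the intended semantics rather than the James implementation.

```java
import java.time.Duration;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of James.
final class VaultNaming {
    // Old layout: one bucket per month, named deleted-messages-[year]-[month]-01.
    // Assumption: the month is zero-padded (01..12).
    private static final Pattern OLD_BUCKET =
        Pattern.compile("deleted-messages-(\\d{4})-(0[1-9]|1[0-2])-01");

    // Extracts the year and month from an old bucket name, if it matches the pattern.
    static Optional<YearMonth> parseOldBucket(String bucketName) {
        Matcher matcher = OLD_BUCKET.matcher(bucketName);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        int year = Integer.parseInt(matcher.group(1));
        int month = Integer.parseInt(matcher.group(2));
        return Optional.of(YearMonth.of(year, month));
    }

    // New layout: a single bucket, objects named [year]/[month]/[blob_id].
    static String objectName(YearMonth month, String blobId) {
        return String.format(Locale.ROOT, "%04d/%02d/%s",
            month.getYear(), month.getMonthValue(), blobId);
    }

    // Assumed purge rule: a month expires once its last instant is older
    // than (now - retention).
    static boolean isExpired(YearMonth month, ZonedDateTime now, Duration retention) {
        ZonedDateTime endOfMonth = month.plusMonths(1).atDay(1).atStartOfDay(ZoneOffset.UTC);
        return endOfMonth.isBefore(now.minus(retention));
    }
}
```

Under these assumptions, a migration could walk each old bucket, derive the `YearMonth` with `parseOldBucket`, and copy every blob to `objectName(month, blobId)` in the single bucket, while the purge would list objects by the `[year]/[month]/` prefix and delete those for which `isExpired` holds.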
