Re: [PR] JAMES-4156 ADR: Deleted message vault single bucket usage [james-project]

via GitHub Sun, 04 Jan 2026 12:45:23 -0800


chibenwa commented on code in PR #2894:
URL: https://github.com/apache/james-project/pull/2894#discussion_r2659913494



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated

Review Comment:
   What is the deleted message vault? Referencing 
https://github.com/apache/james-project/blob/master/src/adr/0075-deleted-message-vault.md
 seems a good way forward.
   
   (It is referenced below but let's make it obvious early with a link)



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.

Review Comment:
   I would rather speak of "purge" than GC, as the underlying mechanism is based 
on expiration and not on the removal of external references. Sorry for being picky.
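
To make the distinction concrete, here is a minimal Python sketch (not James code) of an expiration-based purge decision over the bucket names quoted above. The names `month_start`, `next_month_start` and `expired_buckets` are hypothetical, and the rule that a bucket expires once the whole month it covers is older than the retention period is an assumption for illustration. Only dates are consulted and no references are tracked, which is why "purge" fits better than GC.

```python
from datetime import date, timedelta

# Assumed prefix, taken from the pattern `deleted-messages-[year]-[month]-01`.
BUCKET_PREFIX = "deleted-messages-"


def month_start(bucket_name: str) -> date:
    """Parse 'deleted-messages-2025-03-01' into date(2025, 3, 1)."""
    year, month, _day = bucket_name[len(BUCKET_PREFIX):].split("-")
    return date(int(year), int(month), 1)


def next_month_start(d: date) -> date:
    """First day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def expired_buckets(bucket_names, today: date, retention: timedelta):
    """Return the buckets whose whole month ended on or before today - retention.

    Assumption: a bucket is purged only once every message it can contain
    is older than the retention period.
    """
    cutoff = today - retention
    return [
        name
        for name in bucket_names
        if name.startswith(BUCKET_PREFIX)
        and next_month_start(month_start(name)) <= cutoff
    ]
```

The decision depends only on the bucket name and the clock, so unrelated buckets are ignored and nothing has to be scanned for dangling references.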



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the 
single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year 
and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that 
would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for 
read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on 
both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.
+

Review Comment:
   ```suggestion
   # Alternatives
   
   Specific James implementations could overload an unchanged deleted message 
vault and provide their own; however, we believe the problem and complexity of 
operating atop multiple buckets is detrimental to others in the community for 
minimal to no gain.
   ```



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the 
single bucket would be following this name pattern:

Review Comment:
   ```suggestion
   Using a single bucket for storing deleted messages instead. The objects in 
the single bucket would be following this name pattern:
   ```
   
   Please use a neutral tone. I do not know what `!` is used for, but it has 
no place in such a formal document. Let's keep this for coffee machine 
discussions ;-)



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the 
single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year 
and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that 
would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for 
read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on 
both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.

Review Comment:
   Interestingly, there's also an access control point here: James no longer 
requires the right to create buckets at runtime when the Deleted Message Vault 
is enabled.



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the 
single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year 
and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that 
would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for 
read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on 
both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.

Review Comment:
   We need to mention the need for a migration, and to detail how this 
migration is to be done.
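
As a starting point for that discussion, here is a minimal Python sketch of one possible transition. It assumes reads try the single bucket first and fall back to the legacy monthly bucket, as the Decision section describes, and that legacy objects are keyed by blob id alone. The helper names and the in-memory `buckets` dict standing in for the object store are hypothetical; this is not the PR's implementation.

```python
# Assumed prefix, taken from the pattern `deleted-messages-[year]-[month]-01`.
LEGACY_PREFIX = "deleted-messages-"


def single_bucket_object_name(legacy_bucket: str, blob_id: str) -> str:
    """Map a legacy (bucket, blob id) pair to `[year]/[month]/[blob_id]`."""
    if not legacy_bucket.startswith(LEGACY_PREFIX):
        raise ValueError(f"not a legacy vault bucket: {legacy_bucket}")
    year, month, _day = legacy_bucket[len(LEGACY_PREFIX):].split("-")
    return f"{year}/{month}/{blob_id}"


def read_with_fallback(buckets, single_bucket, legacy_bucket, blob_id):
    """Read from the single bucket, falling back to the legacy bucket.

    `buckets` is an in-memory stand-in for the object store:
    bucket name -> {object name -> bytes}. Returns None when the
    blob is found in neither place.
    """
    new_name = single_bucket_object_name(legacy_bucket, blob_id)
    data = buckets.get(single_bucket, {}).get(new_name)
    if data is not None:
        return data
    # Assumption: legacy objects are stored under the blob id alone.
    return buckets.get(legacy_bucket, {}).get(blob_id)
```

With such a fallback in place, an explicit copy step becomes optional: legacy buckets could simply age out through the existing purge, at the cost of keeping the fallback code until the retention period has elapsed.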



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.

Review Comment:
   Please mention that some providers, e.g. OVH, put limits on the number of 
buckets per account.



##########
src/adr/0076-deleted-message-vault-single-bucket.md:
##########
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to 
store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: 
`deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined 
retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 
object storages and can affect performance by 
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the 
single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year 
and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that 
would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for 
read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on 
both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!

Review Comment:
   ```suggestion
   - easier to maintain, only one bucket
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

