Hello Quan,
I'd mention that multi-tenancy was one of the goals of the ongoing PostgreSQL
implementation and, to be fully effective, needs to be pushed to the other
datastores, the most important being indeed the object store, which holds
email content.
On 04/11/2024 04:45, Quan tran hong wrote:
Hi everyone,
Today James does not support multi-tenancy for the blob store. Blob
isolation between domains could therefore be an issue, for example in a SaaS
deployment that requires strict data isolation for users.
We (Linagora) think it would be good to implement multi-tenancy for
the blob store and would like to contribute it. We would like to propose
an approach for it and would love to hear the community's thoughts.
Firstly, we propose to slightly refactor the `BlobStore` API to accept the
tenant information.
We introduce some classes to carry the tenant information:
```
public record Tenant(String name) {
    public static Tenant from(Domain domain) {
        return new Tenant(domain.asString());
    }

    public String asString() {
        return name;
    }
}

public record Bucket(BucketName bucketName, Optional<Tenant> tenant) {
    public static Bucket of(BucketName bucketName) {
        return new Bucket(bucketName, Optional.empty());
    }
}
```
We refactor the `BlobStore` APIs to accept the tenant input, e.g.
`InputStream read(BucketName bucketName, BlobId blobId);` becomes
`InputStream read(Bucket bucket, BlobId blobId);`.
As said in the pull request, limiting method creep in that class looks
like a sane concern that we should likely try to tackle: this API is already
hard to navigate with its zillion options, and exponential growth does not
seem like a desirable direction.
Then each implementation (S3, File, Postgres...) can choose whether it
implements multi-tenancy.
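To illustrate what "choosing to implement it" could look like, here is a minimal in-memory sketch. It is not existing James code: `BucketName`, `BlobId`, `Tenant` and `Bucket` are simplified stand-ins for the real types, and `TenantAwareMemoryStore` is a hypothetical class. The only idea shown is that keying storage on the whole `Bucket` (name plus optional tenant) isolates tenants that share a bucket name.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-ins for the James types (assumptions, for illustration only).
record BucketName(String value) {}
record BlobId(String value) {}
record Tenant(String name) {}
record Bucket(BucketName bucketName, Optional<Tenant> tenant) {
    static Bucket of(BucketName bucketName) {
        return new Bucket(bucketName, Optional.empty());
    }
}

// Hypothetical tenant-aware store: the full Bucket record is the map key,
// so the same bucket name under two tenants maps to two separate blob spaces.
class TenantAwareMemoryStore {
    private final Map<Bucket, Map<BlobId, byte[]>> blobs = new ConcurrentHashMap<>();

    void save(Bucket bucket, BlobId blobId, byte[] data) {
        blobs.computeIfAbsent(bucket, b -> new ConcurrentHashMap<>())
            .put(blobId, data.clone());
    }

    Optional<InputStream> read(Bucket bucket, BlobId blobId) {
        Map<BlobId, byte[]> tenantBlobs = blobs.getOrDefault(bucket, Map.of());
        return Optional.ofNullable(tenantBlobs.get(blobId))
            .<InputStream>map(ByteArrayInputStream::new);
    }
}
```

An implementation that does not want multi-tenancy could simply ignore `bucket.tenant()` and key on `bucket.bucketName()` only.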
Below we propose some options to implement multi-tenancy for S3.
## S3
### Configuration
```
multi-tenancy.mode=none|bucket|ssec|prefix
```
Defaults to no multi-tenancy (`none`), which is today's behavior.
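As a rough sketch of how the property could be read, assuming a hypothetical `MultiTenancyMode` enum (the name and the parsing rules are illustrative, not part of the proposal): an absent or blank value falls back to `NONE`, and an unknown value is rejected by `valueOf`.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical enum mirroring the proposed `multi-tenancy.mode` values.
enum MultiTenancyMode {
    NONE, BUCKET, SSEC, PREFIX;

    // Absent or blank property: keep today's behavior (NONE).
    // Unknown value: valueOf throws IllegalArgumentException.
    static MultiTenancyMode parse(Optional<String> rawValue) {
        return rawValue
            .map(String::trim)
            .filter(value -> !value.isEmpty())
            .map(value -> MultiTenancyMode.valueOf(value.toUpperCase(Locale.ROOT)))
            .orElse(NONE);
    }
}
```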
### S3 multi-tenancy options
#### bucket
Each tenant uses one dedicated S3 bucket.
Note: GC is likely broken in this mode and should be tested with it.
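A possible way to resolve the per-tenant bucket, as a sketch only: the `<base>-<tenant>` naming scheme and the `TenantBucketResolver` class are assumptions, not something decided in this proposal. The check reflects the S3 naming rules for general purpose buckets (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit).

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for the `bucket` mode.
class TenantBucketResolver {
    // Assumed naming scheme: "<base>-<tenant>", lower-cased.
    static String resolve(String baseBucket, Optional<String> tenant) {
        String resolved = tenant
            .map(name -> baseBucket + "-" + name.toLowerCase(Locale.ROOT))
            .orElse(baseBucket);
        // Reject names S3 would refuse: 3 to 63 chars of [a-z0-9.-],
        // starting and ending with a letter or a digit.
        if (!resolved.matches("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")) {
            throw new IllegalArgumentException("Not a valid S3 bucket name: " + resolved);
        }
        return resolved;
    }
}
```

Long domain names can exceed the 63 character limit, so a real implementation would probably need a shortening or hashing step.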
#### prefix
Each tenant uses one dedicated prefix while sharing the same bucket.
Note: We shall make sure that the GC, when listing, only takes the last
part of the S3 key, i.e. given `prefix/ABC` the GC only uses `ABC` as the
blobId.
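The key handling described in this note can be sketched as follows. `PrefixedS3Key` is a hypothetical helper, and the sketch assumes that blobIds never contain a `/` themselves.

```java
import java.util.Optional;

// Hypothetical helper for the `prefix` mode.
class PrefixedS3Key {
    // Write/read side: "<tenant>/<blobId>", or the bare blobId without tenant.
    static String toS3Key(Optional<String> tenant, String blobId) {
        return tenant.map(name -> name + "/" + blobId).orElse(blobId);
    }

    // GC side: when listing, keep only the last segment of the key.
    // lastIndexOf returns -1 when there is no prefix, so the whole key is kept.
    static String blobIdOf(String s3Key) {
        return s3Key.substring(s3Key.lastIndexOf('/') + 1);
    }
}
```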
#### ssec (server-side encryption with customer-provided keys, SSE-C)
Each tenant would use a derived SSE-C encryption key to encrypt/decrypt
its data. That way tenant A won't have tenant B's key and cannot negatively
impact tenant B's data.
Note: This implementation should fail with the deduplicating blob store.
Unless prefixes are used ^^
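The proposal does not say how the per-tenant key would be derived. One possible scheme, given purely as an assumption, is HMAC-SHA256 over the tenant name with a master secret: it yields 32 bytes, which matches the 256-bit key size SSE-C expects for AES-256. `TenantKeyDerivation` is a hypothetical class.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical key derivation for the `ssec` mode (assumed scheme: HMAC-SHA256).
class TenantKeyDerivation {
    // Same master secret + same tenant name always gives the same 32-byte key;
    // different tenants get different keys.
    static byte[] deriveKey(byte[] masterSecret, String tenantName) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(masterSecret, "HmacSHA256"));
            return mac.doFinal(tenantName.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not derive tenant key", e);
        }
    }
}
```

A dedicated KDF such as HKDF may be preferable in a real implementation; the point here is only that the key is a deterministic function of the tenant.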
That is the core idea of the S3 multi-tenancy implementation. We
plan to implement blob store multi-tenancy for other implementations too,
e.g. File, PostgreSQL and Memory. For more details on those implementations,
please have a look at the Jira ticket:
https://issues.apache.org/jira/projects/JAMES/issues/JAMES-4085.
cc @Linagora colleagues: please correct me if I am wrong or missing anything.
We would love to hear the community's feedback on this. Is this topic
interesting to you? What other implementations do you have in mind? Please
let us know :-).
Thanks for reading.
Regards,
Quan
Thanks for bringing this topic to the mailing list, and best regards,
Benoit