Re: Multi-tenancy and caching issues

Romain Manni-Bucau Tue, 09 Jan 2024 01:52:17 -0800

Hi Francesco,

As long as you have an EMF router you don't hit pitfall 4; it only happens
if the routing is done at the datasource level. Routing there also means
many more side effects, and you start to lose the ability to tune per tenant
(a common pattern is to tune the cache per tenant "size"/usage; with a single
EMF everything would be shared, not isolated, so there is no real way to
handle that).


Note: routed caches can be made to work somehow, but they require
reimplementing a lot of the cache, whereas isolation is free when using a
routed EMF. It can be faked with PartitionedDataCache by overriding the key
name (appending the tenant), but I fear supervision will be much harder and
I'm not sure it would be very consumable for people (you end up raising the
leak risk for users by design and you get no benefit from it: you don't
reduce the overhead, you don't reduce the pool size, etc., which are handled
at another level).
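
To make the key-suffixing idea concrete, here is a minimal sketch in plain
Java. It does not use OpenJPA at all: TenantKeyedCache, its methods and the
"::" separator are hypothetical names chosen for illustration, and a real
PartitionedDataCache override would look different. It only shows why a cache
keyed by id alone leaks across tenants (pitfall 4) and how putting the tenant
into the key avoids it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; this is not an OpenJPA class.
public final class TenantKeyedCache {
    // Keyed by entity id only: every tenant sees the same entries.
    private final Map<String, Object> byId = new ConcurrentHashMap<>();
    // Keyed by tenant + id: entries are isolated per tenant.
    private final Map<String, Object> byTenantAndId = new ConcurrentHashMap<>();

    public void putShared(String id, Object value) {
        byId.put(id, value);
    }

    // Returns whatever was stored under this id, whichever tenant stored it.
    public Object getShared(String id) {
        return byId.get(id);
    }

    public void put(String tenant, String id, Object value) {
        byTenantAndId.put(key(tenant, id), value);
    }

    // Returns null unless this very tenant stored something under this id.
    public Object get(String tenant, String id) {
        return byTenantAndId.get(key(tenant, id));
    }

    // Assumes tenant names never contain "::" (an assumption of this sketch).
    private static String key(String tenant, String id) {
        return tenant + "::" + id;
    }
}
```

With getShared, tenant B asking for "key1" gets back what tenant A stored;
with get("B", "key1") it gets nothing until B stores its own entry. The
supervision and leak-risk concerns above still apply, since everything lives
in one cache instance.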

The spring-data integration does not depend on this either: just declare
@Bean EMF routedEmf() and it will work transparently, as long as a tx (the
cache scope of Spring) is for a single tenant.
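
A routed EMF can be sketched with a JDK dynamic proxy. The block below is
deliberately free of JPA and Spring dependencies so that it runs standalone:
TenantRouting, FakeEmf and the ThreadLocal holder are hypothetical names, with
FakeEmf standing in for EntityManagerFactory. In a real setup one would
presumably pass EntityManagerFactory.class plus a function that builds or
looks up the EMF of a tenant, and return the proxy from the @Bean method.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: routes every call to a per-tenant delegate.
public final class TenantRouting {
    // Tenant bound to the current thread (same idea as the datasource lookup key).
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Stand-in for EntityManagerFactory so the example needs no JPA jar.
    public interface FakeEmf {
        String describe();
    }

    private TenantRouting() {
    }

    public static void setTenant(String tenant) {
        CURRENT.set(tenant);
    }

    public static void clear() {
        CURRENT.remove();
    }

    // Builds a proxy of 'api' whose calls go to the current tenant's delegate.
    // Delegates are created lazily, once per tenant, using 'factory'.
    @SuppressWarnings("unchecked")
    public static <T> T routed(Class<T> api, Function<String, T> factory) {
        Map<String, T> delegates = new ConcurrentHashMap<>();
        return (T) Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] {api},
                (proxy, method, args) -> {
                    String tenant = CURRENT.get();
                    if (tenant == null) {
                        throw new IllegalStateException("no tenant bound to the current thread");
                    }
                    T delegate = delegates.computeIfAbsent(tenant, factory);
                    try {
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        // Rethrow the delegate's own exception, not the reflection wrapper.
                        throw e.getCause();
                    }
                });
    }
}
```

Because each tenant gets its own delegate, each tenant also gets its own
cache instance, which is the isolation described above. The sketch never
closes or evicts delegates; runtime tenant removal would need that.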

Hope I'm not missing something "key" ;).

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


On Tue, 9 Jan 2024 at 10:32, Francesco Chicchiriccò <[email protected]>
wrote:

> Hi Romain,
> see my replies embedded below.
>
> Regards.
>
> On 08/01/24 17:43, Romain Manni-Bucau wrote:
> > Hi Francesco,
> >
> > Normally if you have one EMF per tenant there is no leak between them
> > since the cache instance is stored in the EMF - used that approach in TomEE.
>
> As I am saying below, this is what we have already in Syncope.
>
> My company is also supporting customers heavily using this particular
> feature: it works, I have no issues with that.
> Someone is also building a SaaS solution on top of that, so runtime tenant
> addition and removal is also fine.
>
> I am exploring this different approach because it would allow us to
> introduce Spring Data JPA, which could have some benefits - see
> https://issues.apache.org/jira/browse/SYNCOPE-1799
>
> > You can check it in
> > org.apache.openjpa.datacache.DataCacheManagerImpl#initialize of each emf
> > which should be different.
>
> Thanks for the pointer.
>
> > So overall if there is a leak it is likely that it leaks across
> > transactions or some spring cache level.
>
> I think that things are more subtle: consider the following use case.
>
> We have MyEntity with String @Id.
>
> Suppose we have two tenants: A and B.
>
> 1. Tenant A will make a REST call which creates a MyEntity instance with
> key "key1" under the db for A.
>
> 2. Tenant A will make another REST call which looks for the newly created
> MyEntity instance via:
>
> entityManager.find(MyEntity.class, "key1");
>
> 3. Tenant B makes the same call as (1) with the same key "key1": all is
> fine, a new row is created under the db for B.
>
> 4. Tenant B makes the same call as (2) with the same key "key1": if not
> already evicted, entityManager will return the MyEntity instance for Tenant
> A from the cache.
>
> I need to avoid the pitfalls from (4).
>
> > Side note: the datasource routing pattern is useless if you have an
> > entity manager routing pattern and only use JPA to do database work; the
> > two are more likely to conflict than to help.
>
> The idea is not to have an entity manager routing pattern, rather to have
> a cache routing pattern on the single entity manager factory; or just to
> configure some predefined partitions.
>
> > If you still want to plug in your own datacache (or query cache), the
> > configuration in the JPA properties can take a custom fully qualified
> > class name too.
> >
> > On Mon, 8 Jan 2024 at 17:14, Francesco Chicchiriccò <[email protected]>
> > wrote:
> >
> >> Hi there,
> >> at Syncope we have been implementing multi-tenancy by relying on
> >> something like:
> >>
> >> * 1 data source per tenant
> >> * 1 entity manager factory per tenant
> >> * 1 transaction manager per tenant
> >> * etc
> >>
> >> So far so good.
> >>
> >> Now I am experimenting with a different approach similar to [1], e.g.
> >>
> >> * 1 low-level data source per tenant
> >> * 1 data source extending Spring's AbstractRoutingDataSource using the
> >> value of a ThreadLocal variable as lookup key
> >> * 1 single entity manager factory configured with the routing data
> >> source
> >> * 1 single transaction manager
> >> * etc
> >>
> >> It mostly works but I am having caching issues with concurrent
> >> operations working on different tenants, so I was wondering: how can I
> >> extend the various OpenJPA caches (query, data, L1, L2, every one) to hold
> >> different actual instances per tenant and to use the appropriate one
> >> depending on the same ThreadLocal value I have already used above for
> >> data sources?
> >>
> >> Thanks in advance.
> >> Regards.
> >>
> >> [1] https://github.com/Cepr0/sb-multitenant-db-demo
>
> --
> Francesco Chicchiriccò
>
> Tirasa - Open Source Excellence
> http://www.tirasa.net/
>
> Member at The Apache Software Foundation
> Syncope, Cocoon, Olingo, CXF, OpenJPA, PonyMail
> http://home.apache.org/~ilgrosso/
>
>
