Hi Phil,

You are right that it can be done in [pool] - I'm not sure it is the right
level (for instance, in my previous example the pool would need to expose
some "getCircuitBreakerState" so callers can see whether it is usable or
not), but maybe I'm too used to decorators ;).
The key point for [pool] is the last one, the proxying.
The pool can't do it itself since it manages undifferentiated instances,
but if you add the notion of a proxy factory and fall back on JRE proxies
when only interfaces are involved, it will work.
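
Roughly what I have in mind, as a quick sketch - BreakerProxyFactory and
the circuitClosed supplier are made-up names, and it assumes the pooled
type is an interface so the JRE proxy fallback applies:

import java.lang.reflect.Proxy;
import java.util.function.BooleanSupplier;

public final class BreakerProxyFactory {

    // Wraps a pooled instance in a JRE dynamic proxy. Every call first
    // asks the circuit breaker whether the backend is usable, so nothing
    // like "getCircuitBreakerState" has to leak out of the pool API.
    public static <T> T wrap(Class<T> iface, T target, BooleanSupplier circuitClosed) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                (proxy, method, args) -> {
                    if (!circuitClosed.getAsBoolean()) {
                        throw new IllegalStateException("circuit open: backend considered down");
                    }
                    return method.invoke(target, args);
                }));
    }
}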

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


On Tue, Feb 13, 2024 at 10:38 PM Phil Steitz <phil.ste...@gmail.com> wrote:

> Thanks, Romain, this is awesome.  I would really like to find a way to get
> this kind of thing implemented in [pool] or via enhanced factories.  See
> more on that below.
>
> On Tue, Feb 13, 2024 at 1:27 PM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
> > Hi Phil,
> >
> > What I used in the past for this kind of thing was to rely on the pool's
> > timeout, plus a trigger in the healthcheck - external to the pool - (the
> > simplest was "if 5 healthchecks fail without any success in between", for
> > example). Such a trigger spawns a task (think of a thread, even if it
> > uses an executor, as long as there is a guaranteed slot for this task)
> > which retries at a faster pace (instead of every 30s, 5 times in a run -
> > the number was tunable, but 5 was my default).
> > If the database is still detected as down - as opposed to merely
> > overloaded or the like - it is considered down, and another task is
> > spawned which retries every 30 seconds; if the database comes back, just
> > destroy/recreate the pool. I added a business check here: the idea is not
> > just to check the connection but that the tables are accessible, because
> > after such a downtime the db often does not come back all at once.
> > The destroy/recreate was handled using a DataSource proxy in front of the
> > pool, swapping the delegate.
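> >
> > A minimal sketch of that last part - SwitchableDataSource and its
> > methods are illustrative, assuming a JRE dynamic proxy in front of the
> > pooling DataSource:
> >
> > import java.lang.reflect.Proxy;
> > import java.util.concurrent.atomic.AtomicReference;
> > import javax.sql.DataSource;
> >
> > public final class SwitchableDataSource {
> >
> >     private final AtomicReference<DataSource> delegate;
> >
> >     public SwitchableDataSource(DataSource initial) {
> >         this.delegate = new AtomicReference<>(initial);
> >     }
> >
> >     // Called by the recovery task once the database is confirmed back:
> >     // the old pool is replaced by a fresh one.
> >     public void swap(DataSource freshPool) {
> >         delegate.getAndSet(freshPool); // close the old pool if it is Closeable
> >     }
> >
> >     // Everything the application sees goes through this proxy, so the
> >     // delegate behind it can change without callers noticing.
> >     public DataSource proxy() {
> >         return (DataSource) Proxy.newProxyInstance(
> >                 DataSource.class.getClassLoader(),
> >                 new Class<?>[] { DataSource.class },
> >                 (p, method, args) -> method.invoke(delegate.get(), args));
> >     }
> > }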
> >
>
> It seems to me that all of this might be possible using what I was calling
> a ResilientFactory.  The factory could implement the health-checking
> itself, using pluggable strategies for how to check, how often, what
> counts as an outage, etc.  And the factory could (if so configured and in
> the right state) bounce the pool.  I like the model of escalating concern.
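>
> Something like this, purely hypothetical - nothing like
> ResilienceStrategy exists in [pool] today:
>
> import java.time.Duration;
>
> // Pluggable strategy the ResilientFactory would consult: how to check,
> // how often, and what counts as an outage.
> public interface ResilienceStrategy {
>     boolean isHealthy();
>     Duration checkInterval();
>     int failuresBeforeOutage();
> }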
>
>
> > Indeed, it is not magic inside the pool, but it can only work better than
> > a pool-internal solution, because you can integrate with your existing
> > checks and add more advanced ones - if you have JPA, just run a fast
> > query on any table to validate the db is back, for example.
> > In the end the code is pretty simple, and it has another big advantage:
> > while you consider the db down you can circuit-break it completely,
> > letting through only 10% - or whatever ratio you want - of the requests
> > (a kind of canary testing which avoids putting too much pressure on the
> > pool).
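> >
> > The gate itself can be tiny; an illustrative sketch (CanaryGate is a
> > made-up name):
> >
> > import java.util.concurrent.ThreadLocalRandom;
> >
> > public final class CanaryGate {
> >
> >     private final double passRatio;   // e.g. 0.10 to let 10% through
> >     private volatile boolean dbDown;  // flipped by the healthcheck
> >
> >     public CanaryGate(double passRatio) { this.passRatio = passRatio; }
> >
> >     public void markDown() { dbDown = true; }
> >     public void markUp()   { dbDown = false; }
> >
> >     // While the db is considered down, only the configured fraction of
> >     // requests is allowed to actually hit the pool.
> >     public boolean allow() {
> >         return !dbDown || ThreadLocalRandom.current().nextDouble() < passRatio;
> >     }
> > }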
> >
> > I guess it was not exactly the answer you expected, but I think it can be
> > a good solution and could ultimately sit in a new package in [dbcp] or
> > the like?
> >
>
> I don't see anything here that is really specific to database connections
> (other than the proxy setup to gracefully handle bounces), so I want to
> keep thinking about how to solve the general problem by somehow enhancing
> factories and/or pools.
>
> Phil
>
> >
> > Best,
> > Romain Manni-Bucau
> > @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> > <https://rmannibucau.metawerx.net/> | Old Blog
> > <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
> > LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
> > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> >
> >
> > On Tue, Feb 13, 2024 at 9:11 PM Phil Steitz <phil.ste...@gmail.com>
> > wrote:
> >
> > > POOL-407 tracks a basic liveness problem that we have never been able
> > > to solve:
> > >
> > > A factory "goes down", resulting in either failed object creation or
> > > failed validation during the outage.  The pool has capacity to create,
> > > but the factory fails to serve threads as they arrive, so they end up
> > > parked waiting on the idle object pool.  After a possibly very brief
> > > interruption, the factory heals itself (maybe a database comes back up)
> > > and the waiting threads can be served, but until other threads arrive,
> > > get served, and return instances to the pool, the parked threads remain
> > > blocked.  Configuring minIdle and pool maintenance
> > > (timeBetweenEvictionRuns > 0) can improve the situation (see the
> > > fragment below), but running the evictor at a high enough frequency to
> > > handle every transient failure is not a great solution.
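> > >
> > > For the record, the workaround is just configuration - a fragment, with
> > > MyResource as a placeholder for the pooled type; the evictor task also
> > > ensures minIdle on each run:
> > >
> > > import java.time.Duration;
> > > import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
> > >
> > > GenericObjectPoolConfig<MyResource> config = new GenericObjectPoolConfig<>();
> > > config.setMinIdle(2);                                     // keep spares around
> > > config.setTimeBetweenEvictionRuns(Duration.ofSeconds(1)); // evictor re-creates minIdle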
> > >
> > > I am stuck on how to improve this.  I have experimented with the idea of
> > > a ResilientFactory, placing the responsibility on the factory to know
> > > when it is down and when it comes back up, and, when it does come back,
> > > to keep calling its pool's create as long as the pool has take-waiters
> > > and capacity; but I am not sure that is the best approach.  The
> > > advantage of this is that resource-specific failure- and
> > > recovery-detection can be implemented.
> > >
> > > Another option that I have played with is to have the pool keep track of
> > > factory failures and, when it observes enough failures over a long
> > > enough time, start a thread that does some kind of exponential backoff
> > > to keep retrying the factory.  Once the factory comes back, the recovery
> > > thread creates as many instances as it can without exceeding capacity
> > > and adds them to the pool, as in the sketch below.
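> > >
> > > A rough sketch of that second idea against GenericObjectPool -
> > > RecoveryTask is hypothetical and the backoff numbers are arbitrary:
> > >
> > > import java.time.Duration;
> > > import org.apache.commons.pool2.impl.GenericObjectPool;
> > >
> > > final class RecoveryTask implements Runnable {
> > >
> > >     private final GenericObjectPool<?> pool;
> > >
> > >     RecoveryTask(GenericObjectPool<?> pool) { this.pool = pool; }
> > >
> > >     @Override public void run() {
> > >         Duration delay = Duration.ofMillis(100);
> > >         while (true) {
> > >             try {
> > >                 pool.addObject(); // one successful create means the factory is back
> > >                 break;
> > >             } catch (Exception stillDown) {
> > >                 try { Thread.sleep(delay.toMillis()); }
> > >                 catch (InterruptedException e) { Thread.currentThread().interrupt(); return; }
> > >                 delay = delay.multipliedBy(2); // exponential backoff
> > >             }
> > >         }
> > >         // Factory is back: refill up to capacity while threads are parked.
> > >         try {
> > >             while (pool.getNumWaiters() > 0
> > >                     && pool.getNumActive() + pool.getNumIdle() < pool.getMaxTotal()) {
> > >                 pool.addObject();
> > >             }
> > >         } catch (Exception giveUp) {
> > >             // the next borrow or eviction run will retry
> > >         }
> > >     }
> > > }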
> > >
> > > I don't really like either of these.  Anyone have any better ideas?
> > >
> > > Phil
> > >
> >
>
