If you do not implement something special for clean shutdown of inflight
exchanges then the normal error handling should take effect like you
mentioned.
So for example a db transaction should roll back. One issue is that
some things, e.g. a service call, cannot be rolled back.
On the other hand I think implementing clean shutdown will add a lot of
complexity. The special code will only be executed for quite rare cases.
These two effects increase the chance of programming errors in the code.
So I am with you that in most cases you can just implement normal error
handling and just live with the fact that inflight calls might run into
errors.
What I have seen on production systems is that they mark a machine to be
updated as inactive on a front end load balancer. So no new requests
come in and after some time you can quite safely update the bundles.
This is a quite low tech solution but I think exactly for this reason it
works so well.
So while I wanted to understand clean shutdown better for the discussion
on aries dev I do not think it should always be done.
Btw, for my current redesign of jpa I have one problem on which I would
like to get some feedback / ideas.
I am providing a so called EmSupplier:
https://github.com/cschneider/jpa-experiments/blob/master/jpa-support/src/main/java/net/lr/jpa/impl/EMSupplierImpl.java
This class will be offered as a service per persistence unit and should
help to work with jpa. There is a precall method that will create an EM
on the thread. Then there is a get() to retrieve the local thread em and
a postcall that will close the EM again. As discussed a bundle should
have stopped all work when the stop method is done. In this case this
applies to the case where the PU bundle will be stopped. So the
EntityManagerFactory will also be deregistered and closed. As the
EMSupplier depends on the EMF it will also have to be closed.
Now the problem is that there might still be threads working on their
per thread EMs. The really safe way is to wait until all these threads
have closed their EMs. This is what I am doing now. To make it a little
more predictable I added a timeout and close the remaining EMs after the
timeout.
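A minimal sketch of the wait-then-force-close behaviour described above, not the code from EMSupplierImpl. The class name ThreadLocalSupplier is hypothetical, the type parameter R stands in for EntityManager and the factory for the EMF, so the example runs without a JPA provider. It also assumes each thread calls preCall() and postCall() in pairs without nesting.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch; R stands in for EntityManager and the factory for
// EntityManagerFactory::createEntityManager.
class ThreadLocalSupplier<R extends AutoCloseable> {
    private final Supplier<R> factory;
    private final ThreadLocal<R> local = new ThreadLocal<>();
    private final Set<R> open = new HashSet<>(); // guarded by "this"
    private boolean closing;                     // guarded by "this"

    ThreadLocalSupplier(Supplier<R> factory) {
        this.factory = factory;
    }

    /** Creates a resource for the current thread; rejected once close() has started. */
    synchronized void preCall() {
        if (closing) {
            throw new IllegalStateException("supplier is shutting down");
        }
        R resource = factory.get();
        open.add(resource);
        local.set(resource);
    }

    /** Returns the resource created by preCall() on this thread. */
    R get() {
        R resource = local.get();
        if (resource == null) {
            throw new IllegalStateException("preCall() was not invoked on this thread");
        }
        return resource;
    }

    /** Closes this thread's resource unless close() already force-closed it. */
    void postCall() {
        R resource = local.get();
        local.remove();
        synchronized (this) {
            if (resource != null && open.remove(resource)) {
                closeQuietly(resource);
            }
            notifyAll(); // wake up a close() that may be waiting
        }
    }

    /** Waits up to the timeout, then force-closes leftovers. Returns true if none were left. */
    boolean close(long timeout, TimeUnit unit) {
        List<R> leftovers;
        synchronized (this) {
            closing = true;
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            try {
                while (!open.isEmpty()) {
                    long remaining = deadline - System.nanoTime();
                    if (remaining <= 0) {
                        break;
                    }
                    TimeUnit.NANOSECONDS.timedWait(this, remaining);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop waiting, force-close below
            }
            leftovers = new ArrayList<>(open);
            open.clear();
        }
        leftovers.forEach(ThreadLocalSupplier::closeQuietly);
        return leftovers.isEmpty();
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception e) {
            // a real implementation would log this
        }
    }
}
```

In this sketch the stop path would call close(timeout, unit) before the EMF itself is closed. A thread whose resource was force-closed will most likely fail on its next use of it, which is the normal error handling case again.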
So the question is: Is this a best practice? The clear disadvantage is
that stopping a PU bundle could take quite long (depending on timeout).
Would it be better to just let the threads close the EMs asynchronously
and ignore the fact that this might go wrong if the bundle is
uninstalled in the meantime?
Christian
On 15.02.2015 at 18:38, Peter Kriens wrote:
As always with design, it is about trade offs. As indicated in my
mail, the recovery time can be shortened if you can do a controlled
shutdown. I know this was a big issue with mainframes, however, I
doubt that with today’s highly distributed systems this is still very
relevant. In general, when I have the choice in these circumstances I
would rather focus on reducing startup time instead of trying to
manage shutdown more nicely.
I think the complexity of the additional recovery part is also
dangerous, especially since you will have a common path and one that
only gets executed when the shit really hits the fan. I think that is
worth some additional startup time in one of the many machines in the
cluster.
That said, every case is special. Just sharing my long experience in
seeing overly complicated solutions that looked good close up but
provided no real gain when you looked at the overall picture.
Kind regards,
Peter Kriens
On 15 feb. 2015, at 13:18, Graham Charters <chart...@uk.ibm.com> wrote:
Hi Peter,
I think you and I see different customer use cases. As I mentioned at
the last OSGi f2f, we have customers whose applications take a
significant amount of time to start and they have many instances.
Rolling updates can therefore take a long time if full application
restart is necessary, so these customers want to minimise application
update time and disruption. These are transactional deployments with
failover so they can be recovered if someone trips over the power
cord, but that doesn't mean they want to use this during normal
maintenance.
Regards, Graham.
Graham Charters PhD CEng MBCS PhD
STSM, WebSphere OSGi Applications & Liberty Repository Lead
Architect, Master Inventor
IBM United Kingdom Limited, MP 146, Hursley Park, Winchester, SO21
2JN, UK
Tel: +44 1962 816527 Email: chart...@uk.ibm.com
Peter Kriens --- Re: [osgi-dev] How to cleanly update/uninstall
bundles ---
From: "Peter Kriens" <peter.kri...@aqute.biz>
To: "OSGi Developer Mail List" <osgi-dev@mail.osgi.org>
Date: Sun, 15 Feb 2015 11:48
Subject: Re: [osgi-dev] How to cleanly update/uninstall bundles
------------------------------------------------------------------------
I am not sure I agree with your conclusion. :-)
Since it is theoretically impossible to protect against hard failure
(power, kernel panic, kill -9, distributed call when the cable is
unplugged, etc.) any valuable application must have protection against
an unexpected exit at any moment in time. Idempotency, consensus, and
transactionality are your friends in these cases. So if you are
protected against these bad failures, how bad can an in-flight
shutdown be? Best case you can shorten the recovery time at restart
but this often requires additional complexity that can then also
fail. Since the chance that things go wrong in-flight is quite small
I would take the recovery cost in the unlikely event you got caught.
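A minimal sketch of the idempotency point, under the assumption that every request carries a unique id chosen by the caller. The class IdempotentExecutor and the request ids are hypothetical. The record is kept in memory only to keep the example short; to survive a hard failure it would have to be stored durably, ideally in the same transaction as the action itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper: remembers the result per request id so that a retried
// request is answered from the record instead of running the action again.
class IdempotentExecutor<T> {
    // In-memory only for this sketch; see the assumption stated above.
    private final Map<String, T> results = new ConcurrentHashMap<>();

    T execute(String requestId, Supplier<T> action) {
        // computeIfAbsent runs the action at most once per id. If the action
        // throws, nothing is recorded and a later retry runs it again.
        // The action must not return null and must not call execute() itself.
        return results.computeIfAbsent(requestId, id -> action.get());
    }
}
```

With something like this in place, a caller whose request failed in flight can simply send it again with the same id, whether the failure came from a shutdown or from a pulled cable.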
Related is my very old opposition to an update or uninstall callback
to the bundle. Though it is an awfully attractive idea with lots of
good stuff, the party is spoiled because you cannot guarantee such a
call under all circumstances.
Bill Joy (Sun Founder) once told us a story about the development of
the Internet, in which he took part. Initially they tried to make
every router perfect but this made the routers incredibly expensive
and there were still failure scenarios that even a perfect router
could not handle (power, cable cuts). Then someone proposed to assume
the routers were very imperfect and that the end points should
correct the problems in the net. This changed a very large number of
very hard to handle failure scenarios into one problem: how to handle
a missing packet. If a router panicked, lost power, had a cable cut,
was too busy, was out of memory, or had no clue: discard the packet.
It is a pervasive problem in the enterprise software world that we want
to ignore failure because it is so hard. For example, Blueprint has
this awful service damping that looks so attractive for the developer
(Look Ma, no dynamics!) but by hiding the reality you get caught in
lots of unexpected places.
Bad software expects an unchanging perfect world, good software is
more realistic. Embrace failure! :-)
Kind regards,
Peter Kriens
On 15 feb. 2015, at 11:09, Christian Schneider
<ch...@die-schneider.net> wrote:
Thanks to all of you for the insights.
From the responses I take that clean shutdown is not in scope of
OSGi itself.
I agree that it is best solved on the application level. On the
other hand I see that the Quiesce API can at least cover some
cases and so it has its value.
Christian
On 13.02.2015 at 17:55, Raymond Auge wrote:
To my knowledge what you are speaking of is not intentionally
supported by the dynamics of osgi. This topic comes up all the
time, it's funny.
If you must support "in flight" changes, then you have to implement
this support in your code using concurrency constructs.
Note that unregistering a service is a synchronous operation during
"shutdown" of a bundle, and so with proper concurrency measures in
place, a bundle could both be shutting down (meaning it's not
reachable by other bundles) and also finishing any ongoing work.
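One possible shape for such a concurrency construct, as a sketch only. The class ShutdownGate is hypothetical and uses nothing from the OSGi API. It assumes the bundle's stop path first unregisters its service (so no new callers can look it up) and then calls stop() on the gate to wait for calls that are already running.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical gate: service calls hold the read lock while they run,
// stop() takes the write lock, which is only granted once no call is running.
class ShutdownGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile boolean stopping;

    /** Runs the work unless shutdown has started. */
    <T> T call(Supplier<T> work) {
        lock.readLock().lock();
        try {
            if (stopping) {
                throw new IllegalStateException("service is stopped");
            }
            return work.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * Rejects new calls and waits for running ones.
     * Returns false if calls were still running when the timeout expired.
     * Must not be invoked from inside call() on the same thread.
     */
    boolean stop(long timeout, TimeUnit unit) {
        stopping = true; // new calls are rejected from here on
        try {
            if (!lock.writeLock().tryLock(timeout, unit)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        lock.writeLock().unlock(); // nothing to protect, we only needed to wait
        return true;
    }
}
```

When stop() returns false the caller still has to decide what to do with the remaining calls, so the timeout trade-off does not go away.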
Anyone feel free to correct me but this is what I've learned in my
short experience.
- Ray
_______________________________________________
OSGi Developer Mail List
osgi-dev@mail.osgi.org
https://mail.osgi.org/mailman/listinfo/osgi-dev
--
Christian Schneider
http://www.liquid-reality.de
Open Source Architect
Talend Application Integration Division http://www.talend.com