I put this review on my list and will really try to go through the code. This is very important work you are doing, Christian.
Kind regards,

Peter Kriens

> On 15 Feb 2015, at 19:07, Christian Schneider <ch...@die-schneider.net> wrote:
>
> If you do not implement something special for clean shutdown of in-flight
> exchanges, then the normal error handling should take effect, as you
> mentioned. So, for example, a DB transaction should roll back. One issue
> may be that, e.g., a service call cannot be rolled back.
>
> On the other hand, I think implementing clean shutdown will add a lot of
> complexity. The special code will only be executed in quite rare cases.
> These two effects increase the chance of programming errors in the code.
> So I am with you that in most cases you can just implement normal error
> handling and live with the fact that in-flight calls might run into
> errors.
>
> What I have seen on production systems is that they mark a machine to be
> updated as inactive on a front-end load balancer. So no new requests come
> in, and after some time you can quite safely update the bundles.
> This is a quite low-tech solution, but I think that is exactly why it
> works so well.
>
> So while I wanted to understand clean shutdown better for the discussion
> on aries dev, I do not think it should always be done.
>
> Btw., for my current redesign of JPA I have one problem on which I would
> like to get some feedback / ideas.
> I am providing a so-called EmSupplier:
> https://github.com/cschneider/jpa-experiments/blob/master/jpa-support/src/main/java/net/lr/jpa/impl/EMSupplierImpl.java
>
> This class will be offered as a service per persistence unit and should
> help to work with JPA. There is a precall method that will create an EM on
> the thread. Then there is a get() to retrieve the thread-local EM and a
> postcall that will close the EM again. As discussed, a bundle should have
> stopped all work when the stop method is done. Here, that applies when the
> PU bundle is stopped. So the EntityManagerFactory will also be
> deregistered and closed. As the EMSupplier depends on the EMF, it will
> also have to be closed.
>
> Now the problem is that there might still be threads working on their
> per-thread EMs. The really safe way is to wait until all these threads
> have closed their EMs. This is what I am doing now. To make it a little
> more predictable, I added a timeout and close the remaining EMs after the
> timeout.
>
> So the question is: is this a best practice? The clear disadvantage is
> that stopping a PU bundle could take quite long (depending on the
> timeout). Would it be better to just let the threads close the EMs
> asynchronously and ignore the fact that this might go wrong if the bundle
> is uninstalled in the meantime?
>
> Christian
>
> On 15.02.2015 at 18:38, Peter Kriens wrote:
>> As always with design, it is about trade-offs. As indicated in my mail,
>> the recovery time can be shortened if you can do a controlled shutdown. I
>> know this was a big issue with mainframes; however, I doubt that with
>> today’s highly distributed systems this is still very relevant. In
>> general, when I have the choice in these circumstances, I would rather
>> focus on reducing startup time instead of trying to manage shutdown more
>> nicely.
>>
>> I think the complexity of the additional recovery part is also dangerous,
>> especially since you will have a common path and one that only gets
>> executed when the shit really hits the fan. I think that is worth some
>> additional startup time in one of the many machines in the cluster.
>>
>> That said, every case is special. Just sharing my long experience in
>> seeing overly complicated solutions that looked good close up but
>> provided no real gain when you looked at the overall picture.
>>
>> Kind regards,
>>
>> Peter Kriens
>>
>>> On 15 Feb 2015, at 13:18, Graham Charters <chart...@uk.ibm.com> wrote:
>>>
>>> Hi Peter,
>>>
>>> I think you and I see different customer use cases. As I mentioned at
>>> the last OSGi f2f, we have customers whose applications take a
>>> significant amount of time to start and they have many instances.
>>> Rolling updates can therefore take a long time if full application
>>> restart is necessary, so these customers want to minimise application
>>> update time and disruption. These are transactional deployments with
>>> failover so they can be recovered if someone trips over the power cord,
>>> but that doesn't mean they want to use this during normal maintenance.
>>>
>>> Regards, Graham.
>>>
>>> Graham Charters PhD CEng MBCS PhD
>>> STSM, WebSphere OSGi Applications & Liberty Repository Lead Architect,
>>> Master Inventor
>>> IBM United Kingdom Limited, MP 146, Hursley Park, Winchester, SO21 2JN, UK
>>> Tel: +44 1962 816527 Email: chart...@uk.ibm.com
>>>
>>> Peter Kriens --- Re: [osgi-dev] How to cleanly update/uninstall bundles ---
>>>
>>> From: "Peter Kriens" <peter.kri...@aqute.biz>
>>> To: "OSGi Developer Mail List" <osgi-dev@mail.osgi.org>
>>> Date: Sun, 15 Feb 2015 11:48
>>> Subject: Re: [osgi-dev] How to cleanly update/uninstall bundles
>>>
>>> I am not sure I agree with your conclusion. :-)
>>>
>>> Since it is theoretically impossible to protect against hard failure
>>> (power, kernel panic, kill -9, distributed call when the cable is
>>> unplugged, etc.), any valuable application must have protection against
>>> an unexpected exit at any moment in time. Idempotency, consensus, and
>>> transactionality are your friends in these cases. So if you are
>>> protected against these bad failures, how bad can an in-flight shutdown
>>> be? Best case, you can shorten the recovery time at restart, but this
>>> often requires additional complexity that can then also fail. Since the
>>> chance that things go wrong in-flight is quite small, I would take the
>>> recovery cost in the unlikely event you got caught.
>>>
>>> Related is my very old opposition to an update or uninstall callback to
>>> the bundle. Though it is an awfully attractive idea with lots of good
>>> stuff, the party is spoiled because you cannot guarantee such a call
>>> under all circumstances.
>>>
>>> Bill Joy (Sun co-founder) once told us a story about the development of
>>> the Internet, in which he took part. Initially they tried to make every
>>> router perfect, but this made the routers incredibly expensive and there
>>> were still failure scenarios that even a perfect router could not handle
>>> (power, cable cuts). Then someone proposed to assume the routers were
>>> very imperfect and that the end points should correct the problems in
>>> the net. This changed a very large number of very hard to handle failure
>>> scenarios into one problem: how to handle a missing packet. If a router
>>> panicked, lost power, a cable was cut, was too busy, out of memory, had
>>> no clue: discard the packet.
>>>
>>> It is a pervasive problem in the enterprise software world that we want
>>> to ignore failure because it is so hard. For example, Blueprint has this
>>> awful service damping that looks so attractive for the developer (Look
>>> Ma, no dynamics!) but by hiding the reality you get caught in lots of
>>> unexpected places.
>>>
>>> Bad software expects an unchanging perfect world; good software is more
>>> realistic. Embrace failure! :-)
>>>
>>> Kind regards,
>>>
>>> Peter Kriens
>>>
>>>> On 15 Feb 2015, at 11:09, Christian Schneider <ch...@die-schneider.net> wrote:
>>>>
>>>> Thanks to all of you for the insights.
>>>>
>>>> From the responses I take it that clean shutdown is not in scope of
>>>> OSGi itself. I agree that it is best solved on the application level.
>>>> On the other hand, I see that the Quiesce API can at least cover some
>>>> cases and so it has its value.
>>>>
>>>> Christian
>>>>
>>>> On 13.02.2015 at 17:55, Raymond Auge wrote:
>>>>> To my knowledge, what you are speaking of is not intentionally
>>>>> supported by the dynamics of OSGi. This topic comes up all the time,
>>>>> it's funny.
>>>>>
>>>>> If you must support "in flight" changes, then you have to implement
>>>>> this support in your code using concurrency constructs.
>>>>>
>>>>> Note that unregistering a service is a synchronous operation during
>>>>> "shutdown" of a bundle, and so with proper concurrency measures in
>>>>> place, a bundle could both be shutting down (meaning it's not
>>>>> reachable by other bundles) and also finishing any ongoing work.
>>>>>
>>>>> Anyone feel free to correct me, but this is what I've learned in my
>>>>> short experience.
>>>>>
>>>>> - Ray
>>
>> _______________________________________________
>> OSGi Developer Mail List
>> osgi-dev@mail.osgi.org
>> https://mail.osgi.org/mailman/listinfo/osgi-dev
>
> --
> Christian Schneider
> http://www.liquid-reality.de
>
> Open Source Architect
> Talend Application Integration Division http://www.talend.com
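
For readers following the thread, the EmSupplier lifecycle Christian describes in the quoted message (precall, get(), postcall, then a close that waits for in-flight threads up to a timeout) could be sketched roughly as below. This is not the code from the linked repository. The class name PerThreadSupplier is hypothetical, a generic AutoCloseable stands in for the EntityManager so the sketch needs no JPA dependency, and the boolean return value, the rejection of new calls during shutdown, and the interrupt handling are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical per-thread resource holder; T stands in for an EntityManager. */
class PerThreadSupplier<T extends AutoCloseable> {
    private final Supplier<T> factory;            // stands in for the EMF (assumption)
    private final ThreadLocal<T> local = new ThreadLocal<>();
    private final Set<T> open = Collections.newSetFromMap(new IdentityHashMap<>());
    private final Object lock = new Object();
    private boolean closing;                      // guarded by lock

    PerThreadSupplier(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates a resource for the calling thread; refused once shutdown has begun. */
    void preCall() {
        synchronized (lock) {
            if (closing) {
                throw new IllegalStateException("supplier is shutting down");
            }
            T resource = factory.get();
            open.add(resource);
            local.set(resource);
        }
    }

    /** Returns the calling thread's resource. */
    T get() {
        T resource = local.get();
        if (resource == null) {
            throw new IllegalStateException("preCall() was not invoked on this thread");
        }
        return resource;
    }

    /** Closes the calling thread's resource unless close() already did so. */
    void postCall() {
        T resource = local.get();
        local.remove();
        if (resource == null) {
            return;
        }
        synchronized (lock) {
            if (!open.remove(resource)) {
                return;                           // already force-closed by close()
            }
            lock.notifyAll();                     // wake a waiting close()
        }
        closeQuietly(resource);
    }

    /**
     * Waits up to the timeout for in-flight threads to call postCall(), then
     * closes whatever is left. Returns true if nothing had to be force-closed.
     */
    boolean close(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        List<T> leftovers;
        synchronized (lock) {
            closing = true;
            while (!open.isEmpty()) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break;
                }
                try {
                    TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop waiting, keep the flag
                    break;
                }
            }
            leftovers = new ArrayList<>(open);
            open.clear();
        }
        for (T resource : leftovers) {
            closeQuietly(resource);
        }
        return leftovers.isEmpty();
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception e) {
            // In this sketch a failing close is ignored; real code would log it.
        }
    }
}
```

The timeout bounds how long stopping the bundle can take, which is the trade-off raised in the thread: a longer wait lets more in-flight calls finish normally, while anything force-closed falls back to the ordinary error handling discussed above.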