Joe, I had similar thoughts about the additional resource usage this adds, which is why I thought exploring the downsides was apropos. My initial reaction was that a lot of the "simple" core processors wouldn't gain any advantage from this... but honestly, I'm not sure which is the bigger problem: having a lot of processors with this chewing up resources, or being unable to put together the combination of processors / configurations that you need. Could we have the best of both worlds by having a classloader per processor at the discretion of the operator (or maybe the processor developer)? This adds more code complexity, for sure.
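To make the tradeoff a bit more concrete, here is a rough, standalone sketch of the kind of per-instance classloader being discussed. This is not NiFi code and not Oleg's actual trick: SharedRegistry is a made-up stand-in for a class-level singleton like SecurityModels, and InstanceClassLoader is a hypothetical child-first loader, not the NAR ClassLoader. It also assumes a Java 9+ runtime (for readAllBytes) and that the compiled .class files are readable from the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

public class IsolatedLoaderSketch {

    // Hypothetical stand-in for a library singleton: the state lives at class level,
    // so everything loaded by the same ClassLoader shares it.
    public static class SharedRegistry {
        private static int registered = 0;

        public static synchronized int register() {
            return ++registered;
        }
    }

    // Hypothetical child-first loader: it defines its own copy of one named class
    // and delegates everything else to the parent as usual.
    static class InstanceClassLoader extends ClassLoader {
        private final String isolatedName;

        InstanceClassLoader(String isolatedName, ClassLoader parent) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Re-read the class bytes and define a fresh copy in this loader.
                    // This duplication is exactly the memory cost under discussion.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    // Calls SharedRegistry.register() on whichever copy of the class the loader provides.
    public static int registerVia(ClassLoader loader) throws Exception {
        Class<?> registry = Class.forName(SharedRegistry.class.getName(), true, loader);
        Method register = registry.getMethod("register");
        return (Integer) register.invoke(null);
    }

    public static ClassLoader newIsolatedLoader() {
        return new InstanceClassLoader(SharedRegistry.class.getName(),
                IsolatedLoaderSketch.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        ClassLoader shared = IsolatedLoaderSketch.class.getClassLoader();
        System.out.println("shared loader:    " + registerVia(shared) + ", " + registerVia(shared));
        System.out.println("isolated loaders: " + registerVia(newIsolatedLoader())
                + ", " + registerVia(newIsolatedLoader()));
    }
}
```

With the shared loader the counter keeps climbing across calls, while each isolated loader starts from its own copy of the static state. The flip side is that every isolated loader holds its own copy of the class, so the footprint grows with the number of instances in the flow.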
I think I mentioned this before on a pull request a long time ago: it sure is hard to make decisions like this without better metrics about who is using which parts of NiFi. I wonder what a way of reporting finer-grained usage would look like in the Apache world.

On Wed, Mar 9, 2016 at 8:14 AM, Joe Witt <[email protected]> wrote:

> There are clear benefits to having the notion of extension isolation
> be as narrow as a single instance of that extension in the flow.
> However, there are also some important questions that must be
> answered.
>
> A quick one that comes to mind is the idea of a classloader per
> extension instance means the same classes will not only be added many
> times but they'll be added potentially an unbounded number of times as
> a given flow grows. Today the number of classloaders is bounded/set
> at startup. This is something we need good numbers on in terms of
> overhead/cost. We could address this by letting in hints that say we
> can collapse/share classloaders and so on. But then we also need to
> be careful how far we go with this.
>
> We could have chosen alternative componentization models long ago but
> held back due to complexity. We were happy to lose some of the power
> to ditch most of the complexity. Not saying this concept goes too far
> but saying we need to always figure out the right sweet spot and that
> inherently means when we have "enough capability vs complexity".
>
> Given the clearly very early stage of this discussion my personal
> preference is to see it end up as a feature proposal/design doc on
> the wiki page with some of the others. We have seen over time that
> folks in the community not actively watching the mailing lists do
> notice those proposals and tend to bring them up or want to engage on
> them later. The Wiki/feature proposal section makes that easier.
>
> Thanks
> Joe
>
> On Wed, Mar 9, 2016 at 7:32 AM, Oleg Zhurakousky
> <[email protected]> wrote:
> > Well, sure there is the obvious; two instances of the same NAR = two
> > instances of the same class in memory. But that’s a very small price to
> > pay when realizing that current state of things can simply render NiFi
> > un-usable. In fact we already had similar issue with HDFS processors
> > (https://issues.apache.org/jira/browse/NIFI-1536) that has UGI Kerberos
> > code which uses the same static initializer model. I’ve patched it few
> > weeks ago, but I must say it’s a true patch (a band-aid) to address an
> > immediate problem, but the core issue is still there and could resurrect
> > itself at any time.
> >
> > As for the complexity, sure, ClassLoaders are one of those areas in Java
> > that is generally perceived as complex. I happen to navigate it with ease
> > due to things I’ve done in the previous life, so I can help (with code and
> > documentation to ensure its maintainable), but wanted to see what the
> > general feel is.
> >
> > Keep in mind, the concept is not new. In fact I’d go as far as saying
> > it’s pretty much a standard in server architectures. On top of that we
> > kind of heading that direction anyway, since the minute we introduce
> > Extension Registry with versioning, we bring on-demand deployment and at
> > that time ClassLoader per instance will become the most natural and
> > simple thing to do, so might as well start earlier.
> >
> > Cheers
> > Oleg
> >
> >> On Mar 8, 2016, at 7:35 PM, Tony Kurc <[email protected]> wrote:
> >>
> >> Oleg,
> >> What do you think are the downsides of doing this? Memory usage?
> >> Additional complexity?
> >>
> >> Tony
> >> On Mar 8, 2016 9:54 AM, "Oleg Zhurakousky" <[email protected]>
> >> wrote:
> >>
> >>> Was wondering what others are thinking on the following:
> >>>
> >>> We have several components (Processors, ControllerServices etc.) both
> >>> existing and coming down the pipeline which rely on class-level
> >>> initializers (see example below from new SNMP PR)
> >>> SecurityModels.getInstance().addSecurityModel(usm);
> >>> While it’s a common pattern for certain types of use cases it doesn’t
> >>> go well with the flexibility we try to promote within NiFi.
> >>> Specifically the ability to have two different components that rely on
> >>> such initializers being different or in different states. This is
> >>> because multiple instances of the same component will be loaded by the
> >>> same NAR ClassLoader and since such initializers maintain the state at
> >>> the class level (singleton), they are shared across all instances of
> >>> the component. So, the above example will set security model for a
> >>> processor where such security model was required and it will
> >>> immediately be available to another instance of the same type
> >>> processor where it may not be required or supported causing hard to
> >>> explain/debug errors.
> >>>
> >>> There is a simple ClassLoader trick that we can discuss and implement
> >>> to alleviate this (I’ve done it for another processor that is coming
> >>> down the pipeline), but first I would like to know what others think,
> >>> since the more I think about it the more I feel it is global concern
> >>> and as such would be better addressed at the framework level.
> >>>
> >>> Thoughts
> >>>
> >>> Oleg
