Matt,

We are (I am) working on a Kerberos CS, but keep in mind that I am using Kerberos and UGI only as an example. There are many other cases like this. The bottom line is that we need a mechanism to provide CL isolation per instance where it is needed.
Oleg

> On Mar 9, 2016, at 8:54 AM, Matt Burgess <[email protected]> wrote:
>
> Why not make UGI/Kerb stuff a Controller Service? Each CS instance can have
> class loader isolation and can be shared among processor instances that want
> to share UGI/Kerb/service config?
>
> Sent from my iPhone
>
>> On Mar 9, 2016, at 8:50 AM, Oleg Zhurakousky <[email protected]>
>> wrote:
>>
>> Tony
>> Interesting choice of words “at the discretion. . .”. That can certainly be
>> a happy medium, as I do agree that 90% of them won’t need that. Furthermore,
>> for the extensions that are developed/managed by NiFi we can apply “the
>> discretion” behind the scenes, thus not putting a burden on the DFM. For
>> example, knowing that the HDFS bundle uses UGI, we can apply CL isolation to
>> that bundle but not others. The same would go for SNMP and others that we
>> know. Then it only leaves the ones we can’t control (user NARs developed
>> internally). But that could be addressed through documentation on how to
>> develop/deploy a bundle that requires CL isolation.
>>
>> Cheers
>> Oleg
>>
>>> On Mar 9, 2016, at 8:33 AM, Tony Kurc <[email protected]> wrote:
>>>
>>> Joe,
>>> I had similar thoughts about the additional resource usage this adds, which
>>> is why I thought exploring the downsides was apropos. One thing I initially
>>> thought was that a lot of the "simple" core processors wouldn't gain any
>>> advantage by doing this... but honestly, I'm not sure which is the bigger
>>> problem: having a lot of processors and this chewing up resources, or the
>>> inability to put together the combination of processors / configurations
>>> that you need. Could we have the best of both worlds by having a
>>> classloader per processor at the discretion of either the operator (or
>>> maybe the processor developer)? This adds more code complexity, for sure.
>>>
>>> I think I mentioned this before on a pull request a long time ago: it sure
>>> is hard to make decisions like this without having better metrics about who
>>> is using what part of NiFi. I wonder what a way of reporting finer-grained
>>> usage would look like in the Apache world.
>>>
>>>
>>>> On Wed, Mar 9, 2016 at 8:14 AM, Joe Witt <[email protected]> wrote:
>>>>
>>>> There are clear benefits to having the notion of extension isolation
>>>> be as narrow as a single instance of that extension in the flow.
>>>> However, there are also some important questions that must be
>>>> answered.
>>>>
>>>> A quick one that comes to mind: a classloader per extension instance
>>>> means the same classes will not only be loaded many times, they will
>>>> be loaded a potentially unbounded number of times as a given flow
>>>> grows. Today the number of classloaders is bounded/set at startup.
>>>> This is something we need good numbers on in terms of overhead/cost.
>>>> We could address this by allowing hints that say we can
>>>> collapse/share classloaders and so on. But then we also need to be
>>>> careful how far we go with this.
>>>>
>>>> We could have chosen alternative componentization models long ago but
>>>> held back due to complexity. We were happy to lose some of the power
>>>> to ditch most of the complexity. I am not saying this concept goes too
>>>> far, but we always need to find the right sweet spot, and that
>>>> inherently means deciding when we have "enough capability vs
>>>> complexity".
>>>>
>>>> Given the clearly very early stage of this discussion, my personal
>>>> preference is to see it end up as a feature proposal/design doc on
>>>> the wiki page with some of the others. We have seen over time that
>>>> folks in the community who are not actively watching the mailing lists
>>>> do notice those proposals and tend to bring them up or want to engage
>>>> on them later. The Wiki/feature proposal section makes that easier.
>>>>
>>>> Thanks
>>>> Joe
>>>>
>>>> On Wed, Mar 9, 2016 at 7:32 AM, Oleg Zhurakousky
>>>> <[email protected]> wrote:
>>>>> Well, sure, there is the obvious: two instances of the same NAR = two
>>>>> instances of the same class in memory. But that’s a very small price to
>>>>> pay when you realize that the current state of things can simply render
>>>>> NiFi unusable. In fact we already had a similar issue with the HDFS
>>>>> processors (https://issues.apache.org/jira/browse/NIFI-1536), which have
>>>>> UGI Kerberos code that uses the same static initializer model. I patched
>>>>> it a few weeks ago, but I must say it’s a true patch (a band-aid) to
>>>>> address an immediate problem; the core issue is still there and could
>>>>> resurface at any time.
>>>>>
>>>>> As for the complexity, sure, ClassLoaders are one of those areas in Java
>>>>> that is generally perceived as complex. I happen to navigate it with ease
>>>>> due to things I’ve done in a previous life, so I can help (with code and
>>>>> documentation to ensure it’s maintainable), but wanted to see what the
>>>>> general feel is.
>>>>>
>>>>> Keep in mind, the concept is not new. In fact I’d go as far as saying
>>>>> it’s pretty much a standard in server architectures. On top of that, we
>>>>> are kind of heading in that direction anyway: the minute we introduce the
>>>>> Extension Registry with versioning, we bring in on-demand deployment, and
>>>>> at that point a ClassLoader per instance will become the most natural and
>>>>> simple thing to do, so we might as well start earlier.
>>>>>
>>>>> Cheers
>>>>> Oleg
>>>>>
>>>>>
>>>>>> On Mar 8, 2016, at 7:35 PM, Tony Kurc <[email protected]> wrote:
>>>>>>
>>>>>> Oleg,
>>>>>> What do you think are the downsides of doing this? Memory usage?
>>>>>> Additional complexity?
>>>>>>
>>>>>> Tony
>>>>>> On Mar 8, 2016 9:54 AM, "Oleg Zhurakousky" <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Was wondering what others are thinking on the following:
>>>>>>>
>>>>>>> We have several components (Processors, ControllerServices etc.), both
>>>>>>> existing and coming down the pipeline, which rely on class-level
>>>>>>> initializers (see the example below from the new SNMP PR):
>>>>>>>
>>>>>>>     SecurityModels.getInstance().addSecurityModel(usm);
>>>>>>>
>>>>>>> While it’s a common pattern for certain types of use cases, it doesn’t
>>>>>>> go well with the flexibility we try to promote within NiFi,
>>>>>>> specifically the ability to have two different components that rely on
>>>>>>> such initializers being different or in different states. This is
>>>>>>> because multiple instances of the same component will be loaded by the
>>>>>>> same NAR ClassLoader, and since such initializers maintain their state
>>>>>>> at the class level (singleton), that state is shared across all
>>>>>>> instances of the component. So the above example will set the security
>>>>>>> model for a processor where such a security model is required, and it
>>>>>>> will immediately be visible to another instance of the same processor
>>>>>>> type where it may not be required or supported, causing errors that are
>>>>>>> hard to explain and debug.
>>>>>>>
>>>>>>> There is a simple ClassLoader trick that we can discuss and implement
>>>>>>> to alleviate this (I’ve done it for another processor that is coming
>>>>>>> down the pipeline), but first I would like to know what others think,
>>>>>>> since the more I think about it, the more I feel it is a global concern
>>>>>>> and as such would be better addressed at the framework level.
>>>>>>>
>>>>>>> Thoughts
>>>>>>>
>>>>>>> Oleg
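
The shared-state problem from the original message, and the general shape of a per-instance ClassLoader workaround, can be sketched in plain Java. This is only an illustration under stated assumptions, not the trick Oleg refers to and not NiFi code: `SharedRegistry`, `IsolatingClassLoader`, and `addAndCount` are hypothetical names, and `SharedRegistry` merely stands in for a class-level singleton such as `SecurityModels`. The loader re-defines that one class from its own bytes instead of delegating to its parent, so each loader instance gets a separate copy of the static state.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ClassLoaderIsolationSketch {

    /** Hypothetical stand-in for a library singleton: state lives at class level. */
    public static class SharedRegistry {
        private static final List<String> MODELS = new ArrayList<>();
        public static void add(String model) { MODELS.add(model); }
        public static int size() { return MODELS.size(); }
    }

    /** Defines one named class itself (child-first); everything else goes to the parent. */
    public static class IsolatingClassLoader extends ClassLoader {
        private final String isolatedName;

        public IsolatingClassLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Read the class bytes through the parent, but define them in this loader.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) throw new ClassNotFoundException(name);
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Adds a model via the registry copy visible to 'loader' and returns that copy's size. */
    public static int addAndCount(ClassLoader loader, String model) {
        try {
            Class<?> registry = Class.forName(SharedRegistry.class.getName(), true, loader);
            registry.getMethod("add", String.class).invoke(null, model);
            return (Integer) registry.getMethod("size").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader app = ClassLoaderIsolationSketch.class.getClassLoader();
        String name = SharedRegistry.class.getName();

        // Same loader: the second "instance" sees the first one's registration.
        System.out.println("shared:   " + addAndCount(app, "usm") + ", " + addAndCount(app, "other"));

        // One loader per instance: each gets its own copy of the static state.
        ClassLoader a = new IsolatingClassLoader(app, name);
        ClassLoader b = new IsolatingClassLoader(app, name);
        System.out.println("isolated: " + addAndCount(a, "usm") + ", " + addAndCount(b, "other"));
    }
}
```

With a single loader the counts come out as 1 and then 2, because both calls hit the same static list; with one loader per instance both counts are 1. The sketch also makes the cost Joe raises visible: every additional loader defines the class again, so the number of loaded copies grows with the number of instances.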
