Re: [DISCUSS] - Class loader isolation per instance of the component

Oleg Zhurakousky Wed, 09 Mar 2016 06:07:13 -0800

Matt

We are (I am) working on a Kerberos CS, but keep in mind that I am using
Kerberos and UGI only as an example. There are many other cases like this. The
bottom line is we need a mechanism to provide CL isolation per instance where
it’s warranted.
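
To make the idea concrete, here is a minimal, self-contained sketch of per-instance isolation. It is not NiFi's NAR class loading code. Registry and IsolatingLoader are hypothetical names invented for illustration, with Registry standing in for any library that keeps state at the class level. The sketch also assumes the compiled class files can be read as resources through the parent class loader.

```java
import java.io.IOException;
import java.io.InputStream;

public class IsolationSketch {

    /** Hypothetical stand-in for a library that keeps state at the class level. */
    public static class Registry {
        private static String model = "none";
        public static void set(String m) { model = m; }
        public static String get() { return model; }
    }

    /** Defines one named class itself and delegates everything else to the parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolated;

        IsolatingLoader(ClassLoader parent, String isolated) {
            super(parent);
            this.isolated = isolated;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolated)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Re-read the class bytes and define a fresh copy in this loader.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) throw new ClassNotFoundException(name);
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Sets the state through one loader, then reads it back through both. */
    public static String[] demo() {
        try {
            String name = Registry.class.getName();
            ClassLoader parent = IsolationSketch.class.getClassLoader();
            Class<?> a = new IsolatingLoader(parent, name).loadClass(name);
            Class<?> b = new IsolatingLoader(parent, name).loadClass(name);
            a.getMethod("set", String.class).invoke(null, "USM");
            return new String[] {
                (String) a.getMethod("get").invoke(null),
                (String) b.getMethod("get").invoke(null),
                String.valueOf(a == b)
            };
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", demo()));
    }
}
```

Each loader ends up with its own copy of Registry, so a value set through one copy is not visible through the other. The cost is the one raised elsewhere in this thread: the class is defined once per loader, so memory use grows with the number of isolated instances.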


Oleg

> On Mar 9, 2016, at 8:54 AM, Matt Burgess <[email protected]> wrote:
> 
> Why not make UGI/Kerb stuff a Controller Service? Each CS instance can have 
> class loader isolation and can be shared among processor instances that want 
> to share UGI/Kerb/service config?
> 
> Sent from my iPhone
> 
>> On Mar 9, 2016, at 8:50 AM, Oleg Zhurakousky <[email protected]> 
>> wrote:
>> 
>> Tony
>> Interesting choice of words “at the discretion. . .”. That can certainly be
>> a happy medium, as I do agree that 90% of them won’t need that. Furthermore,
>> for the extensions that are developed/managed by NiFi we can apply “the
>> discretion” behind the scenes, thus not putting a burden on the DFM. For
>> example, knowing that the HDFS bundle uses UGI, we can apply CL isolation on
>> that bundle but not others. The same would go for SNMP and others that we
>> know. Then it only leaves the ones we can’t control (user NARs developed
>> internally). But that could be addressed through documentation on how to
>> develop/deploy a bundle that requires CL isolation.
>> 
>> Cheers
>> Oleg
>> 
>>> On Mar 9, 2016, at 8:33 AM, Tony Kurc <[email protected]> wrote:
>>> 
>>> Joe,
>>> I had similar thoughts about the additional resource usage this adds, which
>>> is why I thought exploring the downsides was apropos. I think one thing I
>>> initially thought was that a lot of the "simple" core processors wouldn't
>>> gain any advantage by doing this... but honestly, I'm not sure what is a
>>> bigger problem, having a lot of processors and this chewing up resources,
>>> or inability to put the combination of processors / configurations together
>>> that you need. Could we have the best of both worlds by having a
>>> classloader per processor at the discretion of either the operator (or
>>> maybe processor developer)? This adds more code complexity, for sure.
>>> 
>>> I think I mentioned this before on a pull request a long time ago, it sure is
>>> hard to make decisions like this without having better metrics about who is
>>> using what part of nifi. I wonder what a way of reporting finer grained
>>> usage would look like in the apache world.
>>> 
>>> 
>>>> On Wed, Mar 9, 2016 at 8:14 AM, Joe Witt <[email protected]> wrote:
>>>> 
>>>> There are clear benefits to having the notion of extension isolation
>>>> be as narrow as a single instance of that extension in the flow.
>>>> However, there are also some important questions that must be
>>>> answered.
>>>> 
>>>> A quick one that comes to mind is the idea of a classloader per
>>>> extension instance means the same classes will not only be added many
>>>> times but they'll be added potentially an unbounded number of times as
>>>> a given flow grows.  Today the number of classloaders is bounded/set
>>>> at startup.  This is something we need good numbers on in terms of
>>>> overhead/cost.  We could address this by letting in hints that say we
>>>> can collapse/share classloaders and so on.  But then we also need to
>>>> be careful how far we go with this.
>>>> 
>>>> We could have chosen alternative componentization models long ago but
>>>> held back due to complexity.  We were happy to lose some of the power
>>>> to ditch most of the complexity.  Not saying this concept goes too far
>>>> but saying we need to always figure out the right sweet spot and that
>>>> inherently means when we have "enough capability vs complexity".
>>>> 
>>>> Given the clearly very early stage of this discussion my personal
>>>> preference is to see it end up as a feature proposal/design doc on
>>>> the wiki page with some of the others.  We have seen over time that
>>>> folks in the community not actively watching the mailing lists do
>>>> notice those proposals and tend to bring them up or want to engage on
>>>> them later.  The Wiki/feature proposal section makes that easier.
>>>> 
>>>> Thanks
>>>> Joe
>>>> 
>>>> On Wed, Mar 9, 2016 at 7:32 AM, Oleg Zhurakousky
>>>> <[email protected]> wrote:
>>>>> Well, sure there is the obvious: two instances of the same NAR = two
>>>>> instances of the same class in memory. But that’s a very small price to
>>>>> pay when realizing that the current state of things can simply render
>>>>> NiFi unusable. In fact we already had a similar issue with the HDFS
>>>>> processors (https://issues.apache.org/jira/browse/NIFI-1536), which have
>>>>> UGI Kerberos code that uses the same static initializer model. I patched
>>>>> it a few weeks ago, but I must say it’s a true patch (a band-aid) to
>>>>> address an immediate problem; the core issue is still there and could
>>>>> resurface at any time.
>>>>> 
>>>>> As for the complexity, sure, ClassLoaders are one of those areas in Java
>>>>> that is generally perceived as complex. I happen to navigate it with ease
>>>>> due to things I’ve done in a previous life, so I can help (with code and
>>>>> documentation to ensure it’s maintainable), but wanted to see what the
>>>>> general feel is.
>>>>> 
>>>>> Keep in mind, the concept is not new. In fact I’d go as far as saying
>>>>> it’s pretty much a standard in server architectures. On top of that, we
>>>>> are kind of heading in that direction anyway, since the minute we
>>>>> introduce the Extension Registry with versioning, we bring on-demand
>>>>> deployment, and at that time ClassLoader per instance will become the
>>>>> most natural and simple thing to do, so we might as well start earlier.
>>>>> 
>>>>> Cheers
>>>>> Oleg
>>>>> 
>>>>> 
>>>>>> On Mar 8, 2016, at 7:35 PM, Tony Kurc <[email protected]> wrote:
>>>>>> 
>>>>>> Oleg,
>>>>>> What do you think are the downsides of doing this? Memory usage?
>>>>>> Additional complexity?
>>>>>> 
>>>>>> Tony
>>>>>> On Mar 8, 2016 9:54 AM, "Oleg Zhurakousky" <[email protected]> wrote:
>>>>>> 
>>>>>>> Was wondering what others are thinking on the following:
>>>>>>> 
>>>>>>> We have several components (Processors, ControllerServices etc.), both
>>>>>>> existing and coming down the pipeline, which rely on class-level
>>>>>>> initializers (see the example below from the new SNMP PR):
>>>>>>> SecurityModels.getInstance().addSecurityModel(usm);
>>>>>>> While it’s a common pattern for certain types of use cases, it doesn’t
>>>>>>> go well with the flexibility we try to promote within NiFi, specifically
>>>>>>> the ability to have two different components that rely on such
>>>>>>> initializers being different or in different states. This is because
>>>>>>> multiple instances of the same component will be loaded by the same NAR
>>>>>>> ClassLoader, and since such initializers maintain the state at the class
>>>>>>> level (singleton), that state is shared across all instances of the
>>>>>>> component. So, the above example will set a security model for a
>>>>>>> processor where such a security model was required, and it will
>>>>>>> immediately be available to another instance of the same processor type
>>>>>>> where it may not be required or supported, causing hard to explain/debug
>>>>>>> errors.
>>>>>>> 
>>>>>>> There is a simple ClassLoader trick that we can discuss and implement
>>>>>>> to alleviate this (I’ve done it for another processor that is coming
>>>>>>> down the pipeline), but first I would like to know what others think,
>>>>>>> since the more I think about it the more I feel it is a global concern
>>>>>>> and as such would be better addressed at the framework level.
>>>>>>> 
>>>>>>> Thoughts
>>>>>>> 
>>>>>>> Oleg
>> 
>
