Why not make UGI/Kerb stuff a Controller Service? Each CS instance can have 
class loader isolation and can be shared among processor instances that want to 
share UGI/Kerb/service config?

> On Mar 9, 2016, at 8:50 AM, Oleg Zhurakousky <[email protected]> 
> wrote:
> 
> Tony
> Interesting choice of words “at the discretion. . .”. That can certainly be a 
> happy medium as I do agree that 90% of them won’t need that. Furthermore, 
> for the extensions that are developed/managed by NiFi we can apply “the 
> discretion” behind the scenes, thus not putting a burden on the DFM. For 
> example, knowing that the HDFS bundle uses UGI we can apply CL isolation on
> that bundle but not others. The same would go for SNMP and others that we know.
> Then it only leaves the ones we can’t control (user NARs developed 
> internally). But that could be addressed through documentation on how to 
> develop/deploy a bundle that requires CL isolation. 
> 
> Cheers
> Oleg
> 
>> On Mar 9, 2016, at 8:33 AM, Tony Kurc <[email protected]> wrote:
>> 
>> Joe,
>> I had similar thoughts about the additional resource usage this adds, which
>> is why I thought exploring the downsides was apropos. I think one thing I
>> initially thought was that a lot of the "simple" core processors wouldn't
>> gain any advantage by doing this... but honestly, I'm not sure what is a
>> bigger problem, having a lot of processors and this chewing up resources,
>> or inability to put the combination of processors / configurations together
>> that you need. Could we have the best of both worlds by having a
>> classloader per processor at the discretion of either the operator (or
>> maybe processor developer)? This adds more code complexity, for sure.
>> 
>> I think I mentioned this before on a pull request a long time ago: it sure is
>> hard to make decisions like this without having better metrics about who is
>> using what part of nifi. I wonder what a way of reporting finer grained
>> usage would look like in the apache world.
>> 
>> 
>>> On Wed, Mar 9, 2016 at 8:14 AM, Joe Witt <[email protected]> wrote:
>>> 
>>> There are clear benefits to having the notion of extension isolation
>>> be as narrow as a single instance of that extension in the flow.
>>> However, there are also some important questions that must be
>>> answered.
>>> 
>>> A quick one that comes to mind is that a classloader per
>>> extension instance means the same classes will not only be added many
>>> times but they'll be added potentially an unbounded number of times as
>>> a given flow grows.  Today the number of classloaders is bounded/set
>>> at startup.  This is something we need good numbers on in terms of
>>> overhead/cost.  We could address this by letting in hints that say we
>>> can collapse/share classloaders and so on.  But then we also need to
>>> be careful how far we go with this.
>>> 
>>> We could have chosen alternative componentization models long ago but
>>> held back due to complexity.  We were happy to lose some of the power
>>> to ditch most of the complexity.  Not saying this concept goes too far
>>> but saying we need to always figure out the right sweet spot and that
>>> inherently means when we have "enough capability vs complexity".
>>> 
>>> Given the clearly very early stage of this discussion my personal
>>> preference is to see it end up as a feature proposal/design doc on
>>> the wiki page with some of the others.  We have seen over time that
>>> folks in the community not actively watching the mailing lists do
>>> notice those proposals and tend to bring them up or want to engage on
>>> them later.  The Wiki/feature proposal section makes that easier.
>>> 
>>> Thanks
>>> Joe
>>> 
>>> On Wed, Mar 9, 2016 at 7:32 AM, Oleg Zhurakousky
>>> <[email protected]> wrote:
>>>> Well, sure there is the obvious; two instances of the same NAR = two
>>> instances of the same class in memory. But that’s a very small price to pay
>>> when realizing that the current state of things can simply render NiFi
>>> unusable. In fact we already had a similar issue with HDFS processors (
>>> https://issues.apache.org/jira/browse/NIFI-1536) that has UGI Kerberos
>>> code which uses the same static initializer model. I’ve patched it a few
>>> weeks ago, but I must say it’s a true patch (a band-aid) to address an
>>> immediate problem, but the core issue is still there and could resurrect
>>> itself at any time.
>>>> 
>>>> As for the complexity, sure, ClassLoaders are one of those areas in Java
>>> that is generally perceived as complex. I happen to navigate it with ease
>>> due to things I’ve done in the previous life, so I can help (with code and
>>> documentation to ensure it’s maintainable), but wanted to see what the
>>> general feel is.
>>>> 
>>>> Keep in mind, the concept is not new. In fact I’d go as far as saying
>>> it’s pretty much a standard in server architectures. On top of that we are
>>> kind of heading in that direction anyway, since the minute we introduce Extension
>>> Registry with versioning, we bring on-demand deployment and at that time
>>> ClassLoader per instance will become the most natural and simple thing to
>>> do, so might as well start earlier.
>>>> 
>>>> Cheers
>>>> Oleg
>>>> 
>>>> 
>>>>> On Mar 8, 2016, at 7:35 PM, Tony Kurc <[email protected]> wrote:
>>>>> 
>>>>> Oleg,
>>>>> What do you think are the downsides of doing this? Memory usage?
>>> Additional
>>>>> complexity?
>>>>> 
>>>>> Tony
>>>>> On Mar 8, 2016 9:54 AM, "Oleg Zhurakousky" <
>>> [email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Was wondering what others are thinking on the following:
>>>>>> 
>>>>>> We have several components (Processors, ControllerServices etc.) both
>>>>>> existing and coming down the pipeline which rely on class-level
>>>>>> initializers (see example below from new SNMP PR)
>>>>>> SecurityModels.getInstance().addSecurityModel(usm);
>>>>>> While it’s a common pattern for certain types of use cases it doesn’t
>>> go
>>>>>> well with the flexibility we try to promote within NiFi. Specifically
>>> the
>>>>>> ability to have two different components that rely on such initializers
>>>>>> being different or in different states. This is because multiple
>>> instances
>>>>>> of the same component will be loaded by the same NAR ClassLoader and
>>> since
>>>>>> such initializers maintain the state at the class level (singleton),
>>> they
>>>>>> are shared across all instances of the component. So, the above example
>>>>>> will set security model for a processor where such security model was
>>>>>> required and it will immediately be available to another instance of
>>> the
>>>>>> same processor type where it may not be required or supported, causing
>>> hard
>>>>>> to explain/debug errors.
>>>>>> 
>>>>>> There is a simple ClassLoader trick that we can discuss and implement
>>> to
>>>>>> alleviate this (I’ve done it for another processor that is coming down
>>> the
>>>>>> pipeline), but first I would like to know what others think, since the
>>> more
>>>>>> I think about it the more I feel it is a global concern and as such
>>> would be
>>>>>> better addressed at the framework level.
>>>>>> 
>>>>>> Thoughts?
>>>>>> 
>>>>>> Oleg
> 
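
For readers following the thread, the class-level state problem and the per-instance ClassLoader idea discussed above can be illustrated with a small, self-contained sketch. It is not NiFi code and not the trick Oleg refers to. `Registry` is a hypothetical stand-in for a library singleton such as `SecurityModels`, and `IsolatingLoader` is an illustrative name. The sketch assumes the class file of `Registry` can be read as a resource from the loader that loaded the demo class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderIsolationDemo {

    /** Hypothetical stand-in for a library singleton that keeps class-level state. */
    public static class Registry {
        private static final List<String> MODELS = new ArrayList<>();
        public static void add(String model) { MODELS.add(model); }
        public static int size() { return MODELS.size(); }
    }

    /** Defines its own copy of one named class; delegates everything else to the parent. */
    public static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        public IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Assumption: the parent loader can serve the class bytes as a resource.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Adds one entry through a fresh isolated copy of Registry and returns that copy's size. */
    public static int addInIsolation(String model) throws Exception {
        ClassLoader parent = ClassLoaderIsolationDemo.class.getClassLoader();
        String name = Registry.class.getName();
        Class<?> isolated = new IsolatingLoader(parent, name).loadClass(name);
        isolated.getMethod("add", String.class).invoke(null, model);
        return (Integer) isolated.getMethod("size").invoke(null);
    }

    public static void main(String[] args) throws Exception {
        // Two "component instances" sharing one loader also share the static list.
        Registry.add("usm");
        Registry.add("tsm");
        System.out.println("shared copy size: " + Registry.size());
        // Each isolated copy starts with its own empty static list.
        System.out.println("isolated copy A size: " + addInIsolation("usm"));
        System.out.println("isolated copy B size: " + addInIsolation("usm"));
    }
}
```

Each call to `addInIsolation` sees only its own entry, whatever has been added to the shared copy. The cost is the one Joe and Tony raise: every isolated loader holds its own copy of the class, so memory use grows with the number of instances.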
