Hey Dann,

I've spent most of the afternoon debugging this and you are correct,
there are actually a couple of issues that are causing the
InstanceClassLoader to stick around. One of them is what you mentioned
with the scheduled states (now called lifecycle states), nice find!!!

I have a few of them fixed here [1], but there is one more issue I'm
still looking into that I will have to come back to on Monday.

The one you might find most interesting is that the Hadoop FileSystem
class starts a thread that it never stops and provides no access to:

https://github.com/apache/hadoop/blob/release-3.0.0-RC1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3655-L3665

I was able to use reflection to make the field accessible, get a
reference to the thread, and call interrupt() to get it to stop.
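
For reference, here is roughly what that looks like. This is only a
sketch: the Statistics inner class and the STATS_DATA_CLEANER field are
what I see in the 3.0.0 source linked above, so verify them against the
Hadoop version actually on the classpath, and hadoopClassLoader stands
in for the processor's instance classloader:

    import java.lang.reflect.Field;

    public class StatsCleanerStopper {
        public static void stop(ClassLoader hadoopClassLoader) throws Exception {
            Class<?> stats = Class.forName(
                "org.apache.hadoop.fs.FileSystem$Statistics",
                false, hadoopClassLoader);
            Field f = stats.getDeclaredField("STATS_DATA_CLEANER");
            f.setAccessible(true);
            Thread cleaner = (Thread) f.get(null);
            if (cleaner != null) {
                cleaner.interrupt(); // the cleaner's run() exits once interrupted
                cleaner.join(1000);  // give it a moment to stop
            }
        }
    }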

I'll report back once I know more about the last issue I was looking into.

Thanks,

Bryan

[1] https://github.com/bbende/nifi/commits/stop-leaking-processors


On Fri, Apr 27, 2018 at 2:30 PM, Dann <[email protected]> wrote:
> This might be related to NIFI-5110.  Since the StandardProcessScheduler is
> keeping a reference to the StandardProcessorNode in its scheduleStates map,
> the PutHDFS cannot be garbage collected, and thus the InstanceClassLoader
> cannot be cleaned up.
>
> Note: It looks like between 1.5.0 and 1.6.0, the StandardProcessScheduler
> changed a little and scheduleStates was renamed to lifecycleStates.
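>
> Toy illustration of the effect (this is not the real scheduler code, just
> a sketch): as long as an entry like this is never removed when the
> processor is deleted, the node and the InstanceClassLoader it references
> stay strongly reachable:
>
>     import java.util.Map;
>     import java.util.concurrent.ConcurrentHashMap;
>
>     // Toy sketch only -- not NiFi's StandardProcessScheduler.
>     class LifecycleStateLeakSketch {
>         // long-lived map keyed by the processor node
>         static final Map<Object, Object> lifecycleStates =
>             new ConcurrentHashMap<>();
>
>         static void schedule(Object processorNode) {
>             lifecycleStates.put(processorNode, new Object());
>         }
>
>         static void delete(Object processorNode) {
>             // without this remove(), processorNode (and the classloader
>             // it points to) can never be garbage collected
>             lifecycleStates.remove(processorNode);
>         }
>     }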
>
> -Dann
>
> On Fri, Apr 27, 2018 at 8:48 AM Dann <[email protected]> wrote:
>>
>> The behavior is the same with both 1.2.0 and 1.5.0.
>>
>> Either way, something is keeping that instance classloader from being
>> garbage collected.  Removing instance class loading (if possible) would
>> fix the problem.  If instance class loading is needed, then finding out
>> why that classloader can't be cleaned up would be the fix.
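>>
>> One way to chase down what is pinning it is a live-only heap dump and
>> then following the InstanceClassLoader's path to GC roots in a tool like
>> Eclipse MAT.  Rough sketch (run inside the NiFi JVM; jmap -dump works
>> just as well from the outside):
>>
>>     import java.lang.management.ManagementFactory;
>>     import com.sun.management.HotSpotDiagnosticMXBean;
>>
>>     public class HeapDumper {
>>         public static void dump(String path) throws Exception {
>>             HotSpotDiagnosticMXBean hs = ManagementFactory
>>                 .getPlatformMXBean(HotSpotDiagnosticMXBean.class);
>>             hs.dumpHeap(path, true); // true = live (reachable) objects only
>>         }
>>     }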
>>
>>
>> On Fri, Apr 27, 2018 at 8:38 AM Bryan Bende <[email protected]> wrote:
>>>
>>> Is the behavior the same in 1.5.0, or only in 1.2.0?
>>>
>>> We can't undo the instance class loading for Hadoop processors unless
>>> we undo the static usage of UserGroupInformation.
>>>
>>> I think it is possible to go back to the non-static usage, since the
>>> kerberos issues were actually resolved by the useSubjectCredsOnly
>>> property, but it would take a bit of work and testing and would not be
>>> as easy as just removing the instance class loading annotation.
>>>
>>>
>>> On Fri, Apr 27, 2018 at 10:25 AM, Joe Witt <[email protected]> wrote:
>>> > Dann
>>> >
>>> > The hadoop processors switched to 'instance class loading' around
>>> > version 1.2.0.  Components from the various nars have always lived in
>>> > their own classloader, but for the Hadoop processors we went a step
>>> > further and create a new classloader for every single instance of the
>>> > component.  This was done to overcome what appears to be problematic
>>> > usage of static values in the hadoop client code.  However, it is
>>> > possible we've changed our usage enough that these would no longer
>>> > create problems for us, and we could consider going back to typical
>>> > class loading.
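>>> >
>>> > In case it helps picture it, instance class loading is conceptually
>>> > something like the following.  This is a simplified sketch, not NiFi's
>>> > actual nar classloading code; narJarUrls and narParent are stand-ins
>>> > for the nar's jars and its shared parent classloader:
>>> >
>>> >     import java.net.URL;
>>> >     import java.net.URLClassLoader;
>>> >
>>> >     class InstanceClassLoadingSketch {
>>> >         // each component instance gets its own classloader over the
>>> >         // same jars, so static state in the hadoop client is isolated
>>> >         // per instance
>>> >         static Object newInstance(URL[] narJarUrls, ClassLoader narParent)
>>> >                 throws Exception {
>>> >             ClassLoader cl = new URLClassLoader(narJarUrls, narParent);
>>> >             Class<?> c = Class.forName(
>>> >                 "org.apache.nifi.processors.hadoop.PutHDFS", true, cl);
>>> >             return c.getDeclaredConstructor().newInstance();
>>> >         }
>>> >     }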
>>> >
>>> > Would be good to hear others' thoughts.
>>> >
>>> > Thanks
>>> >
>>> > On Fri, Apr 27, 2018 at 10:20 AM, Dann <[email protected]> wrote:
>>> >> NiFi Versions: 1.2.0 and 1.5.0
>>> >> Java Version: 1.8.0_162
>>> >>
>>> >> It appears to me that when I add a PutHDFS processor to the canvas,
>>> >> the hadoop classes are loaded along with their dependencies
>>> >> (normal/desired behavior).  Then if I delete the PutHDFS processor,
>>> >> garbage collection is not able to unload any of the classes that were
>>> >> loaded.  The behavior is the same for every PutHDFS processor that is
>>> >> added and then deleted.  This results in the slow degradation of NiFi
>>> >> over time with the way we use NiFi.
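>>> >>
>>> >> For what it's worth, one way to watch this is the standard class
>>> >> loading MXBean (the same numbers jconsole shows under the Classes
>>> >> tab); after deleting the processor and forcing a full GC the
>>> >> loaded-class count should drop, but it never does.  Sketch, run
>>> >> inside the NiFi JVM:
>>> >>
>>> >>     import java.lang.management.ClassLoadingMXBean;
>>> >>     import java.lang.management.ManagementFactory;
>>> >>
>>> >>     public class ClassCountLogger {
>>> >>         public static void log() {
>>> >>             ClassLoadingMXBean cl =
>>> >>                 ManagementFactory.getClassLoadingMXBean();
>>> >>             System.gc(); // request a full GC before sampling
>>> >>             System.out.println("loaded=" + cl.getLoadedClassCount()
>>> >>                 + " unloaded=" + cl.getUnloadedClassCount());
>>> >>         }
>>> >>     }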
>>> >>
>>> >> Our usage of NiFi does not expose the NiFi interface for those that
>>> >> are designing data flows.  We have a web interface that allows the
>>> >> user to define the data they need and our application will build data
>>> >> flows for the user.  As we upgrade our application we make changes
>>> >> that require us to remove the old data flow and rebuild it with the
>>> >> new changes.  Over time, we add and remove a lot of processors using
>>> >> the NiFi API and some of those processors are HDFS related.
>>> >>
>>> >> To me, there seems to be a problem with the Hadoop dependency loading
>>> >> and processor implementation that doesn't allow the JVM to unload
>>> >> those classes when they are no longer needed.
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Dann
