This might be related to NIFI-5110.  Since the StandardProcessScheduler
keeps a reference to the processor through a StandardProcessorNode in the
scheduleStates map, the deleted PutHDFS instance cannot be garbage
collected and thus its InstanceClassLoader cannot be cleaned up.

Note: It looks like the StandardProcessScheduler changed a little between
1.5.0 and 1.6.0, and scheduleStates was renamed to lifecycleStates.
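
To illustrate the retention chain I mean, here is a minimal, self-contained
sketch (the classes below are simplified stand-ins for the NiFi ones, not
the actual framework code):

    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class SchedulerLeakSketch {

        // Stand-in for the per-instance class loader created for Hadoop processors.
        static class InstanceClassLoader extends URLClassLoader {
            InstanceClassLoader() {
                super(new URL[0], SchedulerLeakSketch.class.getClassLoader());
            }
        }

        // Stand-in for StandardProcessorNode: it references the processor, which
        // was loaded by (and therefore pins) its InstanceClassLoader.
        static class ProcessorNode {
            final ClassLoader instanceClassLoader = new InstanceClassLoader();
        }

        // Stand-in for the scheduling state kept per component.
        static class LifecycleState {
        }

        // Stand-in for the scheduler's scheduleStates / lifecycleStates map.
        static final Map<ProcessorNode, LifecycleState> scheduleStates =
                new ConcurrentHashMap<>();

        public static void main(String[] args) {
            ProcessorNode node = new ProcessorNode();
            scheduleStates.put(node, new LifecycleState());

            // "Delete" the processor from the flow...
            node = null;

            // ...but the map entry is never removed, so the node, its processor,
            // and its InstanceClassLoader all stay strongly reachable, and the
            // JVM can never unload the classes that loader defined.
            System.out.println("retained entries: " + scheduleStates.size());
        }
    }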

-Dann

On Fri, Apr 27, 2018 at 8:48 AM Dann <[email protected]> wrote:

> The behavior is the same with both 1.2.0 and 1.5.0.
>
> If instance class loading is needed, then there is something keeping that
> instance classloader from being garbage collected.  Removing instance
> class loading (if possible) would fix the problem.  If instance class
> loading is needed, then finding the reason that classloader can't be
> cleaned up would be the fix.
>
>
> On Fri, Apr 27, 2018 at 8:38 AM Bryan Bende <[email protected]> wrote:
>
>> Is the behavior the same in 1.5.0, or only in 1.2.0?
>>
>> We can't undo the instance class loading for Hadoop processors unless
>> we undo the static usage of UserGroupInformation.
>>
>> I think it is possible to go back to the non-static usage, since the
>> kerberos issues were actually resolved by the useSubjectCredsOnly
>> property, but it would take a bit of work and testing and would not be
>> as easy as just removing the instance class loading annotation.
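>>
>> For reference, that is the standard JVM property
>> javax.security.auth.useSubjectCredsOnly.  In NiFi it can be passed to
>> the JVM through conf/bootstrap.conf, e.g. (the arg number here is
>> arbitrary and the value is only a placeholder, since the right setting
>> depends on how the kerberos login is done):
>>
>>     # conf/bootstrap.conf
>>     java.arg.20=-Djavax.security.auth.useSubjectCredsOnly=true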
>>
>>
>> On Fri, Apr 27, 2018 at 10:25 AM, Joe Witt <[email protected]> wrote:
>> > Dann
>> >
>> > The hadoop processors, as of a version around 1.2.0, switched to
>> > 'instance class loading'.  While components from the various nars
>> > always live in their own classloader, in the case of the Hadoop
>> > processors we went a step further and create a new classloader for
>> > every single instance of the component.  This was done to overcome
>> > what appears to be problematic usage of static values in the hadoop
>> > client code.  However, it is possible we've changed our usage enough
>> > that these statics would no longer create problems for us, and we
>> > could consider going back to typical class loading.
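>> >
>> > Roughly, the difference looks like this (a simplified sketch, not the
>> > actual framework code; the URLs and parent loader are placeholders):
>> >
>> >     import java.net.URL;
>> >     import java.net.URLClassLoader;
>> >
>> >     public class ClassLoadingSketch {
>> >         public static void main(String[] args) {
>> >             URL[] narJarUrls = new URL[0];   // stand-in for the NAR's jars
>> >             ClassLoader parent = ClassLoadingSketch.class.getClassLoader();
>> >
>> >             // Typical class loading: one classloader per NAR, shared by
>> >             // every instance of every component packaged in that NAR.
>> >             ClassLoader sharedNarLoader = new URLClassLoader(narJarUrls, parent);
>> >
>> >             // Instance class loading: a brand-new classloader for every
>> >             // component instance, so each instance gets its own copy of
>> >             // any static state in the Hadoop client classes.
>> >             for (int i = 0; i < 3; i++) {
>> >                 ClassLoader perInstanceLoader = new URLClassLoader(narJarUrls, parent);
>> >                 System.out.println("instance " + i + " -> " + perInstanceLoader);
>> >             }
>> >         }
>> >     }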
>> >
>> > Would be good to hear others' thoughts.
>> >
>> > Thanks
>> >
>> > On Fri, Apr 27, 2018 at 10:20 AM, Dann <[email protected]> wrote:
>> >> NiFi Versions: 1.2.0 and 1.5.0
>> >> Java Version: 1.8.0_162
>> >>
>> >> It appears to me that when I add a PutHDFS processor to the canvas, the
>> >> hadoop classes are loaded along with their dependencies (normal/desired
>> >> behavior).  Then if I delete the PutHDFS processor, garbage collection
>> >> is not able to unload any of the classes that were loaded.  The behavior
>> >> is the same for every PutHDFS processor that is added and then deleted.
>> >> This results in the slow degradation of NiFi over time with the way we
>> >> use NiFi.
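>> >>
>> >> (For what it's worth, the loaded/unloaded class counts can be watched
>> >> through the standard ClassLoadingMXBean, e.g. in jconsole/VisualVM
>> >> against the NiFi process; the standalone snippet below is just a
>> >> generic illustration of which counters I mean, not anything
>> >> NiFi-specific:)
>> >>
>> >>     import java.lang.management.ClassLoadingMXBean;
>> >>     import java.lang.management.ManagementFactory;
>> >>
>> >>     public class ClassCountWatcher {
>> >>         public static void main(String[] args) throws InterruptedException {
>> >>             ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
>> >>             while (true) {
>> >>                 // If deleted PutHDFS instances were collectable, the
>> >>                 // unloaded count would grow after a full GC instead of
>> >>                 // staying flat while the loaded count keeps climbing.
>> >>                 System.out.printf("loaded=%d unloaded=%d%n",
>> >>                         bean.getLoadedClassCount(),
>> >>                         bean.getUnloadedClassCount());
>> >>                 Thread.sleep(10_000L);
>> >>             }
>> >>         }
>> >>     }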
>> >>
>> >> Our usage of NiFi does not expose the NiFi interface to those who are
>> >> designing data flows.  We have a web interface that allows the user to
>> >> define the data they need, and our application builds the data flows
>> >> for the user.  As we upgrade our application, we make changes that
>> >> require us to remove the old data flow and rebuild it with the new
>> >> changes.  Over time, we add and remove a lot of processors using the
>> >> NiFi API, and some of those processors are HDFS related.
>> >>
>> >> To me, there seems to be a problem with the Hadoop dependency loading
>> >> and the processor implementation that doesn't allow the JVM to unload
>> >> those classes when they are no longer needed.
>> >>
>> >> Thanks,
>> >>
>> >> Dann
>>
>
