[ https://issues.apache.org/jira/browse/NIFI-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447597#comment-17447597 ]
ASF subversion and git services commented on NIFI-9382:
-------------------------------------------------------
Commit 839fbf7d19a428069355d7bf79b8df7fa68b30a3 in nifi's branch
refs/heads/main from markap14
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=839fbf7 ]
NIFI-9382: Created a new ClassloaderIsolationKey mechanism by which H… (#5533)
* NIFI-9382: Created a new ClassloaderIsolationKey mechanism by which Hadoop
related processors (and potentially others) can indicate that they need full
classloaders to be cloned but can share with other instances in certain
circumstances
- Added system tests
* NIFI-9382: Renamed interface based on review feedback
* NIFI-9382: Removed ReentrantKerberosUser.
> Improve startup time when loading flow that uses many HDFS related processors
> -----------------------------------------------------------------------------
>
> Key: NIFI-9382
> URL: https://issues.apache.org/jira/browse/NIFI-9382
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 1.16.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When starting NiFi, if a flow has many HDFS-related processors (hundreds to
> thousands), the startup time can be very long. In one case, I have a user flow
> that has more than 1000 HDFS processors, and it takes 1-2 hours to fully start
> NiFi. This is because the HDFS client makes a lot of assumptions about the
> environment
> that it's running in. These assumptions are not always true, unfortunately,
> when running in NiFi. The use of static methods in the UserGroupInformation
> class means that in order to interact with an HDFS cluster using multiple
> Kerberos Principals, we have to create ClassLoader isolation, using a
> separate, duplicate ClassLoader for each HDFS processor.
> Because of this, the HDFS client components must be initialized once for each
> processor, and the initialization of the client is very expensive. We need to
> improve this so that we don't create a separate ClassLoader that loads
> hundreds or thousands of classes for each instance of the Processor.
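
A minimal sketch of the idea described above, not the code from the commit: instead of cloning a ClassLoader for every processor instance, clone one per "isolation key" and let instances that report the same key share it. The class name, method names, the use of a Kerberos principal string as the key, and the rule that a null key means "never share" are all assumptions made for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names are taken from the NiFi codebase.
final class IsolationKeySketch {

    // One cloned ClassLoader per isolation key, rather than one per processor instance.
    private final Map<String, ClassLoader> loadersByKey = new ConcurrentHashMap<>();
    private final URL[] classpath;
    private final ClassLoader parent;

    IsolationKeySketch(URL[] classpath, ClassLoader parent) {
        this.classpath = classpath.clone();
        this.parent = parent;
    }

    // Assumption: a null key means the instance cannot share, so it gets a fresh ClassLoader.
    ClassLoader loaderFor(String isolationKey) {
        if (isolationKey == null) {
            return new URLClassLoader(classpath, parent);
        }
        // Instances with equal keys receive the same ClassLoader; the expensive
        // class loading and static initialization then happen once per key.
        return loadersByKey.computeIfAbsent(isolationKey, k -> new URLClassLoader(classpath, parent));
    }

    int sharedLoaderCount() {
        return loadersByKey.size();
    }

    public static void main(String[] args) {
        IsolationKeySketch sketch =
                new IsolationKeySketch(new URL[0], ClassLoader.getSystemClassLoader());

        // Two processors using the same (made-up) principal share a ClassLoader.
        ClassLoader first = sketch.loaderFor("alice@EXAMPLE.COM");
        ClassLoader second = sketch.loaderFor("alice@EXAMPLE.COM");
        // A different principal gets its own isolated ClassLoader.
        ClassLoader other = sketch.loaderFor("bob@EXAMPLE.COM");

        System.out.println("same principal shares: " + (first == second));
        System.out.println("different principal isolated: " + (first != other));
    }
}
```

With a scheme like this, a flow with a thousand HDFS processors that all use a handful of principals would pay the client initialization cost a handful of times rather than a thousand times.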