[jira] [Created] (NIFI-5480) Improve efficiency of how components are looked up by Identifier

Mark Payne (JIRA) Wed, 01 Aug 2018 14:23:45 -0700

Mark Payne created NIFI-5480:
--------------------------------

             Summary: Improve efficiency of how components are looked up by Identifier
                 Key: NIFI-5480
                 URL: https://issues.apache.org/jira/browse/NIFI-5480
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
            Reporter: Mark Payne
            Assignee: Mark Payne



When we look up a component by ID, we do so by obtaining the Root Process Group 
and then calling {{findLocalConnectable(String id)}}. This method obtains a 
read lock, then checks its map of Processors, its map of Input Ports, its map 
of Output Ports, and its map of Funnels. If no match is found, it then calls 
{{getRemoteProcessGroups()}} and iterates over the result, looking for a Remote 
Input/Output Port with that ID. Each call to {{getRemoteProcessGroups()}} 
creates a new {{HashSet}} that is then returned. If there is still no match, we then 
call {{getProcessGroups()}} which also creates a new {{HashSet}} of 
ProcessGroup objects, and we iterate over those (recursively).
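
The lookup described above can be sketched roughly as follows. This is a simplified model, not NiFi's actual code: {{SketchGroup}}, {{SketchRemoteGroup}}, the String-valued maps and the two static counters are hypothetical stand-ins, and holding the read lock across the recursion is an assumption made only to keep the sketch short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a Remote Process Group: just a map of port id -> port.
class SketchRemoteGroup {
    final Map<String, String> ports = new HashMap<>();
}

// Hypothetical stand-in for a Process Group; components are modelled as Strings.
class SketchGroup {
    static int hashSetsCreated = 0;   // how many defensive HashSet copies were made
    static int readLocksObtained = 0; // how many read locks were acquired

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, String> processors = new HashMap<>();
    final Map<String, String> inputPorts = new HashMap<>();
    final Map<String, String> outputPorts = new HashMap<>();
    final Map<String, String> funnels = new HashMap<>();
    final Map<String, SketchRemoteGroup> remoteGroups = new HashMap<>();
    final Map<String, SketchGroup> childGroups = new HashMap<>();

    // Returns a fresh copy on every call, as described in the issue.
    Set<SketchRemoteGroup> getRemoteProcessGroups() {
        hashSetsCreated++;
        return new HashSet<>(remoteGroups.values());
    }

    // Returns a fresh copy on every call, as described in the issue.
    Set<SketchGroup> getProcessGroups() {
        hashSetsCreated++;
        return new HashSet<>(childGroups.values());
    }

    String findLocalConnectable(String id) {
        lock.readLock().lock();
        readLocksObtained++;
        try {
            // 1. Check the four local maps.
            String found = processors.get(id);
            if (found == null) found = inputPorts.get(id);
            if (found == null) found = outputPorts.get(id);
            if (found == null) found = funnels.get(id);
            if (found != null) return found;

            // 2. Check the ports of every Remote Process Group (first HashSet copy).
            for (SketchRemoteGroup remote : getRemoteProcessGroups()) {
                String port = remote.ports.get(id);
                if (port != null) return port;
            }

            // 3. Recurse into every child group (second HashSet copy).
            for (SketchGroup child : getProcessGroups()) {
                String match = child.findLocalConnectable(id);
                if (match != null) return match;
            }
            return null;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SketchGroup root = new SketchGroup();
        SketchGroup child = new SketchGroup();
        root.childGroups.put("g1", child);
        child.processors.put("p1", "Processor p1");

        System.out.println(root.findLocalConnectable("p1"));
        System.out.println("HashSets created: " + hashSetsCreated);
        System.out.println("Read locks obtained: " + readLocksObtained);
    }
}
```

Even in this tiny flow, finding a Processor that lives one level below the root costs two throwaway {{HashSet}} copies and two read locks.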

This means that for each call to look up a component by ID, we have to create 
two {{HashSet}}s for each Process Group on the canvas, until the component is 
found. Consider a flow that has a dozen Process Groups and several thousand 
Processors/ports/funnels. If we then click "Start" on the root group, we must 
create up to 24 {{HashSet}} objects and obtain 12 Read Locks. This is done for 
each component, so for 1,000 Processors it will create 24,000 {{HashSet}}s and 
obtain 12,000 Read Locks. Also, since this is a mutable request, this has to be 
done for both the first and second phase of the request, which results in a 
total of 48,000 {{HashSet}}s and 24,000 Read Locks being obtained.
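
The worst-case arithmetic above can be reproduced with a few lines of Java. The class and method names are hypothetical; the only inputs are the figures from the example (12 Process Groups, 1,000 Processors, 2 request phases), and the result is an upper bound because a lookup stops as soon as the component is found.

```java
// Hypothetical helper that reproduces the upper-bound figures from the example.
class LookupCost {
    // Two HashSets (remote groups + child groups) per Process Group visited.
    static long hashSets(int processGroups, int components, int phases) {
        return 2L * processGroups * components * phases;
    }

    // One read lock per Process Group visited.
    static long readLocks(int processGroups, int components, int phases) {
        return (long) processGroups * components * phases;
    }

    public static void main(String[] args) {
        int groups = 12;
        int processors = 1_000;
        int phases = 2; // first and second phase of the request

        System.out.println("HashSets:   " + hashSets(groups, processors, phases));
        System.out.println("Read locks: " + readLocks(groups, processors, phases));
    }
}
```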

Testing with 10,000 Processors, I am seeing requests take well over 30 seconds 
to complete, all just to find a component by identifier. We can make this much 
more efficient.
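
This issue does not prescribe a particular fix. Purely as an illustration of what "more efficient" could look like, one conceivable direction is a single flow-wide index from identifier to component, kept up to date whenever a component is added or removed, so that a lookup is one map read with no copying and no recursion. Everything in the sketch below ({{ComponentIndex}} and its methods) is hypothetical and not part of NiFi's API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical flow-wide index of components by identifier (an assumption, not NiFi code).
class ComponentIndex<T> {
    private final ConcurrentMap<String, T> byId = new ConcurrentHashMap<>();

    // Would be called whenever a component is added to any Process Group.
    void onComponentAdded(String id, T component) {
        byId.put(id, component);
    }

    // Would be called whenever a component is removed from any Process Group.
    void onComponentRemoved(String id) {
        byId.remove(id);
    }

    // Single map read: no HashSet copies, no explicit read lock, no recursion.
    T find(String id) {
        return byId.get(id);
    }

    public static void main(String[] args) {
        ComponentIndex<String> index = new ComponentIndex<>();
        index.onComponentAdded("p1", "Processor p1");
        System.out.println(index.find("p1"));
        index.onComponentRemoved("p1");
        System.out.println(index.find("p1"));
    }
}
```

The trade-off in such a design is that every code path that adds or removes a component has to keep the index consistent with the Process Groups' own maps.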


