Mark Payne created NIFI-9061:
--------------------------------

             Summary: Improve responsiveness of the UI
                 Key: NIFI-9061
                 URL: https://issues.apache.org/jira/browse/NIFI-9061
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework, Core UI
            Reporter: Mark Payne
            Assignee: Mark Payne


When NiFi is deployed as a large cluster with many components, the UI often 
starts to feel sluggish. Refreshing the stats can take many seconds. To find 
the culprit, I created a 10-node cluster. I then created a Process Group and 
added 100 Processors to it (20 sets of 5, with the 5 Processors in each set 
connected together).

I then refreshed the stats many times. Each refresh took several seconds.
I traced the time taken to 3 key elements:

- On the backend, the longest part of the request was the Cluster Coordinator 
merging the responses from all nodes. The cost could be narrowed down to the 
TimeAdapter, which is used on many elements (such as ProcessorStatus) and which 
Jackson uses to parse the timestamp and turn it into a Date object. The adapter 
uses a DateTimeFormatter but creates a new one for each invocation, which is 
expensive. The formatter can be cached and reused.

- There is a bug in ThreadPoolRequestReplicator. We have properties in 
nifi.properties for {{nifi.cluster.node.protocol.threads}} and 
{{nifi.cluster.node.protocol.max.threads}}. However, because of the way the 
thread pool is used, we never actually scale beyond the value of the 
{{nifi.cluster.node.protocol.threads}} property. By default, that means we 
never use more than 10 threads. If we have a 10-node cluster and each UI 
refresh makes 4 requests, that's 40 requests that must be replicated (each of 
the 4 goes to every node). Those requests get queued up instead of the thread 
pool growing. We can address this by dropping the 
{{nifi.cluster.node.protocol.threads}} property and just scaling up to 
{{nifi.cluster.node.protocol.max.threads}} threads, while allowing the pool to 
scale down to 0 when there are no active requests.

- The UI rendering is slow. Using Chrome's profiler, I found that by far the 
largest share of the time spent rendering the canvas goes to the 
nf-canvas-utils ellipsis() method, which determines whether or not an ellipsis 
is needed. Specifically, the call to {{node.getSubStringLength()}} is very 
expensive and is always made for each processor name, type, and bundle, each 
connection name, each Process Group name, etc. We can significantly improve 
this by keeping a cache of [text, component width, text style] -> length to 
trim to before adding the ellipsis (or -1 to indicate that the text should not 
be trimmed). This will eliminate a very large proportion of these calls.
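
As an illustration of the first item, here is a minimal Java sketch of reusing 
one formatter instead of building one per call. The class name, the 
{{HH:mm:ss}} pattern, and the use of LocalTime are assumptions made for the 
example; this is not the actual TimeAdapter code. DateTimeFormatter is 
immutable and thread-safe, so a single shared instance can be reused across 
requests.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class CachedFormatterSketch {
    // Assumed pattern, chosen only for illustration.
    private static final String PATTERN = "HH:mm:ss";

    // Built once; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter CACHED =
            DateTimeFormatter.ofPattern(PATTERN, Locale.US);

    // Shape of the slow path: the pattern is re-parsed on every call.
    public static LocalTime parseUncached(String text) {
        return LocalTime.parse(text, DateTimeFormatter.ofPattern(PATTERN, Locale.US));
    }

    // Shape of the proposed fix: every call shares the same formatter.
    public static LocalTime parseCached(String text) {
        return LocalTime.parse(text, CACHED);
    }

    public static void main(String[] args) {
        System.out.println(parseCached("12:34:56"));
    }
}
```

Both methods return the same value; the only difference is how often the 
pattern string is compiled into a formatter.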
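
For the thread pool item, one well-known way to get this behavior with 
java.util.concurrent.ThreadPoolExecutor is to pair a core size smaller than 
the maximum with an unbounded work queue: the executor only adds threads 
beyond the core size when the queue rejects a task, which an unbounded queue 
never does. Whether that is exactly what happens in ThreadPoolRequestReplicator 
is an assumption. The sketch below (hypothetical class and method names, an 
assumed maximum of 50 threads) shows the effect and the proposed alternative: 
a core size equal to the maximum, with core threads allowed to time out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolScalingSketch {

    // core < max with an unbounded queue: tasks beyond `core` are queued,
    // so the pool never grows past `core` threads.
    public static ThreadPoolExecutor coreAndMaxPool(int core, int max) {
        return new ThreadPoolExecutor(core, max, 5, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
    }

    // core == max, and idle core threads may time out, so the pool can
    // grow to `max` under load and shrink to 0 when idle.
    public static ThreadPoolExecutor scalingPool(int max) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(max, max, 5, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Submits `tasks` blocking tasks and reports how many threads the pool
    // started to run them, then releases the tasks and shuts the pool down.
    public static int threadsStartedFor(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int started = pool.getPoolSize();
        release.countDown();
        pool.shutdown();
        return started;
    }

    public static void main(String[] args) {
        // 40 replicated requests, as in the 10-node example above.
        System.out.println(threadsStartedFor(coreAndMaxPool(10, 50), 40)); // 10
        System.out.println(threadsStartedFor(scalingPool(50), 40));        // 40
    }
}
```

With the first configuration, 30 of the 40 tasks wait in the queue behind 10 
threads; with the second, each task gets its own thread up to the maximum.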
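
The cache proposed in the last item could look roughly like this. The real 
code is JavaScript in nf-canvas-utils; this Java sketch only shows the shape 
of the cache. All names are hypothetical, and the fixed per-character width 
stands in for the expensive {{node.getSubStringLength()}} measurement.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EllipsisCacheSketch {
    // Sentinel meaning "the text fits; do not trim".
    public static final int NO_TRIM = -1;

    // Assumption: every character is 8px wide. The real UI measures the
    // rendered text instead, which is the expensive part.
    private static final int CHAR_WIDTH_PX = 8;

    // Key is [text, component width, text style]; value is the length to
    // trim to before adding the ellipsis, or NO_TRIM.
    private final Map<List<Object>, Integer> cache = new HashMap<>();
    private int measurements = 0;

    public int trimLength(String text, int widthPx, String style) {
        List<Object> key = Arrays.asList(text, widthPx, style);
        Integer cached = cache.get(key);
        if (cached == null) {
            cached = measure(text, widthPx);
            cache.put(key, cached);
        }
        return cached;
    }

    // Stand-in for the expensive measurement; counts how often it runs.
    private int measure(String text, int widthPx) {
        measurements++;
        if (text.length() * CHAR_WIDTH_PX <= widthPx) {
            return NO_TRIM;
        }
        // Leave room for one ellipsis character.
        return Math.max(0, widthPx / CHAR_WIDTH_PX - 1);
    }

    public int measurements() {
        return measurements;
    }

    public static void main(String[] args) {
        EllipsisCacheSketch cache = new EllipsisCacheSketch();
        cache.trimLength("GenerateFlowFile", 80, "processor-name");
        cache.trimLength("GenerateFlowFile", 80, "processor-name");
        System.out.println(cache.measurements()); // 1
    }
}
```

Because many components on a canvas share the same type, bundle, and style, 
repeated lookups with the same key skip the measurement entirely.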




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
