Mark Payne created NIFI-9061:
--------------------------------
Summary: Improve responsiveness of the UI
Key: NIFI-9061
URL: https://issues.apache.org/jira/browse/NIFI-9061
Project: Apache NiFi
Issue Type: Improvement
Components: Core Framework, Core UI
Reporter: Mark Payne
Assignee: Mark Payne
When NiFi is deployed as a large cluster with many components, the UI often
starts to feel sluggish, and refreshing the stats can take many seconds. To
find the cause, I created a 10-node cluster. I then created a Process Group and
added 100 Processors to it (20 sets of 5, with the 5 Processors in each set
connected together).
I then refreshed the stats many times. Each refresh took several seconds.
I narrowed the time taken down to 3 key elements:
- On the backend, the longest part of the request was the Cluster Coordinator
merging the responses from all nodes. Most of that time was spent in the
TimeAdapter, which is used on many elements, such as ProcessorStatus, and which
Jackson uses to parse the timestamp and turn it into a Date object. The adapter
uses a DateTimeFormatter but creates a new one for each invocation, which is
expensive. The formatter can be cached and reused.
- There is a bug in ThreadPoolRequestReplicator. We have properties in
nifi.properties for {{nifi.cluster.node.protocol.threads}} and
{{nifi.cluster.node.protocol.max.threads}}. However, because of the way the
thread pool is used, we never actually scale beyond the value of the
{{nifi.cluster.node.protocol.threads}} property. By default, that means we
never use more than 10 threads. If we have a 10-node cluster and each UI
refresh makes 4 requests, that's 40 requests that must be replicated (each
request goes to every node). Those requests get queued up instead of the thread
pool growing. We can address this by dropping the
{{nifi.cluster.node.protocol.threads}} property and just scaling up to
{{nifi.cluster.node.protocol.max.threads}} threads, while allowing the pool to
scale down to 0 when there are no active requests.
- The UI rendering is slow. Chrome's profiler shows that, by far, the largest
share of the time spent rendering the canvas goes to the nf-canvas-utils
ellipsis() method, which determines whether or not an ellipsis is needed.
Specifically, the call to {{node.getSubStringLength()}} is very expensive and
is always made for each processor name, type, and bundle, each connection name,
each Process Group name, etc. We can significantly improve this by keeping a
cache that maps [text, component width, text style] to the length the text
should be trimmed to before the ellipsis is added (or -1 to indicate that it
should not be trimmed). This will eliminate a very large proportion of these
calls.
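
To illustrate the thread pool point above, here is a minimal sketch of one
common way a java.util.concurrent.ThreadPoolExecutor ends up stuck at its core
size: when it is given an unbounded queue, extra threads are only created if
the queue rejects a task, which never happens. Whether this is the exact
mechanism in ThreadPoolRequestReplicator is an assumption. The class and method
names, the 5 second keep-alive, and the maximum of 50 threads are all
illustrative and are not taken from the NiFi code base.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ReplicationPoolSketch {

    // Problem case (assumed): core < max with an unbounded queue. The executor
    // only grows past corePoolSize when the queue refuses a task, and an
    // unbounded LinkedBlockingQueue never refuses one.
    static ThreadPoolExecutor fixedAtCore(int core, int max) {
        return new ThreadPoolExecutor(core, max, 5, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
    }

    // Sketch of the proposed behavior: a single size (core == max), with core
    // threads allowed to time out so the pool can shrink to 0 when idle.
    static ThreadPoolExecutor scalesToMax(int max) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(max, max, 5, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Submits `tasks` tasks that block until released, records the largest
    // number of threads the pool ever held, then releases the tasks and shuts
    // the pool down.
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak;
    }

    public static void main(String[] args) {
        // 40 concurrent requests, as in the 10-node, 4-requests-per-refresh case.
        System.out.println("core=10, max=50: peak threads = "
                + peakThreads(fixedAtCore(10, 50), 40));
        System.out.println("core=max=50:     peak threads = "
                + peakThreads(scalesToMax(50), 40));
    }
}
```

In the first configuration the 30 requests beyond the core size wait in the
queue. In the second, each request gets its own thread up to the maximum, and
idle threads are reclaimed after the keep-alive time.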