[jira] [Commented] (NIFI-5112) Inefficiency in replicating requests across cluster

Mark Payne (JIRA) Mon, 23 Apr 2018 12:53:36 -0700

    [ 
https://issues.apache.org/jira/browse/NIFI-5112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448772#comment-16448772
 ]


Mark Payne commented on NIFI-5112:
----------------------------------

I did not capture specific measurements. I did perform significant amounts of 
profiling and sampling in VisualVM as well as sampling and tracing in YourKit. 
When instantiating a template with 5,000 Processors spread over several Process 
Groups, I was able to see significant changes in the amounts of time spent in 
each of the above-mentioned tasks. The timings for replicating requests with 
Jersey Client and for the Reflection Utils are given in the issue description. 
If memory serves, I saw several hundred milliseconds spent on each request by 
the Authorizers calling Class.getMethod() when I had those 5,000 Processors in 
my flow. Template Serialization accounted for the bulk of the Flow 
Serialization time. Flow Serialization has two expensive steps: creating the 
DOM object and serializing the DOM object. 
We should be able to move the serialization out of the Read Lock entirely.
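
A minimal sketch of that split, assuming a hypothetical FlowSketch class with a 
ReentrantReadWriteLock (the class, its fields, and the element names are 
illustrative and are not NiFi's actual code). The read lock is held only while 
the DOM is built; the transformation to a String runs with no lock held:

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in for the flow controller; not NiFi's actual class.
final class FlowSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> processorNames = new ArrayList<>();

    void addProcessor(String name) {
        lock.writeLock().lock();
        try {
            processorNames.add(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Step 1: build the DOM under the read lock so the flow cannot change meanwhile.
    Document buildDom() {
        lock.readLock().lock();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("flow");
            doc.appendChild(root);
            for (String name : processorNames) {
                Element processor = doc.createElement("processor");
                processor.setTextContent(name);
                root.appendChild(processor);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build DOM", e);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Step 2: serialize the finished DOM; no lock is held here.
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize DOM", e);
        }
    }

    String serialize() {
        return toXml(buildDom());
    }
}
```

The DOM returned by buildDom() is a private snapshot, so writers can proceed as 
soon as the read lock is released, even if the serialization step is slow.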

The most telling comparison, though, will be to run a multi-node cluster on a 
single machine. When I do this on my laptop, I see that the UI is noticeably 
more sluggish than when running in a standalone node. I'd like a two-node 
cluster on my laptop to feel just as responsive (or at least very close) as a 
standalone node.

> Inefficiency in replicating requests across cluster
> ---------------------------------------------------
>
>                 Key: NIFI-5112
>                 URL: https://issues.apache.org/jira/browse/NIFI-5112
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>
> When replicating requests across the cluster, we do some things that are 
> rather inefficient, which can cause the UI to feel sluggish. Because all of 
> this is done while the UI awaits a response, we need to ensure that this area 
> of the application is very responsive. Through profiling and code review, I 
> have identified the following places where we can improve our efficiency:
>  * Use of Jersey Client. Jersey Client provides a very easy-to-use API that 
> is very powerful. It provides a lot of capabilities to scan class paths and 
> automatically detect interceptors, etc. However, doing this comes at a cost. 
> Profiling shows that, on average, on my laptop replicating a single request 
> took about 100 milliseconds, 100% of which was spent actually constructing 
> the Jersey objects. Less than 1 millisecond of time was spent writing the 
> message to the socket, awaiting the reply, and parsing the response. By using 
> a different client, we can significantly improve this.
>  * Flow Serialization holds a Flow Controller Read lock for the entire 
> duration. This means that we block any mutable operations, such as HTTP GET 
> requests, while we build the appropriate DOM object for the flow, transform 
> that DOM object into a String, and write that String to the output stream 
> (including compression). We should be able to hold the Read Lock only while 
> building the appropriate DOM object and then perform the 
> transformation/serialization outside of the lock.
>  * Template Serialization is inefficient. Currently, for each template, we 
> serialize the DTO object to a String, then Deserialize that String into a DOM 
> object (all of this is done in order to avoid XML-based injection attacks). 
> We then add that DOM object into our flow's DOM object. We should instead 
> hold onto/cache that DOM object so that we can cut out all of the above for 
> all but the first iteration.
>  * ReflectionUtils is used when a Processor is created in order to call any 
> method annotated with @OnAdded. The implementation uses some Spring-based 
> reflection utils in order to find any sort of Bridged methods. Doing this is 
> expensive (on the order of 1 ms on my laptop). While this may not sound like 
> a concern, that means that importing a template consisting of 5,000 
> processors will take 5 seconds just to find annotated methods, all within the 
> context of a web request. Since these methods will not change, we should 
> instead cache a list of Methods that contain the annotations so that we don't 
> have to constantly look these up.
>  * Authorization uses InvocationHandlers. These InvocationHandlers use 
> reflection to compare the method being called to a well-known method. The 
> call to Method.equals() is not expensive. However, the call to 
> Class.getMethod() is expensive and is done for every single authorization 
> check, which can amount to a significant amount of time being spent. Instead, 
> we can store the method of interest in a member variable and reference that.
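
The last two items in the description above can be sketched roughly as follows. 
This is an illustration under stated assumptions, not NiFi's code: 
OnAddedSketch, AnnotatedMethodCache, SampleProcessor, AuthorizerSketch, and 
AuthorizeHandler are all hypothetical names, and the plain getMethods() scan 
stands in for the Spring-based lookup without reproducing its bridge-method 
handling. The first part caches annotated methods per class; the second stores 
the well-known Method in a static field so that only Method.equals() runs per 
call:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical annotation standing in for NiFi's @OnAdded.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface OnAddedSketch {}

// Caches, per class, the public methods carrying the annotation so the scan runs once.
final class AnnotatedMethodCache {
    private static final ConcurrentMap<Class<?>, List<Method>> CACHE = new ConcurrentHashMap<>();

    static List<Method> onAddedMethods(Class<?> type) {
        return CACHE.computeIfAbsent(type, AnnotatedMethodCache::scan);
    }

    private static List<Method> scan(Class<?> type) {
        List<Method> found = new ArrayList<>();
        for (Method method : type.getMethods()) {
            if (method.isAnnotationPresent(OnAddedSketch.class)) {
                found.add(method);
            }
        }
        return Collections.unmodifiableList(found);
    }
}

// Example component with one annotated lifecycle method.
final class SampleProcessor {
    @OnAddedSketch
    public void onAdded() {}

    public void unrelated() {}
}

// Hypothetical interface standing in for an authorizer.
interface AuthorizerSketch {
    boolean authorize(String request);
}

final class AuthorizeHandler implements InvocationHandler {
    // Resolved once at class-load time instead of via Class.getMethod() on every call.
    private static final Method AUTHORIZE = lookup();
    private final AuthorizerSketch delegate;
    private int authorizeCalls;

    AuthorizeHandler(AuthorizerSketch delegate) {
        this.delegate = delegate;
    }

    private static Method lookup() {
        try {
            return AuthorizerSketch.class.getMethod("authorize", String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    int authorizeCalls() {
        return authorizeCalls;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (AUTHORIZE.equals(method)) { // cheap comparison against the stored Method
            authorizeCalls++;
        }
        try {
            return method.invoke(delegate, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the delegate's own exception
        }
    }

    static AuthorizerSketch wrap(AuthorizeHandler handler) {
        return (AuthorizerSketch) Proxy.newProxyInstance(
                AuthorizerSketch.class.getClassLoader(),
                new Class<?>[] {AuthorizerSketch.class},
                handler);
    }
}
```

Repeated calls to AnnotatedMethodCache.onAddedMethods(SampleProcessor.class) 
return the same cached list, so creating many instances of one processor type 
pays the reflection cost only once.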



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
