Mark,

In my experience, when requests time out during cluster replication, the
culprit is usually garbage collection. So you may not be thread-limited,
CPU-limited, or resource-limited at all; garbage collection may simply be
kicking in at an inopportune time. In that situation, my suggestion would be
to use a nifi.cluster.node.read.timeout of, say, 30 seconds instead of 10,
and to look into how garbage collection is performing on your system.
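
As a rough sketch of those two suggestions, the fragment below raises the
read timeout in nifi.properties and turns on GC logging in bootstrap.conf.
The timeout value is the one suggested above. The java.arg numbers and the
log path are placeholders I picked for illustration, and the GC flags assume
a Java 8 HotSpot JVM, so adjust them to your installation.

```
# nifi.properties: raise the node read timeout from 10 sec to 30 sec
nifi.cluster.node.read.timeout=30 sec

# bootstrap.conf: GC logging (assumed Java 8 HotSpot flags;
# the arg numbers and log path are hypothetical, use unused numbers)
java.arg.20=-XX:+PrintGCDetails
java.arg.21=-XX:+PrintGCDateStamps
java.arg.22=-Xloggc:./logs/gc.log
```

If long pauses in the GC log line up with the "Read timed out" entries in
nifi-app.log, that would point to garbage collection rather than thread
limits.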

I have answered specific questions below, though, in case they are helpful.

Thanks
-Mark


> On Nov 20, 2017, at 3:25 PM, Mark Bean <[email protected]> wrote:
> 
> We are seeing cases where a user attempts to query provenance on a cluster.
> One or more nodes may not respond to the request in a timely manner and are
> then disconnected from the cluster. The nifi-app.log shows log
> messages similar to:
> 
> ThreadPoolRequestReplicator Failed to replicate request POST
> /nifi-api/provenance to {host:port} due to
> com.sun.jersey.api.client.ClientHandlerException:
> java.net.SocketTimeoutException: Read timed out
> NodeClusterCoordinator The following nodes failed to process URI
> /nifi-api/provenance '{list of one or more nodes}'. Requesting each node
> disconnect from cluster.
> 
> We have implemented a custom authorizer. For certain policies, additional
> authorization checking is performed. Provenance is one such policy which
> performs additional checking. It is surprising that the process is taking
> so long as to time out the request. Currently, timeouts are set as:
> nifi.cluster.node.read.timeout=10 sec
> nifi.cluster.request.replication.claim.timeout=30 sec
> 
> This leads me to believe we are thread-limited, not CPU-limited.
> 
> In this scenario, what threads are involved? Would
> nifi.cluster.node.protocol.threads (or .max.threads) be limiting the
> processing of such api calls?

>>> The threads involved are the Jetty threads on the 'receiving' side and
>>> the nifi.cluster.node.protocol.threads on the client side.

> 
> Are the API provenance requests limited by
> nifi.provenance.repository.query.threads?

>>> These query threads are background threads that are used to populate the
>>> results of the query. Client requests will not block on those results.

> 
> Are there other thread-related properties we should be looking at?
> 

>>> I don't think so. I can't think of any off the top of my head, anyway.

> Are thread properties (such as nifi.provenance.repository.query.threads)
> counted against the total threads given by nifi.web.jetty.threads?

>>> No, these are separate thread pools.

> 
> Thanks,
> Mark
