We are seeing cases where a user attempts to query provenance on a cluster.
One or more nodes may not respond to the request in a timely manner and are
then disconnected from the cluster. The nifi-app.log shows messages similar
to:

ThreadPoolRequestReplicator Failed to replicate request POST
/nifi-api/provenance to {host:port} due to
com.sun.jersey.api.client.ClientHandlerException:
java.net.SocketTimeoutException: Read timed out
NodeClusterCoordinator The following nodes failed to process URI
/nifi-api/provenance '{list of one or more nodes}'. Requesting each node
disconnect from cluster.

We have implemented a custom authorizer. For certain policies, additional
authorization checking is performed, and provenance is one such policy. It is
surprising that this extra checking would take long enough to time out the
request. Currently, timeouts are set as:
nifi.cluster.node.read.timeout=10 sec
nifi.cluster.request.replication.claim.timeout=30 sec
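
To give a sense of what the custom authorizer is doing, here is a rough
sketch of its shape. This is not our actual code; it only assumes the standard
org.apache.nifi.authorization.Authorizer interface, and the class name and the
externalPolicyCheck lookup are placeholders for our site-specific logic:

import org.apache.nifi.authorization.AuthorizationRequest;
import org.apache.nifi.authorization.AuthorizationResult;
import org.apache.nifi.authorization.Authorizer;
import org.apache.nifi.authorization.AuthorizerConfigurationContext;
import org.apache.nifi.authorization.AuthorizerInitializationContext;
import org.apache.nifi.authorization.exception.AuthorizationAccessException;
import org.apache.nifi.authorization.exception.AuthorizerCreationException;
import org.apache.nifi.authorization.exception.AuthorizerDestructionException;

public class ProvenanceCheckingAuthorizer implements Authorizer {

    @Override
    public AuthorizationResult authorize(final AuthorizationRequest request)
            throws AuthorizationAccessException {
        final String resource = request.getResource().getIdentifier();

        // Extra checking only for provenance resources. If this check blocks,
        // the replicated /nifi-api/provenance request is held up on whatever
        // thread is servicing it on that node.
        if (resource.startsWith("/provenance")) {
            if (!externalPolicyCheck(request)) {
                return AuthorizationResult.denied("Denied by external policy check");
            }
        }
        return AuthorizationResult.approved();
    }

    // Placeholder for the site-specific check (directory lookup, REST call, etc.).
    private boolean externalPolicyCheck(final AuthorizationRequest request) {
        return true;
    }

    @Override
    public void initialize(final AuthorizerInitializationContext context)
            throws AuthorizerCreationException {
    }

    @Override
    public void onConfigured(final AuthorizerConfigurationContext context)
            throws AuthorizerCreationException {
    }

    @Override
    public void preDestruction() throws AuthorizerDestructionException {
    }
}

The point is that if the extra check blocks, the request sits on a thread for
its duration, which is why I am trying to pin down which thread pools are
involved.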

A 10 second read timeout should be more than enough for the authorization
check itself, which leads me to believe the requests are waiting on threads
rather than on processing; in other words, we are thread-limited, not
CPU-limited.

In this scenario, what threads are involved? Would
nifi.cluster.node.protocol.threads (or nifi.cluster.node.protocol.max.threads)
be limiting the processing of such API calls?

Are the provenance API requests limited by
nifi.provenance.repository.query.threads?

Are there other thread-related properties we should be looking at?

Are thread properties (such as nifi.provenance.repository.query.threads)
counted against the total threads given by nifi.web.jetty.threads?
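
For reference, these are the properties I am asking about, as they would
appear in nifi.properties (the values below are only illustrative
placeholders, not what we are running):

nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.provenance.repository.query.threads=2
nifi.web.jetty.threads=200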

Thanks,
Mark
