Koji Kawamura created NIFI-4376:
-----------------------------------
Summary: Expired requests stay in StandardRootGroupPort request
queue and block new requests from being offered
Key: NIFI-4376
URL: https://issues.apache.org/jira/browse/NIFI-4376
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 1.0.1
Reporter: Koji Kawamura
Assignee: Koji Kawamura
If the remote NiFi instance slows down for some reason and S2S requests
time out continuously, the timed-out requests remain in the
internal request queue at StandardRootGroupPort.
Once the number of queued requests reaches 1,000, a newly received request
cannot be offered to the queue, and RequestExpiredException is thrown. An S2S
client receives the following error in this case:
{code}
2017-09-11 16:32:38,137 ERROR [Timer-Driven Process Thread-4]
o.a.nifi.remote.StandardRemoteGroupPort
RemoteGroupPort[name=output,target=http://localhost:8080/nifi] failed to
communicate with http://localhost:8080/nifi due to java.io.IOException:
Unexpected response code: 500 errCode:Abort errMessage:<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 500 Server Error</title>
</head>
<body><h2>HTTP ERROR 500</h2>
<p>Problem accessing
/nifi-api/data-transfer/output-ports/6fc7fa6d-015e-1000-27cb-d118b1b0fe13/transactions/73da9bcf-87ec-4594-8e45-a41965c1ed82/flow-files.
Reason:
<pre> Server Error</pre></p><h3>Caused by:</h3><pre>java.io.IOException:
Failed to process the request.
at
org.apache.nifi.web.api.DataTransferResource$1.write(DataTransferResource.java:676)
at
com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71)
...
<h3>Caused by:</h3><pre>org.apache.nifi.remote.exception.RequestExpiredException
at
org.apache.nifi.remote.StandardRootGroupPort.transferFlowFiles(StandardRootGroupPort.java:562)
at
org.apache.nifi.web.api.DataTransferResource$1.write(DataTransferResource.java:668)
{code}
Once a NiFi instance owning the root group port enters this state, it cannot
recover automatically because its clients keep retrying too often. In addition,
the returned error message does not make clear what is happening.
The remote NiFi instance should remove already expired requests from its
internal queue so that newly received requests are processed correctly, and
return a 'read timeout' error response if a request cannot be processed before
it expires. A client can then penalize the peer when it receives a
'read timeout' error.
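
The proposed queue behaviour could look roughly like the sketch below. It is only an illustration under assumptions, not the actual StandardRootGroupPort code: the classes QueuedRequest and ExpiringRequestQueue, their methods, and the explicit nowMillis parameter are all hypothetical names introduced here. The idea is simply to drop expired entries before offering a new request, so that a queue full of dead requests no longer rejects live ones.

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a queued S2S request; not a NiFi class.
final class QueuedRequest {
    private final String id;
    private final long expirationMillis;

    QueuedRequest(String id, long expirationMillis) {
        this.id = id;
        this.expirationMillis = expirationMillis;
    }

    String getId() {
        return id;
    }

    // A request is expired once the given time reaches its deadline.
    boolean isExpired(long nowMillis) {
        return nowMillis >= expirationMillis;
    }
}

// Hypothetical bounded queue that discards expired requests before accepting new ones.
final class ExpiringRequestQueue {
    private final BlockingQueue<QueuedRequest> queue;

    ExpiringRequestQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Removes every request whose deadline has passed; returns how many were removed.
    int purgeExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<QueuedRequest> it = queue.iterator(); it.hasNext();) {
            if (it.next().isExpired(nowMillis)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Purges expired requests first, then tries to enqueue the new one.
    // Returns false only when the queue is full of requests that are still live.
    boolean offer(QueuedRequest request, long nowMillis) {
        purgeExpired(nowMillis);
        return queue.offer(request);
    }

    int size() {
        return queue.size();
    }
}
```

With a capacity of 1,000 as in the description above, a caller would invoke offer(request, System.currentTimeMillis()) and, when it returns false, respond with the 'read timeout' error instead of an HTTP 500. Passing the clock value in explicitly is a choice made here only to keep the sketch easy to test.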
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)