Jeff Storck created NIFI-2699:
---------------------------------
Summary: Improve handling of response timeouts in cluster
Key: NIFI-2699
URL: https://issues.apache.org/jira/browse/NIFI-2699
Project: Apache NiFi
Issue Type: Improvement
Components: Core Framework, Core UI
Reporter: Jeff Storck
Priority: Minor
Fix For: 1.1.0
When running as a cluster, if a node is unable to respond within the socket
timeout (e.g., because it is paused at a breakpoint while debugging), an
IllegalClusterStateException is thrown, which causes the UI to show the
"check config and fix errors" page. Once the node is communicating with the
cluster again (i.e., execution has moved past the breakpoint), the UI can be
reloaded, and the cluster recovers from the timeout without any user
intervention at the service level. However, the user experience could be
improved. If a user initiates a request that is replicated to a node that is
unable to respond within the socket timeout, the user might think NiFi has
crashed when in fact it has not.
Here is the stack trace that was encountered during testing:
{code}
2016-08-29 11:36:59,041 DEBUG [NiFi Web Server-22] o.a.n.w.a.c.IllegalClusterStateExceptionMapper
org.apache.nifi.cluster.manager.exception.IllegalClusterStateException: Node localhost:8443 is unable to fulfill this request due to: Unexpected Response Code 500
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$2.onCompletion(ThreadPoolRequestReplicator.java:471) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:729) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_92]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_92]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_92]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_92]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155) ~[jersey-client-1.19.jar:1.19]
    at com.sun.jersey.api.client.Client.handle(Client.java:652) ~[jersey-client-1.19.jar:1.19]
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682) ~[jersey-client-1.19.jar:1.19]
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) ~[jersey-client-1.19.jar:1.19]
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:560) ~[jersey-client-1.19.jar:1.19]
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:537) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:720) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
    ... 5 common frames omitted
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_92]
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_92]
    at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[na:1.8.0_92]
    at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_92]
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) ~[na:1.8.0_92]
    at sun.security.ssl.InputRecord.read(InputRecord.java:503) ~[na:1.8.0_92]
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) ~[na:1.8.0_92]
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) ~[na:1.8.0_92]
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[na:1.8.0_92]
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[na:1.8.0_92]
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[na:1.8.0_92]
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[na:1.8.0_92]
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[na:1.8.0_92]
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[na:1.8.0_92]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536) ~[na:1.8.0_92]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441) ~[na:1.8.0_92]
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[na:1.8.0_92]
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338) ~[na:1.8.0_92]
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253) ~[jersey-client-1.19.jar:1.19]
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153) ~[jersey-client-1.19.jar:1.19]
    ... 11 common frames omitted
{code}
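One possible direction, sketched here only as an illustration: the trace above shows a java.net.SocketTimeoutException buried in the cause chain of the failure, so a timeout could be told apart from other replication failures by walking that chain and reporting it with a friendlier message. The class, enum, and method names below are hypothetical and are not part of NiFi; the message wording is also an assumption.

```java
import java.net.SocketTimeoutException;

// Hypothetical helper, not part of NiFi: separates "node was too slow"
// from other replication failures by inspecting the cause chain.
final class ReplicationFailureClassifier {

    /** Hypothetical outcome categories used only in this sketch. */
    enum Outcome { NODE_TIMED_OUT, OTHER_FAILURE }

    private static final int MAX_DEPTH = 32; // guard against cyclic cause chains

    /** Returns NODE_TIMED_OUT if any cause in the chain is a SocketTimeoutException. */
    static Outcome classify(Throwable failure) {
        int depth = 0;
        for (Throwable t = failure; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            if (t instanceof SocketTimeoutException) {
                return Outcome.NODE_TIMED_OUT;
            }
        }
        return Outcome.OTHER_FAILURE;
    }

    /** Builds a message for the user; the wording is an assumption, not NiFi's. */
    static String userMessage(String nodeId, Throwable failure) {
        if (classify(failure) == Outcome.NODE_TIMED_OUT) {
            return "Node " + nodeId + " did not respond within the socket timeout. "
                    + "NiFi is still running; try reloading the page.";
        }
        return "Node " + nodeId + " is unable to fulfill this request due to: "
                + failure.getMessage();
    }

    public static void main(String[] args) {
        // Mimics the shape of the trace above: a generic failure wrapping a read timeout.
        Throwable failure = new IllegalStateException("Unexpected Response Code 500",
                new RuntimeException(new SocketTimeoutException("Read timed out")));
        System.out.println(userMessage("localhost:8443", failure));
    }
}
```

With a check like this, the exception mapper or the UI could show a "node is slow to respond" notice rather than the "check config and fix errors" page. Where such a check would actually belong in the framework is left open here.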