Hi Koji,

In the subsequent tests the above error did not recur, but we are now getting errors on the RPG:
RemoteGroupPort[name=1_pk_ip,targets=http://xxxxxx.prod.xx.local:9090/nifi/] failed to communicate with remote NiFi instance due to java.io.IOException: Failed to confirm transaction with Peer[url=nifi://xxx-xxxxx.prod.xx.local:5001] due to java.io.IOException: Connection reset by peer

The transport protocol is RAW, and the URL given when setting up the RPG is that of one node of the four-node cluster.

nifi.remote.input.socket.port = 5001
nifi.remote.input.secure=false
nifi.remote.input.http.transaction.ttl=60 sec
nifi.remote.input.host=

Please let me know if there are any configuration changes that we need to make.

On Fri, Jul 6, 2018 at 9:48 AM Faisal Durrani <[email protected]> wrote:

> Hi Koji,
>
> Thank you for your reply. I updated the logback.xml and ran the test
> again. I can see an additional error in the app.log, which is as below.
>
> o.a.nifi.remote.SocketRemoteSiteListener
> java.io.EOFException: null
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
> at java.io.DataInputStream.readUTF(DataInputStream.java:589)
> at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> at org.apache.nifi.remote.protocol.RequestType.readRequestType(RequestType.java:36)
> at org.apache.nifi.remote.protocol.socket.SocketFlowFileServerProtocol.getRequestType(SocketFlowFileServerProtocol.java:147)
> at org.apache.nifi.remote.SocketRemoteSiteListener$1$1.run(SocketRemoteSiteListener.java:253)
> at java.lang.Thread.run(Thread.java:745)
>
> I notice this error is reported against not just one node but different
> nodes in the cluster. Would you be able to infer the root cause of the issue
> from this information?
>
> Thanks.
>
> On Thu, Jul 5, 2018 at 3:34 PM Koji Kawamura <[email protected]> wrote:
>
>> Hello,
>>
>> 1. The error message sounds like the client disconnects in the middle
>> of Site-to-Site communication.
>> Enabling debug log would show more information, by adding
>> <logger name="org.apache.nifi.remote" level="DEBUG"/> at conf/logback.xml.
>>
>> 2. I'd suggest checking if your 4 nodes receive data evenly (well
>> distributed). Connection status history, 'Queued Count' per node may
>> be useful to check. If not evenly distributed, I'd lower Remote Port
>> batch settings at sending side.
>> Then try to find a bottleneck in downstream flow. Increasing
>> concurrent tasks at such a bottleneck processor can help increase
>> throughput in some cases. Adding more nodes will also help.
>>
>> Thanks,
>> Koji
>>
>> On Thu, Jul 5, 2018 at 11:12 AM, Faisal Durrani <[email protected]> wrote:
>> > Hi, I've got two questions
>> >
>> > 1. We are using Remote Process Group with Raw transport protocol to
>> > distribute the data across a four node cluster. I see the nifi app log has a
>> > lot of instances of the below error
>> >
>> > o.a.nifi.remote.SocketRemoteSiteListener Unable to communicate with remote
>> > instance Peer[url=nifi://xxx-xxxxxx.prod.xx.:59528]
>> > (SocketFlowFileServerProtocol[CommsID=0bf887ed-acb3-4eea-94ac-5abf53ad0bf1])
>> > due to java.io.EOFException; closing connection
>> >
>> > These errors do not show on the bulletin board, nor do I see any data
>> > loss. I was curious to know if there is some bad configuration that is
>> > causing this to happen.
>> >
>> > 2. The app log also has the below error
>> >
>> > o.a.n.r.c.socket.EndpointConnectionPool EndpointConnectionPool[Cluster
>> > URL=[http://xxx-xxxxxx.prod.xx.local:9090/nifi-api]]
>> > Peer[url=nifi://ins-btrananifi107z.prod.jp.local:5001] indicates that port
>> > 417e3d23-5b1a-1616-9728-9d9d1a462646's destination is full; penalizing peer
>> >
>> > The data flow consumes a high volume of data and there is back pressure on
>> > almost all the connections, so that is probably what is causing it. I guess
>> > there isn't much we can do here, and once the back pressure resolves, the
>> > error goes away on its own. Please let me know your view.
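
One check that may be worth running from the sending NiFi host, given that the peers in the errors above are addressed as nifi://host:5001: confirm that the raw Site-to-Site port is reachable on every node of the cluster, not only on the node whose URL was entered in the RPG. The following is a minimal Python sketch under that assumption, not something taken from this thread. The function names are made up for illustration, and a successful TCP connect only shows the port is open; it does not prove that a Site-to-Site transaction will complete.

```python
import socket


def parse_properties(text):
    """Parse key=value lines (nifi.properties style) into a dict.

    Blank lines and lines starting with '#' are skipped.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_nodes(hosts, port, timeout=3.0):
    """Map each host name to whether its Site-to-Site socket port is reachable."""
    return {host: port_is_reachable(host, port, timeout) for host in hosts}
```

Usage would look roughly like this, with placeholder host names that need to be replaced by the four real node names:

```python
props = parse_properties("nifi.remote.input.socket.port = 5001")
port = int(props["nifi.remote.input.socket.port"])
print(check_nodes(["nifi-node1.example.local", "nifi-node2.example.local"], port))
```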
