Hi Faisal,

I think both error messages indicate the same thing: the network connection was closed in the middle of a Site-to-Site transaction. That can happen for many reasons, such as a flaky network, or someone manually stopping the port or RPG while a transaction is being processed. I don't think it is a configuration issue, because NiFi was able to initiate S2S communication.
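If it helps to see how often each of these messages occurs and against which peer, here is a minimal Python sketch (not part of NiFi) that tallies them from log lines. The function name `count_s2s_errors`, the choice of signatures, and the assumption that each log event sits on a single line are mine; the substrings themselves are copied from the messages quoted in this thread.

```python
import re
from collections import Counter

# Matches the peer address as it appears in the quoted messages,
# e.g. Peer[url=nifi://host:5001]
PEER_RE = re.compile(r"Peer\[url=(nifi://[^\]]+)\]")

# Substrings taken from the error messages in this thread (illustrative list).
SIGNATURES = (
    "java.io.EOFException",
    "Connection reset by peer",
    "destination is full",
)


def count_s2s_errors(lines):
    """Count (peer, signature) pairs across an iterable of log lines.

    Assumes one log event per line; lines without a Peer[url=...] are skipped.
    """
    counts = Counter()
    for line in lines:
        match = PEER_RE.search(line)
        if match is None:
            continue
        peer = match.group(1)
        for signature in SIGNATURES:
            if signature in line:
                counts[(peer, signature)] += 1
    return counts
```

You could pass it the lines of your app log (the file location depends on your installation). If the counts are spread across many peers rather than concentrated on one, that would be consistent with a network or sender-side cause rather than a single misbehaving node, though it would not prove it.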
Thanks,
Koji

On Fri, Jul 6, 2018 at 4:16 PM, Faisal Durrani <[email protected]> wrote:
> Hi Koji,
>
> In the subsequent tests the above error did not come but now we are getting
> errors on the RPG :
>
> RemoteGroupPort[name=1_pk_ip,targets=http://xxxxxx.prod.xx.local:9090/nifi/]
> failed to communicate with remote NiFi instance due to java.io.IOException:
> Failed to confirm transaction with
> Peer[url=nifi://xxx-xxxxx.prod.xx.local:5001] due to java.io.IOException:
> Connection reset by peer
>
> The transport protocol is RAW while the URLs mentioned while setting up the
> RPG is one of the node of the (4)node cluster.
>
> nifi.remote.input.socket.port = 5001
>
> nifi.remote.input.secure=false
>
> nifi.remote.input.http.transaction.ttl=60 sec
>
> nifi.remote.input.host=
>
> Please let me know if there is any configuration changes that we need to
> make.
>
>
> On Fri, Jul 6, 2018 at 9:48 AM Faisal Durrani <[email protected]> wrote:
>>
>> Hi Koji ,
>>
>> Thank you for your reply. I updated the logback.xml and ran the test
>> again. I can see an additional error in the app.log which is as below.
>>
>> o.a.nifi.remote.SocketRemoteSiteListener
>> java.io.EOFException: null
>> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
>> at java.io.DataInputStream.readUTF(DataInputStream.java:589)
>> at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>> at org.apache.nifi.remote.protocol.RequestType.readRequestType(RequestType.java:36)
>> at org.apache.nifi.remote.protocol.socket.SocketFlowFileServerProtocol.getRequestType(SocketFlowFileServerProtocol.java:147)
>> at org.apache.nifi.remote.SocketRemoteSiteListener$1$1.run(SocketRemoteSiteListener.java:253)
>> at java.lang.Thread.run(Thread.java:745)
>>
>>
>> I notice this error is reported against not just one node but different
>> nodes in the cluster. Would you be able infer the root cause of the issue
>> from this information?
>>
>> Thanks.
>>
>> On Thu, Jul 5, 2018 at 3:34 PM Koji Kawamura <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> 1. The error message sounds like the client disconnects in the middle
>>> of Site-to-Site communication. Enabling debug log would show more
>>> information, by adding <logger name="org.apache.nifi.remote"
>>> level="DEBUG"/> at conf/logback.xml.
>>>
>>> 2. I'd suggest checking if your 4 nodes receive data evenly (well
>>> distributed). Connection status history, 'Queued Count' per node may
>>> be useful to check. If not evenly distributed, I'd lower Remote Port
>>> batch settings at sending side.
>>> Then try to find a bottle neck in downstream flow. Increasing
>>> concurrent tasks at such bottle neck processor can help increasing
>>> throughput in some cases. Adding more node will also help.
>>>
>>> Thanks,
>>> Koji
>>>
>>> On Thu, Jul 5, 2018 at 11:12 AM, Faisal Durrani <[email protected]> wrote:
>>> > Hi, I've got two questions
>>> >
>>> > 1.We are using Remote Process Group with Raw transport protocol to
>>> > distribute the data across four node cluster. I see the nifi app log
>>> > has a lot of instance of the below error
>>> >
>>> > o.a.nifi.remote.SocketRemoteSiteListener Unable to communicate with
>>> > remote instance Peer[url=nifi://xxx-xxxxxx.prod.xx.:59528]
>>> > (SocketFlowFileServerProtocol[CommsID=0bf887ed-acb3-4eea-94ac-5abf53ad0bf1])
>>> > due to java.io.EOFException; closing connection
>>> >
>>> > These error do not show on the bulletin board and nor do I see any data
>>> > loss. I was curious to know if there is some bad configuration that is
>>> > causing this to happen.
>>> >
>>> > 2. The app log also has the below error
>>> >
>>> > o.a.n.r.c.socket.EndpointConnectionPool EndpointConnectionPool[Cluster
>>> > URL=[http://xxx-xxxxxx.prod.xx.local:9090/nifi-api]]
>>> > Peer[url=nifi://ins-btrananifi107z.prod.jp.local:5001] indicates that
>>> > port 417e3d23-5b1a-1616-9728-9d9d1a462646's destination is full;
>>> > penalizing peer
>>> >
>>> > The data flow consume a high volume data and there is back pressure on
>>> > almost all the connections. So probably that is what causing it. I
>>> > guess there isn't much we can do here and once the back pressure
>>> > resolve ,the error goes away on its own.Please let me know of your view.
>>> >
