Hi Koji,

I moved to another cluster of NiFi nodes, applied the same S2S
configuration there, and the same error message appears all over the logs
(nothing on the bulletin board).

Could it be caused by back pressure? I also get the error "indicates that
port 8c77c1b0-0164-1000-0000-0000052fa54c's destination is full; penalizing
peer" at the same time as I see the closing connection error. I don't see a
way to resolve the back pressure: we receive a continuous stream of data
from Kafka, which is then inserted into HBase (the slowest part of the data
flow), and that eventually causes the back pressure.
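
To check whether the two messages really coincide, something like the
following could be run over the app logs. It is only a rough sketch: it
assumes each log line starts with a timestamp such as
"2018-07-06 16:55:01,123" (adjust the regex if the logback pattern differs),
and the function names are made up for illustration. Since the two messages
may be logged on different nodes, the lines from both logs would need to be
passed in together.

```python
import re
from collections import Counter

# Assumption: every log line begins with "YYYY-MM-DD HH:MM:SS,mmm".
# Only the part up to the minute is captured, so events are bucketed per minute.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})")


def minutes_with(lines, needle):
    """Count, per minute, the log lines that contain `needle`."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and needle in line:
            counts[match.group(1)] += 1
    return counts


def overlap(lines):
    """Return the sorted minutes in which both messages were logged."""
    lines = list(lines)  # allow a file object or any iterable to be scanned twice
    full = minutes_with(lines, "destination is full")
    closed = minutes_with(lines, "closing connection")
    return sorted(set(full) & set(closed))
```

Calling overlap() on the combined lines of the sender and receiver logs
gives the minutes where both errors show up. An empty list would suggest the
two are unrelated; a match on its own would not prove that back pressure is
the cause.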





On Fri, Jul 6, 2018 at 4:55 PM Koji Kawamura <[email protected]> wrote:

> Hi Faisal,
>
> I think both error messages indicate the same thing: network
> communication was closed in the middle of a Site-to-Site transaction.
> That can happen for many reasons, such as a flaky network, or
> manually stopping the port or RPG while a transaction is being
> processed. I don't think it is a configuration issue, because NiFi was
> able to initiate S2S communication.
>
> Thanks,
> Koji
>
> On Fri, Jul 6, 2018 at 4:16 PM, Faisal Durrani <[email protected]>
> wrote:
> > Hi Koji,
> >
> > In the subsequent tests the above error did not appear, but now we are
> > getting errors on the RPG:
> >
> > RemoteGroupPort[name=1_pk_ip,targets=http://xxxxxx.prod.xx.local:9090/nifi/]
> > failed to communicate with remote NiFi instance due to java.io.IOException:
> > Failed to confirm transaction with
> > Peer[url=nifi://xxx-xxxxx.prod.xx.local:5001] due to java.io.IOException:
> > Connection reset by peer
> >
> > The transport protocol is RAW, and the URL given when setting up the
> > RPG is that of one node of the four-node cluster.
> >
> > nifi.remote.input.socket.port = 5001
> >
> > nifi.remote.input.secure=false
> >
> > nifi.remote.input.http.transaction.ttl=60 sec
> >
> > nifi.remote.input.host=
> >
> > Please let me know if there are any configuration changes that we need to
> > make.
> >
> >
> >
> >
> > On Fri, Jul 6, 2018 at 9:48 AM Faisal Durrani <[email protected]>
> > wrote:
> >>
> >> Hi Koji ,
> >>
> >> Thank you for your reply. I updated the logback.xml and ran the test
> >> again. I can see an additional error in the app.log which is as below.
> >>
> >> o.a.nifi.remote.SocketRemoteSiteListener
> >> java.io.EOFException: null
> >>      at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
> >>      at java.io.DataInputStream.readUTF(DataInputStream.java:589)
> >>      at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> >>      at org.apache.nifi.remote.protocol.RequestType.readRequestType(RequestType.java:36)
> >>      at org.apache.nifi.remote.protocol.socket.SocketFlowFileServerProtocol.getRequestType(SocketFlowFileServerProtocol.java:147)
> >>      at org.apache.nifi.remote.SocketRemoteSiteListener$1$1.run(SocketRemoteSiteListener.java:253)
> >>      at java.lang.Thread.run(Thread.java:745)
> >>
> >>
> >> I notice this error is reported against not just one node but different
> >> nodes in the cluster. Would you be able to infer the root cause of the
> >> issue from this information?
> >>
> >> Thanks.
> >>
> >> On Thu, Jul 5, 2018 at 3:34 PM Koji Kawamura <[email protected]>
> >> wrote:
> >>>
> >>> Hello,
> >>>
> >>> 1. The error message sounds like the client disconnects in the middle
> >>> of Site-to-Site communication. Enabling debug log would show more
> >>> information, by adding <logger name="org.apache.nifi.remote"
> >>> level="DEBUG"/> at conf/logback.xml.
> >>>
> >>> 2. I'd suggest checking if your 4 nodes receive data evenly (well
> >>> distributed). Connection status history, 'Queued Count' per node may
> >>> be useful to check. If not evenly distributed, I'd lower Remote Port
> >>> batch settings at sending side.
> >>> Then try to find a bottleneck in the downstream flow. Increasing
> >>> concurrent tasks at such a bottleneck processor can help increase
> >>> throughput in some cases. Adding more nodes will also help.
> >>>
> >>> Thanks,
> >>> Koji
> >>>
> >>> On Thu, Jul 5, 2018 at 11:12 AM, Faisal Durrani <[email protected]>
> >>> wrote:
> >>> > Hi, I've got two questions
> >>> >
> >>> > 1. We are using a Remote Process Group with the RAW transport protocol
> >>> > to distribute the data across a four-node cluster. I see the NiFi app
> >>> > log has a lot of instances of the error below:
> >>> >
> >>> > o.a.nifi.remote.SocketRemoteSiteListener Unable to communicate with
> >>> > remote instance Peer[url=nifi://xxx-xxxxxx.prod.xx.:59528]
> >>> > (SocketFlowFileServerProtocol[CommsID=0bf887ed-acb3-4eea-94ac-5abf53ad0bf1])
> >>> > due to java.io.EOFException; closing connection
> >>> >
> >>> > These errors do not show on the bulletin board, nor do I see any data
> >>> > loss. I was curious to know if there is some bad configuration that is
> >>> > causing this to happen.
> >>> >
> >>> > 2. The app log also has the below error
> >>> >
> >>> > o.a.n.r.c.socket.EndpointConnectionPool EndpointConnectionPool[Cluster
> >>> > URL=[http://xxx-xxxxxx.prod.xx.local:9090/nifi-api]]
> >>> > Peer[url=nifi://ins-btrananifi107z.prod.jp.local:5001] indicates that
> >>> > port 417e3d23-5b1a-1616-9728-9d9d1a462646's destination is full;
> >>> > penalizing peer
> >>> >
> >>> > The data flow consumes a high volume of data and there is back pressure
> >>> > on almost all the connections, so that is probably what is causing it.
> >>> > I guess there isn't much we can do here, and once the back pressure
> >>> > resolves, the error goes away on its own. Please let me know your view.
> >>> >
> >>> >
>
