Hi Pradeep,

Thanks for sending the information about your flow. From looking at the
picture, I'm not sure I fully understand what the flow is doing...

It looks like there is a GetFile processor connected to a Remote Process
Group (RPG) which goes to an Input Port called app_logs, and I see the
app_logs Input Port on the graph then connected to the Output Port
Data_for_Spark. Then the RPG has a connection from Data_for_Spark to
Data_for_Spark, which seems like it's a circular connection.

Do you even need a Remote Process Group in this case?

You could have GetFile directly connected to an Output Port where Spark can
pull from. The only reason to use a Remote Process Group would be if you
are in a cluster and want to run GetFile only on the Primary Node, and then
redistribute the results to the other nodes in the cluster. If that is
what you want to do then using ListFile + FetchFile would be the
recommended approach.

ListFile -> Remote Process Group
Input Port -> FetchFile
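
To illustrate why that split helps, here is a rough conceptual model in
Python. It is not NiFi code and does not use any NiFi API: the function
names, the node names, and the round-robin distribution are all assumptions
made up for the sketch (the real Remote Process Group decides how to balance
the load itself). The point is only that the primary node lists file names,
and each node then fetches the content for the names it receives.

```python
from itertools import cycle


def list_files(directory_listing):
    """Primary node only: emit one lightweight record per file (name, no content)."""
    return [{"filename": name} for name in sorted(directory_listing)]


def distribute(records, nodes):
    """Hypothetical stand-in for the Remote Process Group.

    Spreads records across the cluster nodes round-robin (an assumption;
    the real distribution strategy is up to NiFi).
    """
    assignments = {node: [] for node in nodes}
    for record, node in zip(records, cycle(nodes)):
        assignments[node].append(record)
    return assignments


def fetch_file(record, directory_listing):
    """Every node: read the content for a record it received."""
    return directory_listing[record["filename"]]


# A fake directory: file name -> content.
files = {"a.log": b"alpha", "b.log": b"beta", "c.log": b"gamma"}
nodes = ["node1", "node2"]

assignments = distribute(list_files(files), nodes)
fetched = {
    node: [fetch_file(record, files) for record in records]
    for node, records in assignments.items()
}
```

Only the small listing records cross the Remote Process Group here; the
file content is read on whichever node ends up with the record.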

Hope this helps.

-Bryan


On Wed, May 25, 2016 at 1:32 PM, pradeepbill <[email protected]> wrote:

> Please see my dataflow.
>
> nifi.PNG
> <http://apache-nifi-developer-list.39713.n7.nabble.com/file/n10710/nifi.PNG>
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/site-to-site-communication-error-on-output-port-tp10698p10710.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>
