Paresh,

Mark can correct me if I'm wrong, but I believe the information
fetched in step 1 is kept in memory on each node where the RPG is
running. That information is then periodically refreshed by a
background thread.
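
To make that refresh pattern concrete, here is a rough, purely
conceptual sketch in Python (not NiFi's actual code; the refresh
interval and the fetch_nodes callable are made-up placeholders):

    import threading
    import time

    class ClusterInfoCache:
        # In-memory node list, refreshed periodically in the background.
        def __init__(self, fetch_nodes, refresh_interval_secs=30):
            self._fetch_nodes = fetch_nodes      # placeholder: returns the current node list
            self._interval = refresh_interval_secs
            self._nodes = fetch_nodes()          # initial fetch (step 1)
            self._lock = threading.Lock()
            threading.Thread(target=self._refresh_loop, daemon=True).start()

        def _refresh_loop(self):
            while True:
                time.sleep(self._interval)
                try:
                    fresh = self._fetch_nodes()  # picks up added/removed nodes
                    with self._lock:
                        self._nodes = fresh
                except Exception:
                    pass                         # keep the last known list on failure

        def nodes(self):
            with self._lock:
                return list(self._nodes)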

When data is flowing through, the RPG distributes it to the nodes
in a round-robin manner, in batches according to the batch size
configuration. If it knows a node is down, I believe it will not send
any data to that node until it is back up, and if it thinks a node is
up but it fails to send data to it, then it will try another node.
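
That distribution behavior could be sketched roughly like this (again
just an illustration in Python, not NiFi internals; the node dicts,
their "up" flag, and send_batch are assumed placeholders):

    import itertools

    def distribute(flowfiles, nodes, send_batch, batch_size=100):
        # Round-robin batches across nodes, skipping known-down nodes and
        # retrying a batch on another node if a send fails.
        ring = itertools.cycle(nodes)
        batch = []
        for ff in flowfiles:
            batch.append(ff)
            if len(batch) >= batch_size:
                _send_with_failover(batch, ring, len(nodes), send_batch)
                batch = []
        if batch:
            _send_with_failover(batch, ring, len(nodes), send_batch)

    def _send_with_failover(batch, ring, node_count, send_batch):
        for _ in range(node_count):
            node = next(ring)
            if not node.get("up", True):
                continue                 # known-down node: skip it
            try:
                send_batch(node, batch)  # placeholder for the actual transfer
                return
            except IOError:
                node["up"] = False       # mark it down and try the next node
        raise RuntimeError("no reachable nodes")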

The URL in the RPG should accept a comma-separated list of multiple
URLs, but as Mark mentioned this would only be used the first time you
start the RPG, or if a node restarted. For example, say you entered
the URL as "node1,node2" and node3 then restarts while node1 is down:
it would try node1 to get cluster info and fail, then try node2 and
succeed.
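
For concreteness, the URL field could look something like this (the
hostnames and port here are just examples for a typical unsecured
setup):

    http://node1:8080/nifi,http://node2:8080/nifi

The first URL in that list that responds is only used to bootstrap the
cluster listing; after that the RPG works from the node list it
retrieved and keeps refreshing in the background.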

-Bryan


On Mon, Jun 4, 2018 at 1:24 PM, Paresh Shah <[email protected]> wrote:
> Mark
>
> Want some more clarity. Let me see if I understand this. Just to be clear we 
> are using RPG purely for load balancing on the same cluster.
>
> Step 1: When it initially connects to Node1, it would fetch all the cluster 
> details, i.e. it would know all the nodes that exist in my cluster, which are
> Node1
> Node2
> Node3
>
> Question: Where is this information persisted, and does it resolve this every 
> time flow files are sent to the RPG (remote process group)?
>
> Step 2: Now when Node1 goes down it would try to establish communication with 
> one of the following nodes which it had retrieved and stored initially or as 
> a background task.
> Node2
> Node3
>
> Question: Does this update the persisted information in Step 1? Is there any 
> way to update the actual URL for the RPG? Basically we do not want every 
> incoming flow file on the RPG to end up selecting the target node.
>
> Thanks
> Paresh
>
> On 6/3/18, 9:09 AM, "Mark Payne" <[email protected]> wrote:
>
>     Paresh,
>
>     When NiFi establishes a connection to the remote instance, it will 
> request information from the remote instance about all nodes in the cluster. 
> It then persists this information in case NiFi is restarted. So whichever 
> node you use in your URL is only important for the initial connection. 
> Additionally, NiFi will periodically reach out to the remote NiFi instances 
> to determine which nodes are in the cluster, in case nodes are added to or 
> removed from the cluster.
>
>     Does that all make sense?
>
>     Thanks
>     -Mark
>
>     Sent from my iPhone
>
>     > On Jun 3, 2018, at 11:15 AM, Paresh Shah <[email protected]> 
> wrote:
>     >
>     > I have a cluster with 3 nodes. We are using RPG for load balancing.
>     >
>     > Node1 ( primary and cluster coordinator ).
>     > Node2
>     > Node3
>     >
>     > When configuring the RPG I use Node1 as the target URL. My question is 
> what happens to this RPG when Node1 goes down or is offline. At that 
> point how does the RPG keep functioning, since we cannot update the URL once 
> it's created?
>     >
>     > Thanks
>     > Paresh
>     >
>
>
