Greg,

Again, I have to apologize.  You're right, the host:port values in ZK are
for the cluster node protocol, not the NiFi UI.  Also, I was told those
nodes in ZK are created by Curator; NiFi isn't explicitly creating them, so
I'd be hesitant to rely on that information.  The cluster node protocol is
production-stable, in my opinion, but it's not part of the public API.
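If you do end up reading those znodes anyway, the payload is just a
host:port string for the cluster node protocol port.  A rough sketch of
parsing it (the hostname and port below are made-up examples, not from a
real cluster):

```python
# Sketch: parse the data stored on a /nifi/leaders/* znode.
# Assumes the payload is a plain "host:port" string, where the port is the
# cluster node *protocol* port (nifi.cluster.node.protocol.port), not the
# UI/REST port -- which is exactly why it isn't useful for REST clients.

def parse_cluster_address(data):
    """Split a znode payload like b'nifi-node1.example.com:11443'
    into a (host, protocol_port) tuple."""
    text = data.decode("utf-8").strip()
    host, _, port = text.rpartition(":")
    return host, int(port)
```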

I created a feature-request JIRA to expose the hostnames and UI ports of
the nodes in a NiFi cluster [1].

[1] https://issues.apache.org/jira/browse/NIFI-3237
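Until something like that lands, a client can fail over on its own.  Here's
a hedged sketch under the assumption that you configure the node list
yourself; the hostnames, the endpoint path, and the fetch helper are
illustrative, not part of any NiFi API:

```python
# Sketch: client-side failover across a static list of NiFi nodes.
# The node list and URLs below are illustrative assumptions; NiFi itself
# does not provide this helper.
from urllib.request import urlopen

NIFI_NODES = ["nifi1.example.com:8080", "nifi2.example.com:8080"]

def call_rest_api(path, nodes=NIFI_NODES, fetch=None):
    """Try each node in order; return the first successful response body."""
    fetch = fetch or (lambda url: urlopen(url, timeout=5).read())
    last_error = None
    for node in nodes:
        try:
            return fetch("http://%s%s" % (node, path))
        except OSError as exc:   # connection refused, timeout, DNS failure
            last_error = exc     # this node is down; try the next one
    raise RuntimeError("no NiFi node reachable") from last_error
```

Injecting `fetch` keeps the failover logic testable without a live cluster,
and a real client would also want to re-order the list so a known-bad node
isn't retried first every time.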

On Tue, Dec 20, 2016 at 2:18 PM Hart, Greg <[email protected]>
wrote:

> Hi Jeff,
>
> I saw this and looked into it. The data in those nodes are
> the nifi.cluster.node.address and nifi.cluster.node.protocol.port values.
> In order to get the nifi.web.http.host and nifi.web.http.port values, it
> seems I would have to connect first using the cluster node protocol and
> pretend to be a NiFi node so that I can query the cluster coordinator for
> the list of NodeIdentifier objects. Is this cluster node protocol stable
> enough to use in a production application? It doesn’t seem to be documented
> anywhere so I was assuming it may change in a minor release without much
> notice.
>
> Thanks!
> -Greg
>
> From: Jeff <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, December 20, 2016 at 11:10 AM
>
> To: "[email protected]" <[email protected]>
> Subject: Re: Load-balancing web api in cluster
>
> Greg,
>
> That first statement in my previous email should read "which nodes can be
> the primary or cluster coordinator".  I apologize for any confusion!
>
> - Jeff
>
> On Tue, Dec 20, 2016 at 2:04 PM Jeff <[email protected]> wrote:
>
> Greg,
>
> NiFi does store which nodes are the primary and coordinator.  Relevant
> nodes in ZK are (for instance, in a cluster I'm running locally):
> /nifi/leaders/Primary Node/_c_c94f1eb8-e5ac-443c-9643-2668b6f685b2-lock-0000000553,
> /nifi/leaders/Primary Node/_c_7cd14bd5-85f5-4ea9-b849-121496269ef4-lock-0000000554,
> /nifi/leaders/Primary Node/_c_99b79311-495f-4619-b316-9e842d445a8d-lock-0000000552,
> /nifi/leaders/Cluster Coordinator/_c_dc449a75-1a14-42d6-98ab-2cef3e74d616-lock-0000005967,
> /nifi/leaders/Cluster Coordinator/_c_2fbc68df-c9cd-4ecd-99d2-234b7b801110-lock-0000005966,
> /nifi/leaders/Cluster Coordinator/_c_a2b9c2be-c0fd-4bf7-a479-e011a7792fc3-lock-0000005968
>
> The data on each of these nodes should have the host:port.  These are the
> candidate nodes for being elected the Primary or Cluster Coordinator.  I
> don't think that the current active Primary and Cluster Coordinator is
> stored in ZK, just the nodes that are candidates to fulfill those roles.
> I'll have to get back to you on that for sure, though.
>
> - Jeff
>
> On Tue, Dec 20, 2016 at 1:45 PM Hart, Greg <
> [email protected]> wrote:
>
> Hi Jeff,
>
> My application communicates with the NiFi REST API to import templates,
> instantiate flows from templates, edit processor properties, and a few
> other things. I’m currently using Jersey to send calls to one NiFi node but
> if that node goes down then my application has to be manually reconfigured
> with the hostname and port of another NiFi node. HAProxy would handle
> failover but it still must be manually reconfigured when a NiFi node is
> added or removed from the cluster.
>
> I was hoping that NiFi would use ZooKeeper similarly to other applications
> (Hive or HBase), where a client can easily get the hostname and port of the
> cluster coordinator (or active master). Unfortunately, the information in
> ZooKeeper does not include the nifi.web.http.host and nifi.web.http.port
> values of any NiFi node.
>
> It sounds like HAProxy might be the better solution for now. Luckily,
> adding or removing nodes from a cluster shouldn’t be a daily occurrence. If
> you have any other ideas please let me know.
>
> Thanks!
> -Greg
>
> From: Jeff <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Tuesday, December 20, 2016 at 8:56 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Load-balancing web api in cluster
>
> Hello Greg,
>
> You can use the REST API on any of the nodes in the cluster.  Could you
> provide more details on what you're trying to accomplish?  If, for
> instance, you are posting data to a ListenHTTP processor and you want to
> balance POSTs across the instances of ListenHTTP on your cluster, then
> haproxy would probably be a good idea.  If you're trying to distribute the
> processing load once the data is received, you can use a Remote Process
> Group to distribute the data across the cluster.  Pierre Villard has
> written a nice blog about setting up a cluster and configuring a flow using
> a Remote Process Group to distribute the processing load [1].  It details
> creating a Remote Process Group to send data back to an Input Port in the
> same NiFi cluster, and allows NiFi to distribute the processing load across
> all the nodes in your cluster.
>
> You can use a combination of haproxy and Remote Process Group to load
> balance connections to the REST API on each NiFi node and to balance the
> processing load across the cluster.
>
> [1] https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/
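>
> For the haproxy side of that, a minimal config sketch might look like the
> following (hostnames, ports, and the health-check path are examples to
> adjust for your cluster):
>
> ```
> frontend nifi_api
>     bind *:9090
>     default_backend nifi_nodes
>
> backend nifi_nodes
>     balance roundrobin
>     option httpchk GET /nifi-api/controller
>     server nifi1 nifi1.example.com:8080 check
>     server nifi2 nifi2.example.com:8080 check
>     server nifi3 nifi3.example.com:8080 check
> ```
>
> The `check` keyword makes haproxy stop routing to a node that fails the
> health check, which covers the failover case, though adding or removing
> nodes still means editing this file.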
>
> - Jeff
>
> On Mon, Dec 19, 2016 at 9:25 PM Hart, Greg <
> [email protected]> wrote:
>
> Hi all,
>
> What's the recommended way for communicating with the NiFi REST API in a
> cluster? I see that NiFi uses ZooKeeper so is it possible to get the
> Cluster Coordinator hostname and API port from ZooKeeper, or should I use
> something like haproxy?
>
> Thanks!
> -Greg
>
>
