Thank you very much, Joe. Is there any recommendation on the maximum throughput a single input port can achieve, so that we can tell when we need to move to multiple input ports? Will it be a bottleneck at all, or are we likely to hit other bottlenecks first?
Is there any document/article I can read on how site-to-site works at a low level?

Regards,
Ali

On Wed, Sep 27, 2017 at 12:48 AM, Joe Witt <[email protected]> wrote:
> Ali
>
> 1) There are of course practical limits on how many input ports there
> can be. Each of them does generate threads to manage those sockets.
> However, many different edge systems can send to a single input port.
> You can also demux the streams of data using flow file attributes so
> there are various ways to tackle that. It wasn't tested against
> thousands of edge systems sending to a central cluster, as the more
> common model in such a case is not tons of spokes and one central
> hub but rather spokes sending to regional clusters, which send to
> central cluster(s). That said, it is likely it will work quite well.
>
> 2) Site-to-site has load balancing and fail-over built in. The
> s2s exchange that happens when the connection is established and over
> time is to share information about the cluster, how many nodes are in
> it, and their relative load. This allows the clients to do weighted
> distribution, detect new or removed nodes, etc.
>
> 3) No, you don't have to use the same version. Another huge
> benefit of s2s is that it was built with the recognition that it is
> not possible or even desirable to upgrade all systems at once across a
> large enterprise. The protocol involves both sides of an s2s transfer
> exchanging information about the flowfile/transfer protocols they
> support. So old NiFi sending to new NiFi and new NiFi sending to an
> old NiFi are able to come to a base agreement. The protocol and
> serialization have been quite stable but still the ability to evolve
> is baked in.
>
> Thanks
>
> On Tue, Sep 26, 2017 at 4:07 AM, Ali Nazemian <[email protected]> wrote:
> > Hi all,
> >
> > I am investigating the feasibility of using multiple NiFi clusters across
> > the world to send live traffic to a central NiFi cluster using
> > site-to-site. I have some questions regarding this matter:
> >
> > 1- Does it scale well? Are there any performance concerns/limitations I
> > need to consider? Can I send live traffic from thousands of NiFi clusters
> > to a single NiFi cluster (let's suppose the central one is a huge NiFi
> > cluster)? Is there any limitation on the number of input ports, for
> > example?
> >
> > 2- How can NiFi handle load balancing in the site-to-site situation? Is
> > that per session or flow or in a batch mode? I want to understand all the
> > situations we may face regarding network issues between different NiFi
> > clusters.
> >
> > 3- Do I need to use the same version of NiFi from the edge NiFis to the
> > central NiFis? Is there any chance that the site-to-site communication
> > changes so significantly that we need to upgrade all the NiFi instances
> > we will have, or is it pretty reliable now?
> >
> > Regards,
> > Ali

--
A.Nazemian
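
Point 2 of Joe's reply says the s2s exchange shares the node list and each node's relative load, so that clients can do weighted distribution and notice added or removed nodes. The Python sketch below only illustrates that idea; it is not NiFi's actual client code. The function names, the node names, and the inverse-load weighting formula are all assumptions made for the example.

```python
import random


def pick_peer(peer_loads, rng=random):
    """Pick one peer, favouring lightly loaded nodes.

    peer_loads maps a (hypothetical) node name to its reported load,
    for example a count of queued flow files. The inverse-load formula
    is an assumption for illustration, not NiFi's real weighting.
    """
    peers = list(peer_loads)
    # The +1 keeps the weight finite when a node reports zero load.
    weights = [1.0 / (peer_loads[p] + 1) for p in peers]
    return rng.choices(peers, weights=weights, k=1)[0]


def diff_peers(known, reported):
    """Compare the known node list with a fresh cluster report.

    Returns (added, removed) as sorted lists of node names.
    """
    known_set, reported_set = set(known), set(reported)
    return sorted(reported_set - known_set), sorted(known_set - reported_set)
```

A client built this way would call `diff_peers` each time it receives fresh cluster information, then call `pick_peer` for each transfer, so that a heavily loaded or removed node gradually stops receiving traffic.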

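Point 3 of Joe's reply says both sides of an s2s transfer exchange information about the protocol they support and come to a base agreement, which is why old and new NiFi instances can talk to each other. One common way to reach such an agreement is to pick the highest version both sides support. The sketch below shows only that general technique. The thread does not describe NiFi's real handshake, so the function name and the integer version numbers are hypothetical.

```python
def negotiate_version(local_supported, remote_supported):
    """Return the highest protocol version both sides support.

    Versions are modelled as plain integers (an assumption for this
    sketch). Returns None when the two sides share no version.
    """
    common = set(local_supported) & set(remote_supported)
    return max(common) if common else None
```

For example, a newer sender supporting versions 1 to 3 and an older receiver supporting versions 1 and 2 would settle on version 2 under this scheme, and neither side would need upgrading first.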