eylonronen commented on pull request #5299: URL: https://github.com/apache/nifi/pull/5299#issuecomment-898953890
Hi, from my experience with ES, you have a few options when communicating with the cluster:

1. Communicating with one of the data nodes - this is the least recommended option, as it is likely to cause very high CPU and heap usage on that node, and when ingesting/querying large amounts of data that node is bound to crash.
2. Using a coordinator node - this means still communicating with a single node, but one that does not store data, so its load is smaller. Yet here as well, with enough data (unfortunately I have been there), it too crashes.
3. Using an external load balancer such as Nginx or HAProxy - this is a legitimate option which guarantees zero crashing, unless your cluster is not large enough to handle that much querying/ingestion, in which case nothing will work for you :(
4. Load balancing in the client - by far the best option, because it has the same advantages as the previous one without the need to maintain additional server(s) and another technology.

Basically, at large scale the only valid options are 3 and 4, so to spare the need for a third-party load balancer I think this feature is important.

Also, if you look at the changes I made, you will see that I implemented round-robin load balancing (using Iterables.cycle()), which means the OkHttp client still works with a single URL every time, just a different one :)
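
To illustrate the round-robin idea described above, here is a minimal sketch. It is not the code from the PR: the PR uses Guava's Iterables.cycle(), while this version swaps in a plain index counter from the standard library so it runs without extra dependencies and stays safe if several threads ask for a host at once. The class name `RoundRobinHosts` and the host URLs are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not the PR's implementation): hands out host URLs
// in round-robin order so each request targets a single, rotating URL.
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinHosts(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Defensive, immutable copy so later changes to the caller's list have no effect.
        this.hosts = List.copyOf(hosts);
    }

    // Returns the next host; floorMod keeps the index valid even after the counter overflows.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        // Assumed example URLs, for illustration only.
        RoundRobinHosts hosts = new RoundRobinHosts(
                List.of("http://es1:9200", "http://es2:9200", "http://es3:9200"));
        for (int i = 0; i < 4; i++) {
            System.out.println(hosts.next());
        }
    }
}
```

Each call to `next()` yields one URL, so the HTTP client still talks to a single node per request; the fourth call wraps around to the first host again.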
