eylonronen commented on pull request #5299:
URL: https://github.com/apache/nifi/pull/5299#issuecomment-898953890


   Hi, from my experience with ES, you have a few options when communicating 
with the cluster:
   1. Communicating with one of the data nodes - this option is the least 
recommended, as it is likely to cause very high CPU and heap usage, and when 
ingesting/querying large amounts of data that data node is bound to crash
   2. Using a coordinator node - This means still communicating with a single 
node, but one that does not store data, so its load is smaller. Yet here as 
well, with enough data (unfortunately I have been there), it too crashes
   3. Using an external load balancer such as Nginx or HAProxy - It is a 
legitimate option that avoids crashing any single node, unless your cluster is 
not large enough to handle that much querying/ingestion, in which case no 
option will work for you :(
   4. Load balancing in the client - by far the best option, because it has the 
same advantages as the previous one without the need to maintain additional 
servers and another technology
   
   Basically, at large scale the only valid options are 3 and 4, so to avoid 
the need for a third-party load balancer I think this feature is important. 
Also, if you look at the changes I made, you will see that I implemented round 
robin load balancing (using Iterables.cycle()), which means the OkHttp client 
still works with a single URL every time, just a different one each time :)
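
   To illustrate the round robin idea, here is a minimal sketch. It is not the 
code from the PR: instead of Guava's Iterables.cycle() it uses only the JDK and 
an atomic counter so that one instance can be shared between threads. The class 
name RoundRobinHosts and the host URLs are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not the class from the PR): returns one host URL per
// call, cycling through the configured list in order.
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinHosts(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = List.copyOf(hosts); // defensive, immutable copy
    }

    String next() {
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(index);
    }
}
```

   A caller would ask for next() before building each request, for example 
with hosts "http://es1:9200" and "http://es2:9200" (made-up names), so the HTTP 
client still sees exactly one URL per request.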


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

