Nikolaos Tsipas created FLUME-2230:
--------------------------------------
Summary: Configuring elasticsearch-sink hostNames parameter in AWS
Key: FLUME-2230
URL: https://issues.apache.org/jira/browse/FLUME-2230
Project: Flume
Issue Type: Question
Environment: AWS, centos 6
Reporter: Nikolaos Tsipas
Hello,
We are using flume elasticsearch-sink in AWS and for the {{hostNames}}
parameter of the sink we use the A record of an internal ELB (Elastic Load
Balancer). When we do a {{nslookup}} on the load balancer's hostname we get
back a list of node IPs (in our case we have 3 elasticsearch nodes).
*config example*
{code}
a1.sinks.elasticsearch-sink.hostNames = componen-1RSEO3YX5OD3Z-729046292.eu-west-1.elb.amazonaws.com
{code}
This config works fine as long as the IPs of the elasticsearch nodes remain the
same. If we restart one of our elasticsearch nodes, a new IP is assigned to it
and flume stops being able to communicate with that node.
In the source code of the elasticsearch-sink I can see that a list of
{{InetSocketTransportAddress}} objects is created from the configured host
names. This is probably why flume stops working after an IP change and only
starts working again after a restart of the flume-ng-agent service.
*ElasticSearchSink.java*
{code}serverAddresses[i] = new InetSocketTransportAddress(host, port);{code}
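To illustrate the suspected cause, here is a minimal sketch using only
{{java.net.InetSocketAddress}}. It assumes that {{InetSocketTransportAddress}}
wraps an {{InetSocketAddress}} built from the host name, which I have not
verified for every Elasticsearch version. The class name and port 9300 are
made up for the example.

```java
import java.net.InetSocketAddress;

// Hypothetical demo class; not part of Flume or Elasticsearch.
class ResolveAtConstruction {

    // The (host, port) constructor resolves the host name immediately and
    // keeps the resulting InetAddress for the lifetime of the object.
    static InetSocketAddress eager(String host, int port) {
        return new InetSocketAddress(host, port);
    }

    // createUnresolved() stores only the name; no lookup happens here.
    static InetSocketAddress deferred(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress a = eager("localhost", 9300);
        InetSocketAddress b = deferred("localhost", 9300);
        System.out.println("eager is resolved:    " + !a.isUnresolved());
        System.out.println("deferred is resolved: " + !b.isUnresolved());
    }
}
```

If that assumption holds, an address object created once keeps pointing at
the IP that was current at creation time, which would match the behaviour
we see.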
+Questions+:
* What is the suggested configuration for our case? Should we use static IPs
for our elasticsearch nodes and then put a comma-separated list of these IPs
in the flume configuration?
* Would it be possible to use the A record of the ELB in such a way that flume
would always look up the A record again to get one of the currently available
IP addresses? Does this sound feasible, and is it worth spending some time on
submitting a patch?
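For the second question, a rough sketch of what I have in mind is below. It is
not taken from the sink; {{HostRefresher}}, {{resolveAll}} and {{changed}} are
hypothetical names, and port 9300 is only assumed. The idea is to look up the
host name again, compare the result with the addresses currently in use, and
rebuild the client only when they differ. Note that the JVM caches DNS lookups
(see the {{networkaddress.cache.ttl}} security property), so a repeated lookup
may not see a change right away.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper; not part of Flume's ElasticSearchSink.
class HostRefresher {

    // Looks up every address currently published for the host name.
    static List<InetSocketAddress> resolveAll(String host, int port)
            throws UnknownHostException {
        List<InetSocketAddress> result = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            result.add(new InetSocketAddress(address, port));
        }
        return result;
    }

    // True when the two lookups do not contain the same set of addresses.
    static boolean changed(List<InetSocketAddress> before,
                           List<InetSocketAddress> after) {
        return !new HashSet<>(before).equals(new HashSet<>(after));
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        List<InetSocketAddress> first = resolveAll(host, 9300);
        List<InetSocketAddress> second = resolveAll(host, 9300);
        System.out.println("addresses: " + first);
        System.out.println("changed between lookups: " + changed(first, second));
    }
}
```

A patch along these lines would still need to decide when to trigger the
lookup, for example on a timer or after a failed request.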
Regards,
Nick