One additional thing: we actually have two ES sinks pointing to the same cluster. The config looks more like this:

(inbound avro -> FC1 -> ElasticSearch)
(inbound avro -> FC2 -> S3/HDFS)
(inbound avro_2 -> FC3 -> ElasticSearch)
(inbound avro_2 -> FC4 -> S3/HDFS)

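In properties form, that layout would look roughly like the sketch below. This is illustrative only: the component names (avro1, fc1, es1, hdfs1, ...), the ports, and the HDFS path are made up, and I am assuming "FC" means file channel and that each source replicates into both of its channels. Only the ES sink type, hostNames, indexName, and clusterName come from our real config.

```properties
# Sketch only: component names, ports, and the HDFS path are placeholders.
agent.sources = avro1 avro2
agent.channels = fc1 fc2 fc3 fc4
agent.sinks = es1 hdfs1 es2 hdfs2

# Each Avro source feeds two channels (replicating is the default selector).
agent.sources.avro1.type = avro
agent.sources.avro1.bind = 0.0.0.0
agent.sources.avro1.port = 4141
agent.sources.avro1.channels = fc1 fc2

agent.sources.avro2.type = avro
agent.sources.avro2.bind = 0.0.0.0
agent.sources.avro2.port = 4142
agent.sources.avro2.channels = fc3 fc4

# Assuming file channels; each one needs its own checkpointDir and
# dataDirs, which are omitted here.
agent.channels.fc1.type = file
agent.channels.fc2.type = file
agent.channels.fc3.type = file
agent.channels.fc4.type = file

# Two ES sinks, both pointing at the same cluster.
agent.sinks.es1.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es1.hostNames = xxx:9300
agent.sinks.es1.indexName = flume
agent.sinks.es1.clusterName = flume-elasticsearch-production-useast1
agent.sinks.es1.channel = fc1

agent.sinks.es2.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es2.hostNames = xxx:9300
agent.sinks.es2.indexName = flume
agent.sinks.es2.clusterName = flume-elasticsearch-production-useast1
agent.sinks.es2.channel = fc3

# Two HDFS sinks; the path is a placeholder.
agent.sinks.hdfs1.type = hdfs
agent.sinks.hdfs1.hdfs.path = hdfs://namenode/flume/events
agent.sinks.hdfs1.channel = fc2

agent.sinks.hdfs2.type = hdfs
agent.sinks.hdfs2.hdfs.path = hdfs://namenode/flume/events
agent.sinks.hdfs2.channel = fc4
```
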
Matt Wise
Sr. Systems Architect
Nextdoor.com

On Thu, Apr 10, 2014 at 9:02 AM, Matt Wise <[email protected]> wrote:

> We use Flume 1.4 to pass logs into HDFS as well as ElasticSearch for
> storage. The pipeline looks roughly like this:
>
> Client to Server Flow...
> (local_app -> local_host_flume_agent) ---- AVRO/SSL ---->
> (remote_flume_agent)...
>
> Agent Server Flow ...
> (inbound avro -> FC1 -> ElasticSearch)
> (inbound avro -> FC2 -> S3/HDFS)
>
>
> In the last week we've made a few changes and now we're seeing a bit of a
> problem. We've seen 3 different occurrences of a single flume agent server
> node beginning to back up its FC1 channel indefinitely until we log in and
> restart Flume entirely. The data just stops flowing -- we can't find any
> errors in the logs on either the ES or Flume side. A simple restart of
> Flume fixes it.
>
> Our sink config looks like this:
>
>> agent.sinks.elasticsearch.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
>> agent.sinks.elasticsearch.hostNames = xxx:9300
>> agent.sinks.elasticsearch.indexName = flume
>> agent.sinks.elasticsearch.clusterName = flume-elasticsearch-production-useast1
>> agent.sinks.elasticsearch.batchSize = 1000
>> agent.sinks.elasticsearch.ttl = 30
>> agent.sinks.elasticsearch.serializer = org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer
>> agent.sinks.elasticsearch.channel = fc-unstructured-es
>
>
> This ONLY happens at Midnight, and only happens on one flume server. I'm
> wondering whether it has to do with the time it takes our ES nodes to
> create a new index ... and the first flume agent that triggers "index
> creation" could be getting blocked or stuck?
>
> Matt Wise
> Sr. Systems Architect
> Nextdoor.com
>
