Hi Srimanth,

Thanks for responding. I've checked the logs and it seems that the shutdown event is received and the sources for some of our channels (we have 3 channels) log that they have stopped, but the agent just continues to run. For example, I can see entries like:
18 May 2016 08:13:21,142 INFO [agent-shutdown-hook] (com.aweber.flume.source.rabbitmq.RabbitMQSource.stop:117) - Stopping channel1-source
18 May 2016 08:13:21,142 INFO [agent-shutdown-hook] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:149) - Component type: SOURCE, name: channel1-source stopped

But it looks like it continues to process events. I can see entries like the following repeated over and over, and you can see this is around 30 minutes after it tried to stop:

18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.run:143) - Attributes for component SOURCE.channel1-source
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - EventReceivedCount = 36417
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - AppendBatchAcceptedCount = 0
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - EventAcceptedCount = 36417
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - AppendReceivedCount = 0
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - StartTime = 1463486595420
18 May 2016 08:47:06,778 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - AppendAcceptedCount = 0
18 May 2016 08:47:06,779 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - OpenConnectionCount = 2
18 May 2016 08:47:06,779 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - AppendBatchReceivedCount = 0
18 May 2016 08:47:06,779 INFO [pool-57-thread-1] (org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink$TimelineMetricsCollector.processComponentAttributes:163) - StopTime = 1463555601142

Is this normal behavior?

We are using this plugin: https://github.com/aweber/rabbitmq-flume-plugin

I have thought about switching to this plugin to see if the problem goes away: https://github.com/jcustenborder/flume-ng-rabbitmq
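As an aside, and purely to illustrate what I suspect is happening (the class and method names below are invented for this sketch and are not taken from the plugin or from Ambari's metrics sink): a scheduled thread pool whose threads are non-daemon keeps the JVM alive after the shutdown hook has run unless something calls shutdown() on it, which would look a lot like those pool-57-thread-1 entries still appearing after StopTime.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StuckAgentDemo {

    // A scheduled pool like the one behind "pool-57-thread-1"; the default
    // thread factory creates NON-daemon threads.
    private final ScheduledExecutorService metricsPool = Executors.newScheduledThreadPool(1);

    public void start() {
        metricsPool.scheduleAtFixedRate(
                () -> System.out.println("Attributes for component SOURCE.channel1-source"),
                0, 5, TimeUnit.SECONDS);
    }

    // Mimics a stop() that logs "stopped" but never shuts the pool down.
    public void stop(boolean shutDownPool) {
        System.out.println("Component type: SOURCE, name: channel1-source stopped");
        if (shutDownPool) {
            metricsPool.shutdownNow(); // without this the JVM keeps running
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StuckAgentDemo demo = new StuckAgentDemo();
        demo.start();
        Thread.sleep(12_000);
        // Run with no arguments to reproduce the hang: main() returns,
        // but the non-daemon pool thread keeps the process alive.
        demo.stop(args.length > 0);
    }
}

If something like that is going on inside the agent, a jstack of the stuck process during the 45 minutes should show which non-daemon threads are still alive.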
Thanks!

On Tue, May 17, 2016 at 5:29 PM, Srimanth Gunturi <[email protected]> wrote:

> Hello,
>
> Could you please describe the setup a little bit more? Are the 12 flume agents
> on 12 different hosts or on a single host?
>
> Also, have you looked at the flume logs for those 2 agents to
> determine what is going on during the 45 minutes?
>
> Regards,
>
> Srimanth
>
> ------------------------------
> *From:* cs user <[email protected]>
> *Sent:* Tuesday, May 17, 2016 4:44 AM
> *To:* [email protected]
> *Subject:* Flume - always unable to stop 2 flume agents
>
> Hello,
>
> We have 12 flume agents. Whenever we change the config and need to restart
> the affected nodes, we always end up with 2 flume agents which refuse to
> stop; it takes multiple attempts (sometimes as long as 45 minutes) to
> eventually stop the agents. You have to keep trying to restart them.
>
> Has anyone else seen this? Is there a workaround?
>
> Thanks!
