Hello, I have an issue trying to bootstrap an embedded flume node within a custom service. I'm using the same sort of start physical node, configure and spawn logical node methodology that flumebase employs:
https://github.com/flumebase/flumebase/blob/master/src/main/java/com/odiago/flumebase/flume/EmbeddedFlumeConfig.java When using the cdh3u1 release and configured to either use an embedded or an external flume master, I get (upon first time execution) a "Hearbeats are backing up" log message, followed by what looks to be the decommissioning of my logical node ("Forcing driver to exit uncleanly") Subsequent manual configuration of the embedded node through the master with the same sink (custom plugin) and source (rpcSource(54010)) picks up the configuration correctly and the embedded node works fine. I'm trying to eliminate or understand the root cause of the "Forcing driver to exit uncleanly" message when the logical node is configured and the spawned into the physical node for the first time. -----------Log snippet--------------- 01:59:02.457 [Heartbeat] INFO c.e.t.flume.TimeSeriesAggregatorSink - EventSink builder() : building event sink 01:59:02.457 [Heartbeat] INFO c.c.flume.agent.LogicalNodeManager - creating new logical node timeseries-aggregator-host.local.domain 01:59:02.470 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (HELLO,Wed Dec 31 19:00:00 EST 1969) 01:59:02.472 [logicalNode timeseries-aggregator-host.local.domain-32] INFO c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on port 54010... 01:59:02.475 [logicalNode timeseries-aggregator-host.local.domain-32] DEBUG c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource | TimeSeriesAggregatorSink 01:59:02.479 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:02.475 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 01:59:02.480 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - Attempt to load config com.cloudera.flume.conf.FlumeConfigData@449afc 01:59:02.480 [Check config] INFO c.e.t.flume.TimeSeriesAggregatorSink - EventSink builder() : building event sink 01:59:07.482 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:07.484 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:12.487 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:12.489 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 1 heartbeats 01:59:12.490 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:17.493 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:17.494 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 2 heartbeats 01:59:17.495 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:22.501 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:22.502 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 3 heartbeats 01:59:22.502 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:27.505 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:27.506 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 4 heartbeats 01:59:27.506 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:32.509 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:32.511 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 5 heartbeats 01:59:32.511 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:37.514 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:37.516 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 6 heartbeats 01:59:37.516 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:42.519 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:42.521 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 7 heartbeats 01:59:42.521 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:47.524 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:47.526 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 8 heartbeats 01:59:47.526 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:52.529 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:52.532 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 9 heartbeats 01:59:52.532 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 01:59:57.535 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Wed Dec 31 19:00:00 EST 1969) 01:59:57.537 [Heartbeat] WARN c.c.flume.agent.LivenessManager - Heartbeats are backing up, currently behind by 10 heartbeats 01:59:57.537 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:02.480 [Check config] ERROR com.cloudera.flume.agent.LogicalNode - Forcing driver to exit uncleanly 02:00:02.481 [Check config] INFO com.cloudera.flume.agent.LogicalNode - Node config successfully set to com.cloudera.flume.conf.FlumeConfigData@449afc 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.481 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.481 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.482 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.482 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.483 [Check config] DEBUG c.c.flume.agent.LivenessManager - Taking another heartbeat 02:00:02.483 [Check config] DEBUG com.cloudera.flume.agent.LogicalNode - do nothing: retrieved config (Thu Aug 11 01:58:57 EDT 2011) same as current (Thu Aug 11 01:58:57 EDT 2011). 02:00:02.483 [Thread-4] DEBUG c.c.flume.reporter.ReportEvent - Ignoring null string metric message Aug 11, 2011 2:00:02 AM com.twitter.common.application.modules.AppLauncherModule$1 uncaughtException SEVERE: Uncaught exception from Thread[Thread-4,5,main] java.lang.NullPointerException at com.cloudera.flume.handlers.thrift.ThriftEventSource.getMetrics(ThriftEventSource.java:99) at com.cloudera.flume.core.EventSource$Base.getReports(EventSource.java:186) at com.cloudera.flume.agent.LogicalNode.getReports(LogicalNode.java:309) at com.cloudera.flume.reporter.MasterReportPusher$PusherThread.querySrcSinkReports(MasterReportPusher.java:102) at com.cloudera.flume.reporter.MasterReportPusher$PusherThread.sendReports(MasterReportPusher.java:110) at com.cloudera.flume.reporter.MasterReportPusher$PusherThread.run(MasterReportPusher.java:119) 02:00:02.487 [logicalNode timeseries-aggregator-host.local.domain-34] INFO c.c.f.h.thrift.ThriftEventSource - Starting blocking thread pool server on port 54010... 02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] ERROR o.a.t.server.TSaneThreadPoolServer - Error occurred during listening. org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:54010. at org.apache.thrift.transport.TSaneServerSocket.bind(TSaneServerSocket.java:110) ~[flume-core-0.9.4-cdh3u1.jar:0.6.1] at org.apache.thrift.transport.TSaneServerSocket.listen(TSaneServerSocket.java:116) ~[flume-core-0.9.4-cdh3u1.jar:0.6.1] at org.apache.thrift.server.TSaneThreadPoolServer.start(TSaneThreadPoolServer.java:162) ~[flume-core-0.9.4-cdh3u1.jar:0.6.1] at com.cloudera.flume.handlers.thrift.ThriftEventSource.open(ThriftEventSource.java:151) [flume-core-0.9.4-cdh3u1.jar:na] at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:87) [flume-core-0.9.4-cdh3u1.jar:na] 02:00:02.488 [logicalNode timeseries-aggregator-host.local.domain-34] DEBUG c.c.f.core.connector.DirectDriver - Starting driver ThriftEventSource | TimeSeriesAggregatorSink 02:00:02.539 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:02.539 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:02.555 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR c.c.f.core.connector.DirectDriver - Closing down due to exception during append calls java.io.IOException: Waiting for queue element was interrupted! null at com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222) ~[flume-core-0.9.4-cdh3u1.jar:na] at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105) ~[flume-core-0.9.4-cdh3u1.jar:na] Caused by: java.lang.InterruptedException: null at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) ~[na:1.6.0_22] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094) ~[na:1.6.0_22] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423) ~[na:1.6.0_22] at com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209) ~[flume-core-0.9.4-cdh3u1.jar:na] ... 1 common frames omitted 02:00:02.557 [logicalNode timeseries-aggregator-host.local.domain-32] INFO c.c.f.core.connector.DirectDriver - Connector logicalNode timeseries-aggregator-host.local.domain-32 exited with error: Waiting for queue element was interrupted! null java.io.IOException: Waiting for queue element was interrupted! null at com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:222) ~[flume-core-0.9.4-cdh3u1.jar:na] at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:105) ~[flume-core-0.9.4-cdh3u1.jar:na] Caused by: java.lang.InterruptedException: null at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) ~[na:1.6.0_22] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2094) ~[na:1.6.0_22] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:423) ~[na:1.6.0_22] at com.cloudera.flume.handlers.thrift.ThriftEventSource.next(ThriftEventSource.java:209) ~[flume-core-0.9.4-cdh3u1.jar:na] ... 1 common frames omitted 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO c.c.f.h.thrift.ThriftEventSource - Closed server on port 54010... 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] INFO c.c.f.h.thrift.ThriftEventSource - Queue still has 0 elements ... 02:00:02.558 [logicalNode timeseries-aggregator-host.local.domain-32] ERROR c.c.f.core.connector.DirectDriver - Exiting driver logicalNode timeseries-aggregator-host.local.domain-32 in error state ThriftEventSource | TimeSeriesAggregatorSink because Waiting for queue element was interrupted! null 02:00:07.540 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:07.541 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:12.542 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:12.542 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:17.545 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:17.546 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:22.549 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:22.549 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} 02:00:27.552 [pool-5-thread-1] DEBUG c.c.flume.master.StatusManager - Heartbeat from host:host.local.domain logicalnode:timeseries-aggregator-host.local.domain (ACTIVE,Thu Aug 11 01:58:57 EDT 2011) 02:00:27.553 [Heartbeat] DEBUG c.cloudera.flume.agent.WALAckManager - agent acks waiting for master: {} ---------------------------
