Hmm, overriding the implementation of getLifecycleState provided by AbstractSource could work. It would, however, go against the convention that has been maintained in all other components (that I can think of).
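
To make the override idea concrete, here is a rough sketch. It assumes the source keeps a reference to its worker thread; LifecycleState and AbstractSourceStandIn are simplified stand-ins written for this example rather than the real Flume classes, and ThreadBackedSource is a hypothetical name.

```java
// Simplified stand-in for Flume's lifecycle state enum (assumption for this sketch).
enum LifecycleState { IDLE, START, STOP, ERROR }

// Simplified stand-in for AbstractSource: it only records the last requested state.
abstract class AbstractSourceStandIn {
    private volatile LifecycleState lifecycleState = LifecycleState.IDLE;

    public void start() { lifecycleState = LifecycleState.START; }
    public void stop() { lifecycleState = LifecycleState.STOP; }
    public LifecycleState getLifecycleState() { return lifecycleState; }
}

// Hypothetical event-driven source that owns a single worker thread.
class ThreadBackedSource extends AbstractSourceStandIn {
    private final Thread worker;

    ThreadBackedSource(Runnable work) {
        this.worker = new Thread(work, "source-worker");
    }

    @Override
    public void start() {
        // Start the worker before recording START so the override below
        // never sees START paired with a thread that has not been started.
        worker.start();
        super.start();
    }

    @Override
    public LifecycleState getLifecycleState() {
        LifecycleState recorded = super.getLifecycleState();
        // Report ERROR when the source is supposed to be running but its worker has died.
        if (recorded == LifecycleState.START && !worker.isAlive()) {
            return LifecycleState.ERROR;
        }
        return recorded;
    }

    // Helper for demonstrations: block until the worker thread has exited.
    void awaitWorkerExit() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Whether the supervisor would actually restart a component that reports ERROR this way is not something the sketch shows; it only illustrates where the liveness check would sit.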

On 01/17/2013 01:20 PM, Brock Noland wrote:
Hi,

Yes I can definitely see the issue. It sucks that we'd have to add yet
another thread. An alternative which wouldn't require another thread
would be to check the optional interface in the supervisor,
approximately here:

https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/lifecycle/LifecycleSupervisor.java#L240

However, I am not sold on the supervisor being the best place to fix
this as I am not sure that other lifecycle components would need this.



Brock

On Wed, Jan 16, 2013 at 7:45 PM, Juhani Connolly
<[email protected]> wrote:
I came upon an issue with ScribeSource, though it's theoretically
applicable to any EventDrivenSource whose event-generating thread(s) die.
Simply put, sending a bad packet to the thrift (scribe protocol) port will
result in it trying to allocate space for some arbitrarily large packet,
resulting in an OutOfMemoryError which kills the thread (incidentally, I
thought this would be an issue in avro too, but it throws an exception
before making excessive allocation requests).

As far as Flume is concerned, the component is still alive. stop() was never
called, so even monitoring the component state using JMX will not notice
anything wrong. This situation arises from user error, but there is
potential for other errors leaving a zombie component. I think it would be
more user friendly to be able to recover from such errors.

I'm thinking of adding a StatusPollable interface that EventDrivenSources
can optionally implement (because we can't change the existing interface
without a version change). If it is implemented, the EventDrivenSourceRunner
would schedule a regular poll to check the source's state. Upon failure it
could call stop() to signal that the source broke. With autoRestartPolicy,
the source would then get restarted by its supervisor.
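
A minimal sketch of what I have in mind, to make the proposal easier to discuss. Nothing here exists in Flume yet: StatusPollable and its isHealthy() method, the StoppableSource stand-in, StatusPoller and FragileSource are all hypothetical names for this example, and the real runner would work against the actual Source interface.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical optional interface a source could implement to report its health.
interface StatusPollable {
    boolean isHealthy();
}

// Minimal stand-in for a source; only the method this sketch needs.
interface StoppableSource {
    void stop();
}

// Sketch of the periodic check a runner could schedule for pollable sources.
class StatusPoller {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    // One health check: calls stop() if the source is pollable and unhealthy.
    // Returns true if stop() was called. Sources without the interface are left alone.
    static boolean pollOnce(StoppableSource source) {
        if (source instanceof StatusPollable
                && !((StatusPollable) source).isHealthy()) {
            source.stop();
            return true;
        }
        return false;
    }

    // Repeat the check at a fixed delay until shutdown() is called.
    void start(StoppableSource source, long periodMillis) {
        scheduler.scheduleWithFixedDelay(
            () -> pollOnce(source), periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}

// Example source whose worker thread can die without stop() ever being called.
class FragileSource implements StoppableSource, StatusPollable {
    private final Thread worker;
    private volatile boolean stopped = false;

    FragileSource(Runnable work) {
        this.worker = new Thread(work, "fragile-source-worker");
    }

    void start() { worker.start(); }

    // Healthy only while the worker thread is running.
    @Override public boolean isHealthy() { return worker.isAlive(); }

    @Override public void stop() { stopped = true; }

    boolean isStopped() { return stopped; }

    // Helper for demonstrations: block until the worker thread has exited.
    void awaitWorkerExit() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The poller only calls stop(); the restart itself would be left to the supervisor, as described above. Note that this does add one scheduler thread per runner.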

Would appreciate any opinions before I put together a patch/post an issue.


