It's fair to say that's true, but the Thrift monitoring in CDH relays no
errors and the Thrift server itself is not outputting any errors.
I do notice that after 30 minutes of Storm's "thrift connection timeout"
the HBase Thrift server slowly increases its open file count to dangerously
high values.
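
(For reference, I'm watching that open file count with something along these
lines; the PID placeholder and the grep pattern are from my setup, so adjust
as needed:)

    # find the hbase thrift server PID, then count its open files
    ps aux | grep [T]hriftServer
    lsof -p <thrift_pid> | wc -l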

I also noticed that after killing the topology and all of Storm's processes
listed by jps (nimbus, ui and worker), the memory footprint on the server was
not released.  I did not list out what was consuming the Storm server's memory
before restarting the machine.
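
(If it happens again, something as simple as this should show what is holding
the memory before I restart:)

    # top memory consumers by resident set size
    ps aux --sort=-rss | head -n 15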

So far I have restarted HBase, restarted the Storm server, cleared out the
working directory, restarted Storm, and redeployed the topology, and it seems
to be working again.
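
(Roughly the sequence I ran, with the paths, topology name and main class
being specific to my setup:)

    storm kill <topology_name>            # let the topology shut down fully
    # stop the storm daemons (nimbus, ui, supervisor)
    rm -rf <storm.local.dir>/*            # clear storm's working directory
    # restart the hbase thrift server (via Cloudera Manager in my case)
    # start the storm daemons again, then redeploy:
    storm jar <topology_jar> <main_class>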



On Tue, May 12, 2015 at 2:25 PM, Jeffery Maass <[email protected]> wrote:

> The supervisor log you posted covers multiple different workers...  This
> looks expected to me.  If an unhandled exception occurs in a worker, it
> will die.  Then either nimbus or the supervisor will cease to see its
> heartbeats; the supervisor will attempt to kill it, and then nimbus will ask
> a supervisor to start a new worker.
>
> The most relevant logs are in the worker log.  I'm betting the problem is
> connected to "thrift connection timeout".
>
> Sorry I couldn't be of more help.
>
>


-- 
Abraham Tom
Data Architect - RippleLabs.com
