Workers not running

Alex Coventry Wed, 15 Jan 2014 12:18:55 -0800

Thanks, Jon.  Yes, I have topology.debug set to true in the config map
passed to StormSubmitter/submitTopology, and I've changed all the "INFO"
settings to "DEBUG" in logback/cluster.xml.


I think you might be onto something with checking whether the JVMs are
running.  I noticed in the supervisor log: "Error when trying to kill 8060.
Process is probably already dead."  Nathan Marz says in this thread
<https://groups.google.com/forum/#!topic/storm-user/UxbMLIfEV1k> that it
suggests native dependencies are not installed correctly.  I'm looking
for ways to test whether this is the case, now.  Any suggestions are
welcome.  Is there a way to kick off a worker thread in the storm repl, to
get some more info about how it's failing?
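
One way to follow up on the supervisor message quoted above is to pull the
PIDs out of the supervisor log and check whether those processes still
exist.  What follows is only a minimal sketch in Python, assuming the log
contains the wording "Error when trying to kill <pid>" and that it runs on
a POSIX system.  The function names are hypothetical and not part of Storm.

```python
import os
import re

# Assumption: the supervisor log uses the wording quoted in this message.
KILL_ERROR = re.compile(r"Error when trying to kill (\d+)")


def extract_killed_pids(log_text):
    """Return the PIDs named in 'Error when trying to kill' lines, in order."""
    return [int(m.group(1)) for m in KILL_ERROR.finditer(log_text)]


def pid_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Reading the supervisor log into a string, passing it to
extract_killed_pids, and calling pid_alive on each result shows whether any
of the worker PIDs the supervisor tried to kill are still present.  It does
not explain why a worker exited, only whether it is there.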

Best regards,
Alex

On Wed, Jan 15, 2014 at 3:04 PM, Jon Logan <[email protected]> wrote:

> I haven't looked at your logs, but just to clarify, the message you
> describe is probably a debug message. Debug messages say things like the
> emitting and receiving of tuples. I'm not sure if it's enabled by default
> in local mode, but if you want those messages in normal mode, you need to
> set topology.debug: true in the storm.yaml configuration file.
>
>
> You can also look to see if the worker JVMs are actually running on other
> machines. They should be in separate processes.
>
>
> On Wed, Jan 15, 2014 at 1:57 PM, Alex Coventry <[email protected]> wrote:
>
>> Thanks for the confirmation.  It appears that my topology is not running.
>> When I run the word-count topology in local mode, I see messages like
>> "Emitting: 3 default ["dog"]".  I assume I should be seeing messages like
>> that in the logs when I run the topology in distributed mode, but I'm not.
>> I'm also not seeing any exceptions.  Log messages related to communication
>> between nimbus, workers and supervisor appear for about three minutes, then
>> everything related to the topology seems to shut down, including the UI's
>> report that it's running.
>>
>> I'm pretty stumped by this, and I'd be grateful for any help.  I took a
>> copy of the logs for nimbus, the supervisor, and the workers, for the
>> lifecycle of the topology I just described:
>>
>>    https://dl.dropboxusercontent.com/u/6414090/logs.tgz
>>
>> I'd really appreciate it if someone with more experience with storm could
>> take a look at them and tell me where I'm going wrong.
>>
>> I'm using storm-0.9.0, storm-starter from git master.  ZK and nimbus are
>> running on the same machine, the supervisor/workers are running on another
>> machine.  Judging from the logs, they are all able to see each other.
>>
>> Best regards,
>> Alex
>>
>>
>>
>> On Wed, Jan 15, 2014 at 7:26 AM, Jason Trost <[email protected]> wrote:
>>
>>> This should show up in one or more worker logs in
>>> $STORM_HOME/logs/worker-*.log.
>>>
>>>
>>> On Tue, Jan 14, 2014 at 10:25 PM, Alex Coventry <[email protected]> wrote:
>>>
>>>> If I explicitly throw an exception in the storm-starter clojure
>>>> example, as shown in the diff below, it shows up nicely when I run in local
>>>> mode with
>>>>
>>>>   coventry@samjoko:~/storm-starter$ lein run -m storm.starter.clj.word-count
>>>>
>>>> However, when I run it on a storm cluster with a command like
>>>>
>>>>   coventry@samjoko:~/storm-starter$
>>>> ~/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar
>>>> storm.starter.clj.word_count count
>>>>
>>>> I am not sure where these errors are being reported.  Are they logged
>>>> anywhere, and if not, can I get them to be?
>>>>
>>>> Best regards,
>>>> Alex
>>>>
>>>> index ce2725d..c82fd0f 100644
>>>> --- a/src/clj/storm/starter/clj/word_count.clj
>>>> +++ b/src/clj/storm/starter/clj/word_count.clj
>>>> @@ -11,6 +11,7 @@
>>>>                     "an apple a day keeps the doctor away"]]
>>>>      (spout
>>>>       (nextTuple []
>>>> +       (throw (Exception. "Where does this show up?"))
>>>>         (Thread/sleep 100)
>>>>         (emit-spout! collector [(rand-nth sentences)])
>>>>         )
>>>>
>>>>
>>>
>>
>
