The easiest way to fix these issues is to copy and paste the command found in the 
supervisor's log file.
In your case, it's the line starting with:
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with command: java 
-server -Xmx2048m.....

Copy the whole command and try to run it manually on your supervisor. You will 
most likely receive an error about a configuration option on your command 
line. Once you have found it, simply adjust the storm.yaml configuration file on 
all your supervisors and you should be fine.
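
If the log is long, a small script can pull the launch command out for you. The following is only a sketch: it assumes every log record starts with a timestamp like the ones quoted in this thread, and the function name extract_worker_commands is made up for illustration. Wrapped lines are joined with a single space, so a token that was split mid-word by line wrapping may need to be repaired by hand.

```python
import re

# Assumption: each log record starts with a "YYYY-MM-DD HH:MM:SS" timestamp,
# as in the supervisor log lines quoted in this thread.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")
MARKER = "Launching worker with command:"


def extract_worker_commands(log_text):
    """Return every worker launch command found in the log text, oldest first."""
    commands = []
    current = None  # pieces of the command being collected, or None
    for line in log_text.splitlines():
        if TIMESTAMP.match(line):
            # A new log record ends any command that was being collected.
            if current is not None:
                commands.append(" ".join(current).strip())
                current = None
            if MARKER in line:
                current = [line.split(MARKER, 1)[1].strip()]
        elif current is not None and line.strip():
            # Continuation of a command that was wrapped over several lines.
            current.append(line.strip())
    if current is not None:
        commands.append(" ".join(current).strip())
    return commands
```

Read your supervisor log into a string, pass it to the function, and run the last command it returns from a shell on the supervisor to see the actual error.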

Good luck

From: Chad Harland [mailto:[email protected]]
Sent: February-19-14 12:45 PM
To: [email protected]
Subject: Workers Fail To Start

I'd appreciate any insight you all may be able to provide with an issue I'm 
facing.

I've run this topology in local mode without issue. However, when deployed to 
my cluster (2 supervisors) my workers fail to start.

The worker logs on each node are empty.

The supervisor logs on each node look like this:
2014-02-19 09:51:57 b.s.d.supervisor [INFO] Downloading code for storm id 
opixrs-5-1392825117 from /data/storm/nimbus/stormdist/opixrs-5-1392825117
2014-02-19 09:52:00 b.s.util [INFO] Could not extract resources from 
/data/storm/supervisor/tmp/c1a236df-ddeb-4bc1-824a-de40dc1888fd/stormjar.jar
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Finished downloading code for storm 
id opixrs-5-1392825117 from /data/storm/nimbus/stormdist/opixrs-5-1392825117
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with assignment 
#backtype.storm.daemon.supervisor.LocalAssignment{:storm-id 
"opixrs-5-1392825117", :executors ([5 5] [9 9] [1 1])} for this supervisor 
e1534292-5a48-4306-b52f-fc80812b12ba on port 6703 with id 
539e69da-8eab-4773-b51f-8f70b1bc6222
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with command: java 
-server -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseConcMa
...snip....
/data/storm/supervisor/stormdist/opixrs-5-1392825117/stormjar.jar
...snip....
2014-02-19 09:52:00 b.s.d.supervisor [INFO] 
539e69da-8eab-4773-b51f-8f70b1bc6222 still hasn't started
2014-02-19 09:52:01 b.s.d.supervisor [INFO] 
539e69da-8eab-4773-b51f-8f70b1bc6222 still hasn't started
2

The log shows many more "still hasn't started" lines for the worker, and then 
the supervisor tries again.

The Nimbus log shows successful topology submission, but then complains about:
2014-02-19 09:53:59 b.s.d.nimbus [INFO] Executor opixrs-5-1392825117:[2 2] not 
alive
2014-02-19 09:53:59 b.s.d.nimbus [INFO] Executor opixrs-5-1392825117:[3 3] not 
alive

... and then reassigns the topology slots and tries again.

I don't see my topology jar anywhere on the supervisor nodes.

I didn't include most of the java call in the above log snippet, but the -cp 
option referencing the stormdist directory (between snips) points to a 
directory that doesn't exist.
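
A mismatch like this can be checked mechanically. The sketch below takes the full java command from the supervisor log and lists the classpath entries that are missing on the node where it runs. The function name missing_classpath_entries and its exists parameter are hypothetical, and the sketch assumes the command uses -cp or -classpath with entries separated by the platform's path separator.

```python
import os
import shlex


def missing_classpath_entries(command, exists=os.path.exists):
    """Return the classpath entries of a java command that do not exist on disk."""
    tokens = shlex.split(command)
    missing = []
    for i, token in enumerate(tokens[:-1]):
        if token in ("-cp", "-classpath"):
            for entry in tokens[i + 1].split(os.pathsep):
                # A trailing "*" is a Java wildcard; check its directory instead.
                path = entry[:-1] if entry.endswith("*") else entry
                if path and not exists(path):
                    missing.append(entry)
    return missing
```

Run it on each supervisor with the command copied from that supervisor's log; any path it returns is one the worker JVM would not find at startup.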

My first thought was some kind of permissions issue, but even after updating 
stormdist and the supervisor directories to be wide open I still face the same 
issue. Then I checked hosts files and verified I could access each server 
from/to Nimbus and Zookeeper.

Any ideas? Where should I be looking? Anybody face similar issues in the past?

Thanks again for your help.

-Chad

