The easiest way to fix these issues is to copy and paste the command found in the supervisor's log file. In your case, it's the line starting with: 2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with command: java -server -Xmx2048m.....
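If the supervisor log is long, a small script can pull that command out for you. What follows is only a minimal sketch in Python, not part of Storm: the helper name last_worker_command is made up for illustration, and it assumes the log line contains the text "Launching worker with command: " exactly as in the lines quoted in this thread. You pass the path to your supervisor log yourself, since its location depends on your install.

```python
import sys

# Marker text as it appears in the supervisor log lines quoted in this thread.
MARKER = "Launching worker with command: "


def last_worker_command(log_lines):
    """Return the most recent worker launch command in log_lines, or None.

    Hypothetical helper for illustration; it only does a plain substring
    search and keeps whatever follows the marker on the last matching line.
    """
    command = None
    for line in log_lines:
        idx = line.find(MARKER)
        if idx != -1:
            command = line[idx + len(MARKER):].strip()
    return command


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller; nothing here guesses it.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        cmd = last_worker_command(log_file)
    print(cmd if cmd else "no worker launch command found")
```

Run it with the log path as the only argument and it prints the last launch command it finds, ready to paste into a shell on that supervisor.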
Copy the whole command and try to run it manually on your supervisor. You will most likely receive an error about a configuration option on your command line. Once you find it, simply adjust your storm.yaml configuration file on all your supervisors and you should be fine.

Good luck

From: Chad Harland [mailto:[email protected]]
Sent: February-19-14 12:45 PM
To: [email protected]
Subject: Workers Fail To Start

I'd appreciate any insight you all may be able to provide with an issue I'm facing. I've run this topology in local mode without issue. However, when deployed to my cluster (2 supervisors) my workers fail to start. The worker logs on each node are empty. The supervisor logs on each node look like this:

2014-02-19 09:51:57 b.s.d.supervisor [INFO] Downloading code for storm id opixrs-5-1392825117 from /data/storm/nimbus/stormdist/opixrs-5-1392825117
2014-02-19 09:52:00 b.s.util [INFO] Could not extract resources from /data/storm/supervisor/tmp/c1a236df-ddeb-4bc1-824a-de40dc1888fd/stormjar.jar
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Finished downloading code for storm id opixrs-5-1392825117 from /data/storm/nimbus/stormdist/opixrs-5-1392825117
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with assignment #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "opixrs-5-1392825117", :executors ([5 5] [9 9] [1 1])} for this supervisor e1534292-5a48-4306-b52f-fc80812b12ba on port 6703 with id 539e69da-8eab-4773-b51f-8f70b1bc6222
2014-02-19 09:52:00 b.s.d.supervisor [INFO] Launching worker with command: java -server -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseConcMa ...snip.... /data/storm/supervisor/stormdist/opixrs-5-1392825117/stormjar.jar ...snip....
2014-02-19 09:52:00 b.s.d.supervisor [INFO] 539e69da-8eab-4773-b51f-8f70b1bc6222 still hasn't started
2014-02-19 09:52:01 b.s.d.supervisor [INFO] 539e69da-8eab-4773-b51f-8f70b1bc6222 still hasn't started
2

Many more "still hasn't started" lines follow for the worker, and then it tries again.
The Nimbus log shows successful topology submission, but then complains about:

2014-02-19 09:53:59 b.s.d.nimbus [INFO] Executor opixrs-5-1392825117:[2 2] not alive
2014-02-19 09:53:59 b.s.d.nimbus [INFO] Executor opixrs-5-1392825117:[3 3] not alive
...

and then reassigns the topology slots and tries again.

I don't see my topology jar anywhere on the supervisor nodes. I didn't include most of the java call in the above log snippet, but the -cp option referencing the stormdist directory (between snips) points to a directory that doesn't exist. My first thought was some kind of permissions issue, but even after updating stormdist and the supervisor directories to be wide open I still face the same issue. Then I checked hosts files and verified I could access each server from/to Nimbus and Zookeeper.

Any ideas? Where should I be looking? Anybody face similar issues in the past?

Thanks again for your help.

-Chad
