You should check the log files to see the complete stack trace of the exceptions. That should help to identify the problem.
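
If the logs are large, a small script can pull the complete traces out for you. The following is only a rough Python sketch, not a Storm tool: the function name and the two regular expressions are my own assumptions, and it expects the traces to have the usual Java layout (an exception line followed by "at ..." frames, "Caused by:" lines, or "... N more").

```python
import re

# Lines that continue a Java stack trace: "at pkg.Class.method(...)",
# "Caused by: ...", or "... 12 more".
CONTINUATION = re.compile(r"^\s*(at\s+\S+|Caused by:|\.\.\. \d+ more)")
# A trace starts on a line that names a Java exception or error class.
START = re.compile(r"\b[\w.$]+(Exception|Error)\b")


def extract_stack_traces(lines):
    """Return each stack trace found in `lines` as one multi-line string.

    `lines` is any iterable of log lines (hypothetical helper, for
    illustration only). A candidate is kept only if at least one
    continuation line follows the exception line.
    """
    traces, current = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if current and CONTINUATION.match(line):
            current.append(line)
            continue
        if len(current) > 1:
            traces.append("\n".join(current))
        # Start a new candidate trace if this line names an exception.
        current = [line] if START.search(line) else []
    if len(current) > 1:
        traces.append("\n".join(current))
    return traces
```

To use it, open whichever worker log file you want to inspect, pass the file object to `extract_stack_traces`, and print the returned strings.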
-Matthias

On 05/15/2015 09:52 PM, Hadi Sotudeh wrote:
> I've three VMs.
> Now, I've run the commands you've said.
> ----------------------------------------------------------------------
> nimbus and zookeeper:
> Filesystem  Size   Used  Avail  Use%  Mounted on
> /dev/sda1   69G    5.6G  60G    9%    /
> udev        991M   4.0K  991M   1%    /dev
> tmpfs       400M   760K  399M   1%    /run
> none        5.0M   0     5.0M   0%    /run/lock
> none        1000M  144K  999M   1%    /run/shm
> ----------------------------------------------------------------------
> supervisor1:
> /dev/sda1   69G    4.9G  61G    8%    /
> udev        991M   4.0K  991M   1%    /dev
> tmpfs       400M   752K  399M   1%    /run
> none        5.0M   0     5.0M   0%    /run/lock
> none        1000M  144K  999M   1%    /run/shm
> ----------------------------------------------------------------------
> supervisor2:
> /dev/sda1   69G    4.7G  61G    8%    /
> udev        991M   4.0K  991M   1%    /dev
> tmpfs       400M   752K  399M   1%    /run
> none        5.0M   0     5.0M   0%    /run/lock
> none        1000M  152K  999M   1%    /run/shm
> /dev/sr0    114M   114M  0      100%  /media/XenServer Tools
>
> ----------------------------------------------------------------------
> netstat for the nimbus and zookeeper:
>
>   1 CLOSE_WAIT
>   1 established)
>   1 Foreign
>   5 TIME_WAIT
>  14 LISTEN
>  97 ESTABLISHED
> ----------------------------------------------------------------------
> netstat for the supervisor1:
>   1 CLOSE_WAIT
>   1 established)
>   1 Foreign
>   1 TIME_WAIT
>   3 LAST_ACK
>   7 LISTEN
> 121 ESTABLISHED
> ----------------------------------------------------------------------
> netstat for the supervisor2:
>   1 CLOSE_WAIT
>   1 established)
>   1 ESTABLISHED
>   1 Foreign
>   6 LISTEN
>  12 TIME_WAIT
> ----------------------------------------------------------------------
>
> About the top command: everything is good now. I think this part depends
> on the zookeeper's configuration.
>
> My zookeeper's configuration is:
>
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/var/zookeeper
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> autopurge.purgeInterval=24
> autopurge.snapRetainCount=5
>
> ----------------------------------------------------------------------
>
> I've been monitoring using the Storm UI and I've found that all of the
> supervisors, zookeeper and nimbus are up, and there are errors next to
> some of my spouts/bolts like:
>
> java.lang.RuntimeException: java.io.EOFException
>     at backtype.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:119)
>     at backtype.storm.spout.ShellSpout.nextTuple(ShellSpout.java:68)
>     at backtype
>
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>     at backtype.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:119)
>     at backtype.storm.spout.ShellSpout.nextTuple(ShellSpout.java:68)
>
> java.lang.RuntimeException: java.io.EOFException
>     at backtype.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:119)
>     at backtype.storm.spout.ShellSpout.nextTuple(ShellSpout.java:68)
>     at backtype
>
> java.lang.RuntimeException: java.lang.RuntimeException: java.io.EOFException
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128)
>     at backtype.storm.utils.DisruptorQue
>
> ----------------------------------------------------------------------
> There is no other application on the storm worker
>
> -------------------------
>
> Any help?
>
> Thanks
> Hadi
