Well, after investigating the problem, it turned out to be a
configuration issue.

My old Storm tasks never ran correctly because they kept trying to connect
to the wrong ZooKeeper server.

This is how I started Mesos. I use ZooKeeper to store the configuration,
but this ZooKeeper is the embedded one, running standalone on the
master node (192.168.1.11).

nohup sudo /home/ubuntu/mesos/build/bin/mesos-master.sh \
    --work_dir=/var/lib/mesos --zk=zk://0.0.0.0:2181/mesos --quorum=1 \
    --log_dir=/var/log/mesos </dev/null >/dev/null 2>&1 &


I started the slave with the following command:

nohup sudo /home/ubuntu/mesos/build/bin/mesos-slave.sh \
    --master=zk://192.168.123.19:2181/mesos \
    --log_dir=/var/log/mesos </dev/null >/dev/null 2>&1 &


With this setup, everything looked fine.

To set up the Storm framework, I changed storm.yaml in the conf folder:

mesos.master.url: "zk://192.168.123.19:2181/mesos"
storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"

and then ran "storm-mesos nimbus" and "storm ui".

The problem comes from this configuration. For every task, storm-mesos
creates an executor with its own storm-mesos environment by downloading
the full tarball, either over HTTP from Mesosphere or from HDFS. That
tarball includes a configuration file which may or may not be the same as
the one used on the master node. In my case, all executors used the
configuration above. Each newly created executor tried to contact the
ZooKeeper server, but the address it used was still "localhost", which on
a slave means the slave itself. A ZooKeeper server does exist on the
slave, but it is never used by Mesos, so the tasks were marked as LOST.

To make it work in my case, the setup should look like this (I assume I
also need to change the nimbus host to the master node):

mesos.master.url: "zk://192.168.123.19:2181/mesos"
storm.zookeeper.servers:
    - "192.168.123.19"
nimbus.host: "192.168.123.19"


After making this change, everything works fine.
I hope this helps people who run into the same issue.

Meanwhile, in my opinion, the approach of downloading a whole tarball with
a fixed configuration from somewhere should be avoided or improved.

Probably worth discussing.

Thanks.






-Luyi.




On Thu, Sep 18, 2014 at 11:36 AM, Luyi Wang <[email protected]> wrote:

> I attached nimbus.log and supervisor.log for your reference
>
>
> On Wed, Sep 17, 2014 at 5:30 PM, Benjamin Mahler <
> [email protected]> wrote:
>
>> logs
>
>
>
>
> -Luyi.
>
>
>
>
