Hi Eduardo,

This looks like a networking issue. What is your cluster setup like?

Are you running on Amazon EC2? We have seen similar behavior before when
users were running Mesos on EC2. If I remember correctly, the fix was to
use private IP addresses for the master and slaves, instead of "localhost"
or the public IP.
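
In your log the master reports "Master started on 127.0.1.1:5050" and the
slave registers from slave(1)@127.0.1.1:36820, so both seem to be bound to a
loopback address, which would be consistent with the problem above. Here is
a small sketch (not part of Mesos; the function names are my own) that you
could run on each host to check what its hostname resolves to. If it reports
a loopback address, starting the master and slaves with an explicit private
address is worth trying (I believe the flag is --ip, but please check
--help on your build).

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the address is a loopback address (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def resolved_host_address() -> str:
    """Resolve this machine's hostname to an IPv4 address."""
    return socket.gethostbyname(socket.gethostname())


if __name__ == "__main__":
    try:
        address = resolved_host_address()
    except OSError as error:
        # Resolution can fail if the hostname is missing from /etc/hosts and DNS.
        print(f"could not resolve hostname: {error}")
    else:
        if is_loopback(address):
            print(f"{address} is a loopback address; other hosts cannot reach it")
        else:
            print(f"{address} is not a loopback address")
```

On Debian and Ubuntu hosts, a 127.0.1.1 result usually comes from the
hostname entry in /etc/hosts.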

@vinodkone


On Mon, Apr 15, 2013 at 10:13 AM, Eduardo Alfaia <[email protected]>
 wrote:

> Hi Guys,
> I am new to Mesos and I am having some problems when running the Mesos
> launch scripts below. Why does the master remove the slave? I have seen
> something about checkpointing.
>
> MASTER
> root@blockmon1:/opt/mesos-trunk/build/bin# ./mesos-master.sh
> I0415 18:00:47.543422 17720 main.cpp:116] Build: 2013-04-14 23:48:51 by
> root
> I0415 18:00:47.543926 17720 main.cpp:117] Starting Mesos master
> I0415 18:00:47.545109 17720 master.cpp:309] Master started on
> 127.0.1.1:5050
> I0415 18:00:47.545351 17720 master.cpp:324] Master ID:
> 201304151800-16842879-5050-17720
> I0415 18:00:47.545819 17720 master.cpp:603] Elected as master!
> W0415 18:00:47.546039 17737 master.cpp:81] No whitelist given. Advertising
> offers for all slaves
> W0415 18:00:52.547684 17736 master.cpp:81] No whitelist given. Advertising
> offers for all slaves
> W0415 18:00:57.550519 17736 master.cpp:81] No whitelist given. Advertising
> offers for all slaves
>
> se it is not checkpointing!
> I0415 18:01:59.379076 17735 hierarchical_allocator_process.hpp:423] Removed
> slave 201304151800-16842879-5050-17720-28
> I0415 18:02:00.379822 17737 master.cpp:968] Attempting to register slave on
> blockmon2 at slave(1)@127.0.1.1:36820
> I0415 18:02:00.380177 17737 master.cpp:1224] Master now considering a slave
> at blockmon2:36820 as active
> I0415 18:02:00.380561 17737 master.cpp:1862] Adding slave
> 201304151800-16842879-5050-17720-29 at blockmon2 with cpus=1; mem=979;
> ports=[31000-32000]; disk=2801
> I0415 18:02:00.380813 17737 hierarchical_allocator_process.hpp:395] Added
> slave 201304151800-16842879-5050-17720-29 (blockmon2) with cpus=1; mem=979;
> ports=[31000-32000]; disk=2801 (and cpus=1; mem=979; ports=[31000-32000];
> disk=2801 available)
> I0415 18:02:00.381255 17734 master.cpp:537] Slave
> 201304151800-16842879-5050-17720-29(blockmon2) disconnected
> I0415 18:02:00.381474 17734 master.cpp:542] Removing disconnected slave
> 201304151800-16842879-5050-17720-29(blockmon2) because it is not
> checkpointing!
> I0415 18:02:00.381882 17735 hierarchical_allocator_process.hpp:423] Removed
> slave 201304151800-16842879-5050-17720-29
>
> Thanks Guys
>
> --
> MSc Eduardo Costa Alfaia
> PhD Student
> Università degli Studi di Brescia
>



-- Vinod


