Hi Vinod,

Thanks for your fast reply. I'm not using EC2, but I am using the server's hostname, for example blockmon1.ing.unibs.it. Could that be the problem?
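One way to check this on each node is to see which address the machine's own hostname resolves to. The master log in the quoted message shows it starting on 127.0.1.1:5050, which is a loopback address. This is only a minimal sketch in Python; the helper names `is_loopback` and `check_hostname` are my own and are not part of Mesos.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the address is in a loopback range (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_hostname(hostname: str) -> str:
    """Resolve a hostname (or IP string) and report whether it is loopback."""
    try:
        address = socket.gethostbyname(hostname)
    except socket.gaierror as exc:
        return f"{hostname}: cannot resolve ({exc})"
    if is_loopback(address):
        # Other machines cannot connect to a loopback address.
        return f"{hostname} -> {address} (loopback)"
    return f"{hostname} -> {address} (non-local address)"


if __name__ == "__main__":
    # Check what this node's own hostname resolves to.
    print(check_hostname(socket.gethostname()))
```

If this reports a loopback address on the master or on a slave, the other nodes would be told to connect to an address that only works locally, which would fit the networking problem described in the quoted reply.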
I'm using 3 nodes (1 Master and 2 Slaves).

Regards

2013/4/15 Vinod Kone <[email protected]>

> Hi Eduardo,
>
> This looks like a networking issue. What is your cluster setup like?
>
> Are you running on Amazon EC2? We have seen similar behavior before when
> users were running Mesos on EC2. If I remember correctly, the fix was to
> use private ip addresses for master and slaves, instead of "localhost" or
> "public ip".
>
> @vinodkone
>
> On Mon, Apr 15, 2013 at 10:13 AM, Eduardo Alfaia <[email protected]> wrote:
>
> > Hi Guys,
> > I am new to Mesos and I am having some problems when running the Mesos
> > launch scripts below. Why does the master remove the slave? I have seen
> > something about checkpointing.
> >
> > MASTER
> > root@blockmon1:/opt/mesos-trunk/build/bin# ./mesos-master.sh
> > I0415 18:00:47.543422 17720 main.cpp:116] Build: 2013-04-14 23:48:51 by root
> > I0415 18:00:47.543926 17720 main.cpp:117] Starting Mesos master
> > I0415 18:00:47.545109 17720 master.cpp:309] Master started on 127.0.1.1:5050
> > I0415 18:00:47.545351 17720 master.cpp:324] Master ID: 201304151800-16842879-5050-17720
> > I0415 18:00:47.545819 17720 master.cpp:603] Elected as master!
> > W0415 18:00:47.546039 17737 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > W0415 18:00:52.547684 17736 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > W0415 18:00:57.550519 17736 master.cpp:81] No whitelist given. Advertising offers for all slaves
> >
> > se it is not checkpointing!
> > I0415 18:01:59.379076 17735 hierarchical_allocator_process.hpp:423] Removed slave 201304151800-16842879-5050-17720-28
> > I0415 18:02:00.379822 17737 master.cpp:968] Attempting to register slave on blockmon2 at slave(1)@127.0.1.1:36820
> > I0415 18:02:00.380177 17737 master.cpp:1224] Master now considering a slave at blockmon2:36820 as active
> > I0415 18:02:00.380561 17737 master.cpp:1862] Adding slave 201304151800-16842879-5050-17720-29 at blockmon2 with cpus=1; mem=979; ports=[31000-32000]; disk=2801
> > I0415 18:02:00.380813 17737 hierarchical_allocator_process.hpp:395] Added slave 201304151800-16842879-5050-17720-29 (blockmon2) with cpus=1; mem=979; ports=[31000-32000]; disk=2801 (and cpus=1; mem=979; ports=[31000-32000]; disk=2801 available)
> > I0415 18:02:00.381255 17734 master.cpp:537] Slave 201304151800-16842879-5050-17720-29(blockmon2) disconnected
> > I0415 18:02:00.381474 17734 master.cpp:542] Removing disconnected slave 201304151800-16842879-5050-17720-29(blockmon2) because it is not checkpointing!
> > I0415 18:02:00.381882 17735 hierarchical_allocator_process.hpp:423] Removed slave 201304151800-16842879-5050-17720-29
> >
> > Thanks Guys
> >
> > --
> > MSc Eduardo Costa Alfaia
> > PhD Student
> > Università degli Studi di Brescia

--
MSc Eduardo Costa Alfaia
PhD Student
Università degli Studi di Brescia
