Hi guys,

While testing Hadoop on Mesos, we ran into a master crash this morning.
We are running the code from trunk, which I pulled 2 weeks ago.

I've checked the master's log and there is nothing strange in it. The
master did not produce an error log either.

I have another question, about Mesos failover.
With a single-master configuration, when the master crashes I restart it
(same IP and port), but none of the slaves can re-register with the new
master. What is the purpose of this design?

Thanks.


The last part of the log looks like this:
Production: [qa@hd1dz] ~/mesos-log$ tail
mesos-master.hd1dz.prod.mediav.com.root.log.INFO.20130510-142017.27386
I0522 11:04:04.255683 27392 master.cpp:1498] Processing reply for offer
201305101420-252063498-5050-27386-194813 on slave
201305101420-252063498-5050-27386-5 (hd9dz.prod.mediav.com) for framework
201305101420-252063498-5050-27386-0056
I0522 11:04:04.255748 27393 hierarchical_allocator_process.hpp:497]
Framework 201305101420-252063498-5050-27386-0056 filtered slave
201305101420-252063498-5050-27386-3 for 5.000000000000000secs
I0522 11:04:04.255822 27392 master.cpp:1498] Processing reply for offer
201305101420-252063498-5050-27386-194814 on slave
201305101420-252063498-5050-27386-1 (hd3dz.prod.mediav.com) for framework
201305101420-252063498-5050-27386-0056
I0522 11:04:04.255862 27393 hierarchical_allocator_process.hpp:497]
Framework 201305101420-252063498-5050-27386-0056 filtered slave
201305101420-252063498-5050-27386-5 for 5.000000000000000secs
I0522 11:04:04.255936 27392 master.cpp:1498] Processing reply for offer
201305101420-252063498-5050-27386-194815 on slave
201305101420-252063498-5050-27386-2 (hd5dz.prod.mediav.com) for framework
201305101420-252063498-5050-27386-0056
I0522 11:04:04.255980 27393 hierarchical_allocator_process.hpp:497]
Framework 201305101420-252063498-5050-27386-0056 filtered slave
201305101420-252063498-5050-27386-1 for 5.000000000000000secs
I0522 11:04:04.256060 27392 master.cpp:1498] Processing reply for offer
201305101420-252063498-5050-27386-194816 on slave
201305101420-252063498-5050-27386-0 (hd2dz.prod.mediav.com) for framework
201305101420-252063498-5050-27386-0056
I0522 11:04:04.256098 27393 hierarchical_allocator_process.hpp:497]
Framework 201305101420-252063498-5050-27386-0056 filtered slave
201305101420-252063498-5050-27386-2 for 5.000000000000000secs
I0522 11:04:04.256216 27393 hierarchical_allocator_process.hpp:497]
Framework 201305101420-252063498-5050-27386-0056 filtered slave
201305101420-252063498-5050-27386-0 for 5.000000000000000secs
W0522 11:04:07.555552 27394 master.cpp:82] No whitelist given. Advertising
offers for all slaves
Production: [qa@hd1dz] ~/mesos-log$ tail
mesos-master.hd1dz.prod.mediav.com.root.log.WARNING.20130510-142017.27386
W0522 11:03:22.105432 27394 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:27.152434 27398 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:32.153389 27393 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:37.549747 27389 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:42.550670 27391 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:47.551592 27396 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:52.552641 27399 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:03:57.553750 27392 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:04:02.554628 27397 master.cpp:82] No whitelist given. Advertising
offers for all slaves
W0522 11:04:07.555552 27394 master.cpp:82] No whitelist given. Advertising
offers for all slaves
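
One way to double-check that an excerpt like the one above contains no
error or fatal entries is to filter on the severity letter that starts each
glog line. A minimal sketch in Python, assuming the usual glog prefix (a
severity letter I/W/E/F, then MMDD and a timestamp); the function name and
the sample lines are only illustrative:

```python
import re

# A glog entry starts with a severity letter (I, W, E, F),
# followed by MMDD and HH:MM:SS.microseconds.
GLOG_PREFIX = re.compile(r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6}) ")

def count_severities(lines):
    """Count glog entries per severity; wrapped continuation lines are skipped."""
    counts = {"I": 0, "W": 0, "E": 0, "F": 0}
    for line in lines:
        match = GLOG_PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample taken from the excerpt above, including the mail-wrapped continuations.
sample = [
    "I0522 11:04:04.255683 27392 master.cpp:1498] Processing reply for offer",
    "201305101420-252063498-5050-27386-194813 on slave",
    "W0522 11:04:07.555552 27394 master.cpp:82] No whitelist given. Advertising",
    "offers for all slaves",
]
print(count_severities(sample))
```

A non-zero count for E or F would point at the entries worth reading
first; in the excerpt above both would be zero.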




Guodong
