Hi Vinod, I will give you more info as soon as I find a way to reproduce it.
Thanks.
Guodong

On Tue, Apr 16, 2013 at 9:44 AM, Vinod Kone <[email protected]> wrote:

> Hi Guodong,
>
> This is a known issue https://issues.apache.org/jira/browse/MESOS-300 ,
> though we were unable to find the root cause so far. If you are able to
> consistently reproduce this behavior, please detail the steps to reproduce
> this on that ticket. That would help us diagnose/fix this.
>
> Thanks,
>
> On Mon, Apr 15, 2013 at 6:39 PM, 王国栋 <[email protected]> wrote:
>
> > hi,
> >
> > When I start 2 slaves to register to the master one by one, and then
> > refresh the master http monitoring page, the master crashes. I restart the
> > master(without restart slave), and it crashes again when I refresh the
> > page(http://localhost:5050).
> > The log of the master is as follow.
> >
> > I0416 09:36:35.325883 4910 main.cpp:116] Build: 2013-04-15 10:31:38 by guodong
> > I0416 09:36:35.326510 4910 main.cpp:117] Starting Mesos master
> > I0416 09:36:35.326918 4927 master.cpp:309] Master started on 127.0.1.1:5050
> > I0416 09:36:35.327046 4927 master.cpp:324] Master ID: 201304160936-16842879-5050-4910
> > W0416 09:36:35.327277 4924 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > I0416 09:36:35.329146 4927 master.cpp:603] Elected as master!
> > I0416 09:36:35.811032 4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:35.811139 4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:35.811259 4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-0 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:35.811786 4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) disconnected
> > I0416 09:36:35.811841 4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:35.811887 4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-0 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:35.812057 4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-0
> > I0416 09:36:36.812571 4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:36.812705 4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:36.812855 4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-1 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:36.813194 4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) disconnected
> > I0416 09:36:36.813256 4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:36.813294 4926 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-1 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:36.813433 4926 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-1
> > I0416 09:36:37.814275 4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:37.814412 4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:37.814467 4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-2 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:37.814831 4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) disconnected
> > I0416 09:36:37.814882 4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:37.814900 4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-2 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:37.815040 4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-2
> > I0416 09:36:38.815996 4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:38.816112 4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:38.816213 4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-3 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:38.816763 4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) disconnected
> > I0416 09:36:38.816838 4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-3 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:38.816999 4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:38.817443 4927 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-3
> > I0416 09:36:39.817735 4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:39.818085 4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:39.818320 4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-4 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:39.818814 4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) disconnected
> > I0416 09:36:39.818878 4927 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-4 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:39.818980 4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:39.819612 4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-4
> > W0416 09:36:40.328641 4925 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > I0416 09:36:40.819702 4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:40.820044 4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:40.820314 4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-5 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:40.820997 4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) disconnected
> > I0416 09:36:40.821081 4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-5 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:40.821246 4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:40.821862 4927 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-5
> > I0416 09:36:41.821005 4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:41.821271 4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:41.821451 4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-6 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:41.821929 4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) disconnected
> > I0416 09:36:41.822000 4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-6 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:41.822161 4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:41.822906 4925 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-6
> > I0416 09:36:42.822779 4927 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:42.823669 4927 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:42.824539 4927 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-7 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:42.892474 4927 master.cpp:537] Slave 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) disconnected
> > I0416 09:36:42.908231 4927 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:42.892504 4925 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-7 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:42.908664 4925 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-7
> > I0416 09:36:43.824604 4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:43.825044 4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:43.825316 4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-8 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:43.825809 4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) disconnected
> > I0416 09:36:43.825865 4927 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-8 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:43.825969 4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:43.826431 4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-8
> > F0416 09:36:44.198364 4928 process.cpp:1967] Check failed: outgoing.count(s) > 0
> > *** Check failure stack trace: ***
> >     @ 0x7f71eabed14d google::LogMessage::Fail()
> >     @ 0x7f71eabf09df google::LogMessage::SendToLog()
> >     @ 0x7f71eabeff17 google::LogMessage::Flush()
> >     @ 0x7f71eabf0ebd google::LogMessageFatal::~LogMessageFatal()
> >     @ 0x7f71eab03207 process::SocketManager::next()
> >     @ 0x7f71eab0765b process::send_data()
> >     @ 0x7f71eac2fdb1 ev_invoke_pending
> >     @ 0x7f71eac348fd ev_loop
> >     @ 0x7f71eaafc70b process::serve()
> >     @ 0x7f71e8f0fe9a start_thread
> >     @ 0x7f71e8c3ccbd (unknown)
> > Aborted (core dumped)
> >
> > Guodong
