Hi Guodong,

This is a known issue (https://issues.apache.org/jira/browse/MESOS-300),
though we have not been able to find the root cause so far. If you can
reproduce this behavior consistently, please add the steps to reproduce
it to that ticket. That would help us diagnose and fix it.
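
In case it helps when writing up the steps, here is a minimal sketch (plain
Python, not part of Mesos) that summarizes a master log like the one quoted
below: it counts how often a slave was added and removed, and collects any
fatal check failures. The function name summarize is made up for
illustration, the message prefixes are taken from your log, and it assumes
the log file itself is not line-wrapped the way the email is.

```python
import re
from collections import Counter

# glog line prefix: severity letter, MMDD, time, thread id, file:line]
GLOG = re.compile(
    r'^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([\w.]+):(\d+)\] (.*)$'
)


def summarize(lines):
    """Count slave add/remove events and collect fatal messages."""
    counts = Counter()
    fatal = []
    for line in lines:
        m = GLOG.match(line.rstrip("\n"))
        if not m:
            # Stack-trace frames and other non-glog lines are skipped.
            continue
        severity = m.group(1)
        message = m.group(7)
        # Prefixes below are the ones seen in the quoted master log.
        if message.startswith("Adding slave"):
            counts["added"] += 1
        elif message.startswith("Removing disconnected slave"):
            counts["removed"] += 1
        if severity == "F":
            fatal.append(message)
    return counts, fatal
```

Passing an open file object for the master log to summarize() returns the
two counters and the list of fatal messages, which may be a compact way to
show the register/disconnect loop on the ticket alongside the full log.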

Thanks,



On Mon, Apr 15, 2013 at 6:39 PM, 王国栋 <[email protected]> wrote:

> Hi,
>
> When I start 2 slaves and register them with the master one by one, and
> then refresh the master's HTTP monitoring page, the master crashes. I
> restart the master (without restarting the slaves), and it crashes again
> when I refresh the page (http://localhost:5050).
> The master's log is as follows.
>
> I0416 09:36:35.325883  4910 main.cpp:116] Build: 2013-04-15 10:31:38 by
> guodong
> I0416 09:36:35.326510  4910 main.cpp:117] Starting Mesos master
> I0416 09:36:35.326918  4927 master.cpp:309] Master started on
> 127.0.1.1:5050
> I0416 09:36:35.327046  4927 master.cpp:324] Master ID:
> 201304160936-16842879-5050-4910
> W0416 09:36:35.327277  4924 master.cpp:81] No whitelist given. Advertising
> offers for all slaves
> I0416 09:36:35.329146  4927 master.cpp:603] Elected as master!
> I0416 09:36:35.811032  4925 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:35.811139  4925 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:35.811259  4925 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-0 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:35.811786  4925 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) disconnected
> I0416 09:36:35.811841  4925 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:35.811887  4924 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-0 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:35.812057  4924 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-0
> I0416 09:36:36.812571  4925 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:36.812705  4925 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:36.812855  4925 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-1 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:36.813194  4925 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) disconnected
> I0416 09:36:36.813256  4925 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:36.813294  4926 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-1 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:36.813433  4926 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-1
> I0416 09:36:37.814275  4925 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:37.814412  4925 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:37.814467  4925 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-2 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:37.814831  4925 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) disconnected
> I0416 09:36:37.814882  4925 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:37.814900  4924 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-2 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:37.815040  4924 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-2
> I0416 09:36:38.815996  4925 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:38.816112  4925 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:38.816213  4925 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-3 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:38.816763  4925 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) disconnected
> I0416 09:36:38.816838  4924 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-3 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:38.816999  4925 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:38.817443  4927 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-3
> I0416 09:36:39.817735  4925 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:39.818085  4925 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:39.818320  4925 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-4 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:39.818814  4925 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) disconnected
> I0416 09:36:39.818878  4927 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-4 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:39.818980  4925 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:39.819612  4924 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-4
> W0416 09:36:40.328641  4925 master.cpp:81] No whitelist given. Advertising
> offers for all slaves
> I0416 09:36:40.819702  4926 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:40.820044  4926 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:40.820314  4926 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-5 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:40.820997  4926 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) disconnected
> I0416 09:36:40.821081  4924 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-5 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:40.821246  4926 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:40.821862  4927 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-5
> I0416 09:36:41.821005  4926 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:41.821271  4926 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:41.821451  4926 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-6 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:41.821929  4926 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) disconnected
> I0416 09:36:41.822000  4924 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-6 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:41.822161  4926 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:41.822906  4925 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-6
> I0416 09:36:42.822779  4927 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:42.823669  4927 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:42.824539  4927 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-7 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:42.892474  4927 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) disconnected
> I0416 09:36:42.908231  4927 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:42.892504  4925 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-7 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:42.908664  4925 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-7
> I0416 09:36:43.824604  4926 master.cpp:968] Attempting to register slave on
> guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> I0416 09:36:43.825044  4926 master.cpp:1224] Master now considering a slave
> at guodong-OptiPlex-990:46056 as active
> I0416 09:36:43.825316  4926 master.cpp:1862] Adding slave
> 201304160936-16842879-5050-4910-8 at guodong-OptiPlex-990 with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611
> I0416 09:36:43.825809  4926 master.cpp:537] Slave
> 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) disconnected
> I0416 09:36:43.825865  4927 hierarchical_allocator_process.hpp:395] Added
> slave 201304160936-16842879-5050-4910-8 (guodong-OptiPlex-990) with cpus=4;
> mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897;
> ports=[31000-32000]; disk=399611 available)
> I0416 09:36:43.825969  4926 master.cpp:542] Removing disconnected slave
> 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) because it is not
> checkpointing!
> I0416 09:36:43.826431  4924 hierarchical_allocator_process.hpp:423] Removed
> slave 201304160936-16842879-5050-4910-8
> F0416 09:36:44.198364  4928 process.cpp:1967] Check failed:
> outgoing.count(s) > 0
> *** Check failure stack trace: ***
>     @     0x7f71eabed14d  google::LogMessage::Fail()
>     @     0x7f71eabf09df  google::LogMessage::SendToLog()
>     @     0x7f71eabeff17  google::LogMessage::Flush()
>     @     0x7f71eabf0ebd  google::LogMessageFatal::~LogMessageFatal()
>     @     0x7f71eab03207  process::SocketManager::next()
>     @     0x7f71eab0765b  process::send_data()
>     @     0x7f71eac2fdb1  ev_invoke_pending
>     @     0x7f71eac348fd  ev_loop
>     @     0x7f71eaafc70b  process::serve()
>     @     0x7f71e8f0fe9a  start_thread
>     @     0x7f71e8c3ccbd  (unknown)
> Aborted (core dumped)
>
>
> Guodong
>
