Hi Vinod,

I will send you more information as soon as I work out the steps to reproduce it.

Thanks.

Guodong


On Tue, Apr 16, 2013 at 9:44 AM, Vinod Kone <[email protected]> wrote:

> Hi Guodong,
>
> This is a known issue (https://issues.apache.org/jira/browse/MESOS-300),
> though we have not been able to find the root cause so far. If you are able
> to reproduce this behavior consistently, please detail the steps to
> reproduce it on that ticket. That would help us diagnose and fix it.
>
> Thanks,
>
>
>
> On Mon, Apr 15, 2013 at 6:39 PM, 王国栋 <[email protected]> wrote:
>
> > Hi,
> >
> > When I start 2 slaves that register with the master one by one, and then
> > refresh the master HTTP monitoring page, the master crashes. I restart
> > the master (without restarting the slaves), and it crashes again when I
> > refresh the page (http://localhost:5050).
> > The master log is as follows.
> >
> > I0416 09:36:35.325883  4910 main.cpp:116] Build: 2013-04-15 10:31:38 by guodong
> > I0416 09:36:35.326510  4910 main.cpp:117] Starting Mesos master
> > I0416 09:36:35.326918  4927 master.cpp:309] Master started on 127.0.1.1:5050
> > I0416 09:36:35.327046  4927 master.cpp:324] Master ID: 201304160936-16842879-5050-4910
> > W0416 09:36:35.327277  4924 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > I0416 09:36:35.329146  4927 master.cpp:603] Elected as master!
> > I0416 09:36:35.811032  4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:35.811139  4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:35.811259  4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-0 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:35.811786  4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) disconnected
> > I0416 09:36:35.811841  4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:35.811887  4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-0 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:35.812057  4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-0
> > I0416 09:36:36.812571  4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:36.812705  4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:36.812855  4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-1 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:36.813194  4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) disconnected
> > I0416 09:36:36.813256  4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-1(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:36.813294  4926 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-1 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:36.813433  4926 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-1
> > I0416 09:36:37.814275  4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:37.814412  4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:37.814467  4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-2 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:37.814831  4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) disconnected
> > I0416 09:36:37.814882  4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-2(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:37.814900  4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-2 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:37.815040  4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-2
> > I0416 09:36:38.815996  4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:38.816112  4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:38.816213  4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-3 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:38.816763  4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) disconnected
> > I0416 09:36:38.816838  4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-3 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:38.816999  4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-3(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:38.817443  4927 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-3
> > I0416 09:36:39.817735  4925 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:39.818085  4925 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:39.818320  4925 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-4 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:39.818814  4925 master.cpp:537] Slave 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) disconnected
> > I0416 09:36:39.818878  4927 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-4 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:39.818980  4925 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-4(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:39.819612  4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-4
> > W0416 09:36:40.328641  4925 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > I0416 09:36:40.819702  4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:40.820044  4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:40.820314  4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-5 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:40.820997  4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) disconnected
> > I0416 09:36:40.821081  4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-5 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:40.821246  4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-5(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:40.821862  4927 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-5
> > I0416 09:36:41.821005  4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:41.821271  4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:41.821451  4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-6 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:41.821929  4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) disconnected
> > I0416 09:36:41.822000  4924 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-6 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:41.822161  4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-6(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:41.822906  4925 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-6
> > I0416 09:36:42.822779  4927 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:42.823669  4927 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:42.824539  4927 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-7 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:42.892474  4927 master.cpp:537] Slave 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) disconnected
> > I0416 09:36:42.908231  4927 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-7(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:42.892504  4925 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-7 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:42.908664  4925 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-7
> > I0416 09:36:43.824604  4926 master.cpp:968] Attempting to register slave on guodong-OptiPlex-990 at slave(1)@127.0.1.1:46056
> > I0416 09:36:43.825044  4926 master.cpp:1224] Master now considering a slave at guodong-OptiPlex-990:46056 as active
> > I0416 09:36:43.825316  4926 master.cpp:1862] Adding slave 201304160936-16842879-5050-4910-8 at guodong-OptiPlex-990 with cpus=4; mem=10897; ports=[31000-32000]; disk=399611
> > I0416 09:36:43.825809  4926 master.cpp:537] Slave 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) disconnected
> > I0416 09:36:43.825865  4927 hierarchical_allocator_process.hpp:395] Added slave 201304160936-16842879-5050-4910-8 (guodong-OptiPlex-990) with cpus=4; mem=10897; ports=[31000-32000]; disk=399611 (and cpus=4; mem=10897; ports=[31000-32000]; disk=399611 available)
> > I0416 09:36:43.825969  4926 master.cpp:542] Removing disconnected slave 201304160936-16842879-5050-4910-8(guodong-OptiPlex-990) because it is not checkpointing!
> > I0416 09:36:43.826431  4924 hierarchical_allocator_process.hpp:423] Removed slave 201304160936-16842879-5050-4910-8
> > F0416 09:36:44.198364  4928 process.cpp:1967] Check failed: outgoing.count(s) > 0
> > *** Check failure stack trace: ***
> >     @     0x7f71eabed14d  google::LogMessage::Fail()
> >     @     0x7f71eabf09df  google::LogMessage::SendToLog()
> >     @     0x7f71eabeff17  google::LogMessage::Flush()
> >     @     0x7f71eabf0ebd  google::LogMessageFatal::~LogMessageFatal()
> >     @     0x7f71eab03207  process::SocketManager::next()
> >     @     0x7f71eab0765b  process::send_data()
> >     @     0x7f71eac2fdb1  ev_invoke_pending
> >     @     0x7f71eac348fd  ev_loop
> >     @     0x7f71eaafc70b  process::serve()
> >     @     0x7f71e8f0fe9a  start_thread
> >     @     0x7f71e8c3ccbd  (unknown)
> > Aborted (core dumped)
> >
> >
> > Guodong
> >
>
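When writing up reproduction steps for the ticket, it may help to condense the quoted master log into a few counts: how many times a slave was added, how many times it was removed as disconnected, and which fatal check fired. The following is only a minimal sketch in Python, not part of Mesos. The helper names (`unquote`, `join_records`, `summarize`) and the `SAMPLE` excerpt are assumptions made for illustration, and the regular expression assumes the glog line prefix seen in the log above (severity letter, MMDD, time, thread ID, file:line). It strips the "> " quote markers, rejoins records that the mail client wrapped, and ignores the stack trace that follows a fatal check.

```python
import re

# Assumed glog prefix, as seen in the quoted log:
#   <severity><MMDD> <HH:MM:SS.uuuuuu> <thread id> <file>:<line>] <message>
GLOG = re.compile(
    r"^(?P<sev>[IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\s+\d+ "
    r"(?P<src>[\w.]+):(?P<line>\d+)\]\s*(?P<msg>.*)$"
)


def unquote(line):
    """Strip leading email quote markers such as '> > '."""
    return re.sub(r"^(?:>\s?)+", "", line.rstrip("\n"))


def join_records(lines):
    """Rejoin wrapped log lines into one string per log record.

    A line that starts with a glog prefix begins a new record; any other
    line continues the previous one. Everything after a '***' stack trace
    banner is dropped until the next glog-prefixed line.
    """
    records = []
    in_trace = False
    for raw in lines:
        text = unquote(raw).strip()
        if not text:
            continue
        if GLOG.match(text):
            records.append(text)
            in_trace = False
        elif text.startswith("***"):
            in_trace = True
        elif records and not in_trace:
            records[-1] += " " + text
    return records


def summarize(lines):
    """Count slave add/remove events and collect fatal messages."""
    added = removed = 0
    fatal = []
    for record in join_records(lines):
        match = GLOG.match(record)
        msg = match.group("msg")
        if msg.startswith("Adding slave"):
            added += 1
        elif msg.startswith("Removing disconnected slave"):
            removed += 1
        if match.group("sev") == "F":
            fatal.append(msg)
    return {"added": added, "removed": removed, "fatal": fatal}


# A short excerpt in the wrapped, quoted form used in this thread.
SAMPLE = [
    "> > I0416 09:36:35.811259  4925 master.cpp:1862] Adding slave",
    "> > 201304160936-16842879-5050-4910-0 at guodong-OptiPlex-990 with cpus=4;",
    "> > mem=10897; ports=[31000-32000]; disk=399611",
    "> > I0416 09:36:35.811841  4925 master.cpp:542] Removing disconnected slave",
    "> > 201304160936-16842879-5050-4910-0(guodong-OptiPlex-990) because it is not",
    "> > checkpointing!",
    "> > I0416 09:36:35.812057  4924 hierarchical_allocator_process.hpp:423]",
    "> Removed",
    "> > slave 201304160936-16842879-5050-4910-0",
    "> > F0416 09:36:44.198364  4928 process.cpp:1967] Check failed:",
    "> > outgoing.count(s) > 0",
    "> > *** Check failure stack trace: ***",
    "> >     @     0x7f71eabed14d  google::LogMessage::Fail()",
    "> > Aborted (core dumped)",
]

print(summarize(SAMPLE))
```

To run it on a full log, pass the lines of a saved copy instead of `SAMPLE`, for example `summarize(open(path))` where `path` is wherever the log was saved. The counts make the repeated register/disconnect/remove cycle easy to state in a ticket comment.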
