> On June 12, 2014, 6:09 p.m., Ben Mahler wrote:
> > I think the subject is a bit off, should say "Reregister", not "Register", 
> > right?
> > 
> > Did you run this with repetition to see if it is flaky still?
> > 
> > $ ./bin/mesos-tests.sh 
> > --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" 
> > --gtest_repeat=-1 --gtest_break_on_failure --verbose

Thanks for the advice. I ran:

$ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose

On iteration 13454 it hit a new error. It looks like the master failed to 
start: the check at cluster.hpp:427 ("Failed to wait for _recover") aborted 
the test.


Repeating all tests (iteration 13454) . . .

Note: Google Test filter = SlaveTest.TerminatingSlaveDoesNotReregister-CpuIsolatorTest/1.UserCpuUsage:CpuIsolatorTest/1.SystemCpuUsage:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota:MemIsolatorTest/0.MemUsage:MemIsolatorTest/1.MemUsage:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:ContainerizerTest.ROOT_CGROUPS_BalloonFramework:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Enabled:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Subsystems:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Mounted:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Get:CgroupsAnyHierarchyTest.ROOT_CGROUPS_NestedCgroups:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Tasks:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Read:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Cfs_Big_Quota:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Busy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_SubsystemsHierarchy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_MountedSubsystems:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_CreateRemove:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen:CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy:CgroupsAnyHierarchyWithCpuAcctMemoryTest.ROOT_CGROUPS_Stat:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Freeze:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Kill:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Destroy:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_AssignThreads:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3:
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveTest
[ RUN      ] SlaveTest.TerminatingSlaveDoesNotReregister
Using temporary directory 
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V'
I0612 19:03:17.706805  2910 leveldb.cpp:176] Opened db in 15.704031ms
I0612 19:03:17.712888  2910 leveldb.cpp:183] Compacted db in 6.057101ms
I0612 19:03:17.712910  2910 leveldb.cpp:198] Created db iterator in 2075ns
I0612 19:03:17.712920  2910 leveldb.cpp:204] Seeked to beginning of db in 365ns
I0612 19:03:17.712929  2910 leveldb.cpp:273] Iterated through 0 keys in the db 
in 96ns
I0612 19:03:17.712939  2910 replica.cpp:741] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0612 19:03:17.713034  2933 recover.cpp:425] Starting replica recovery
I0612 19:03:17.713165  2925 recover.cpp:451] Replica is in EMPTY status
I0612 19:03:17.713366  2925 replica.cpp:638] Replica in EMPTY status received a 
broadcasted recover request
I0612 19:03:17.713471  2924 master.cpp:280] Master 
20140612-190317-3823062160-44846-2910 (chimney.mesosphere.io) started on 
144.76.223.227:44846
I0612 19:03:17.713497  2924 master.cpp:317] Master only allowing authenticated 
frameworks to register
I0612 19:03:17.713507  2924 master.cpp:322] Master only allowing authenticated 
slaves to register
I0612 19:03:17.713515  2924 credentials.hpp:35] Loading credentials for 
authentication from 
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V/credentials'
I0612 19:03:17.713517  2933 recover.cpp:188] Received a recover response from a 
replica in EMPTY status
I0612 19:03:17.713564  2924 master.cpp:348] Authorization enabled
I0612 19:03:17.713625  2928 recover.cpp:542] Updating replica status to STARTING
I0612 19:03:17.713819  2933 master.cpp:961] The newly elected leader is 
[email protected]:44846 with id 20140612-190317-3823062160-44846-2910
I0612 19:03:17.719408  2934 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 5.73482ms
I0612 19:03:32.107343  2933 master.cpp:974] Elected as the leading master!
I0612 19:03:32.107364  2934 replica.cpp:320] Persisted replica status to 
STARTING
F0612 19:03:27.714102  2910 cluster.hpp:427] Failed to wait for _recover
*** Check failure stack trace: ***
I0612 19:03:32.107374  2933 master.cpp:792] Recovering from registrar
I0612 19:03:32.107522  2934 recover.cpp:451] Replica is in STARTING status
I0612 19:03:32.107746  2929 registrar.cpp:313] Recovering registrar
I0612 19:03:32.108326  2925 replica.cpp:638] Replica in STARTING status 
received a broadcasted recover request
I0612 19:03:32.108497  2931 recover.cpp:188] Received a recover response from a 
replica in STARTING status
I0612 19:03:32.108778  2929 recover.cpp:542] Updating replica status to VOTING
    @     0x7f4c0cc3dc3d  google::LogMessage::Fail()
    @     0x7f4c0cc3fa7d  google::LogMessage::SendToLog()
    @     0x7f4c0cc3d82c  google::LogMessage::Flush()
    @     0x7f4c0cc40379  google::LogMessageFatal::~LogMessageFatal()
    @           0x73b9db  mesos::internal::tests::Cluster::Masters::start()
    @           0x736885  mesos::internal::tests::MesosTest::StartMaster()
    @           0x826fbf  
SlaveTest_TerminatingSlaveDoesNotReregister_Test::TestBody()
    @           0x8cfbb3  
testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8c8e87  testing::Test::Run()
    @           0x8c8f2e  testing::TestInfo::Run()
    @           0x8c9035  testing::TestCase::Run()
    @           0x8c92d8  testing::internal::UnitTestImpl::RunAllTests()
I0612 19:03:32.117660  2932 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 8.736907ms
I0612 19:03:32.117678  2932 replica.cpp:320] Persisted replica status to VOTING
I0612 19:03:32.117710  2931 recover.cpp:556] Successfully joined the Paxos group
    @           0x8c9577  testing::UnitTest::Run()
I0612 19:03:32.117769  2931 recover.cpp:440] Recover process terminated
    @           0x48b01d  main
I0612 19:03:32.117884  2928 log.cpp:656] Attempting to start the writer
    @     0x7f4c0af73de5  (unknown)
I0612 19:03:32.118140  2929 replica.cpp:474] Replica received implicit promise 
request with proposal 1
    @           0x498144  (unknown)
Aborted
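
The timestamps in the output above suggest a stall rather than an immediate 
failure: the fatal line is stamped 19:03:27.714102, about 10 seconds after the 
master started at 19:03:17.713, and no glog line is stamped between 
19:03:17.719408 and that failure. A minimal sketch for spotting such gaps in a 
saved log, assuming every log line of interest starts with the glog prefix 
seen above (severity letter, MMDD, HH:MM:SS.microseconds). The function names 
are hypothetical and not part of Mesos or glog:

```python
import re
from datetime import datetime

# glog prefix as it appears above, e.g. "I0612 19:03:17.706805  2910 ...".
GLOG_PREFIX = re.compile(
    r"^[IWEF](\d{2})(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{6})\s"
)

# The glog prefix carries no year. Only differences between timestamps are
# used, so a placeholder year is enough (assumption: the log does not span
# a year boundary).
PLACEHOLDER_YEAR = 2000


def parse_timestamps(lines):
    """Return the timestamps of all lines that start with a glog prefix."""
    stamps = []
    for line in lines:
        match = GLOG_PREFIX.match(line)
        if match:
            month, day, hour, minute, second, micro = map(int, match.groups())
            stamps.append(
                datetime(PLACEHOLDER_YEAR, month, day, hour, minute, second, micro)
            )
    return stamps


def largest_gap(lines):
    """Return (seconds, before, after) for the widest gap, or None.

    Timestamps are sorted first because output from several threads can be
    interleaved out of order, as it is in the log above.
    """
    stamps = sorted(parse_timestamps(lines))
    if len(stamps) < 2:
        return None
    gaps = [((b - a).total_seconds(), a, b) for a, b in zip(stamps, stamps[1:])]
    return max(gaps, key=lambda gap: gap[0])
```

Lines without a glog prefix, such as the stack trace frames and wrapped 
continuation lines, are skipped, so the function can be fed the raw test 
output line by line.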


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------


On June 12, 2014, 7:15 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22472/
> -----------------------------------------------------------
> 
> (Updated June 12, 2014, 7:15 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Dominic Hamon, and Vinod Kone.
> 
> 
> Bugs: MESOS-1460
>     https://issues.apache.org/jira/browse/MESOS-1460
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Ignored subsequent status updates.
> Muted warnings by catching mock calls.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22472/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>
