> On June 12, 2014, 6:09 p.m., Ben Mahler wrote:
> > I think the subject is a bit off: it should say "Reregister", not "Register",
> > right?
> > 
> > Did you run this with repetition to see if it is flaky still?
> > 
> > $ ./bin/mesos-tests.sh \
> >     --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" \
> >     --gtest_repeat=-1 --gtest_break_on_failure --verbose
> 
> Yifan Gu wrote:
>     Thanks for the cool advice. I ran
>     $ ./bin/mesos-tests.sh \
>         --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" \
>         --gtest_repeat=-1 --gtest_break_on_failure --verbose
>     
>     On iteration 13454 it hit a new error; it looks like the master failed
>     to start.
>     
>     
>     Repeating all tests (iteration 13454) . . .
>     
>     Note: Google Test filter = 
> SlaveTest.TerminatingSlaveDoesNotReregister-CpuIsolatorTest/1.UserCpuUsage:CpuIsolatorTest/1.SystemCpuUsage:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota:MemIsolatorTest/0.MemUsage:MemIsolatorTest/1.MemUsage:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:ContainerizerTest.ROOT_CGROUPS_BalloonFramework:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Enabled:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Subsystems:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Mounted:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Get:CgroupsAnyHierarchyTest.ROOT_CGROUPS_NestedCgroups:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Tasks:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Read:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Cfs_Big_Quota:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Busy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_SubsystemsHierarchy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_MountedSubsystems:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_CreateRemove:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen:CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy:CgroupsAnyHierarchyWithCpuAcctMemoryTest.ROOT_CGROUPS_Stat:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Freeze:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Kill:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Destroy:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_AssignThreads:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3:
>     [==========] Running 1 test from 1 test case.
>     [----------] Global test environment set-up.
>     [----------] 1 test from SlaveTest
>     [ RUN      ] SlaveTest.TerminatingSlaveDoesNotReregister
>     Using temporary directory 
> '/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V'
>     I0612 19:03:17.706805  2910 leveldb.cpp:176] Opened db in 15.704031ms
>     I0612 19:03:17.712888  2910 leveldb.cpp:183] Compacted db in 6.057101ms
>     I0612 19:03:17.712910  2910 leveldb.cpp:198] Created db iterator in 2075ns
>     I0612 19:03:17.712920  2910 leveldb.cpp:204] Seeked to beginning of db in 
> 365ns
>     I0612 19:03:17.712929  2910 leveldb.cpp:273] Iterated through 0 keys in 
> the db in 96ns
>     I0612 19:03:17.712939  2910 replica.cpp:741] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
>     I0612 19:03:17.713034  2933 recover.cpp:425] Starting replica recovery
>     I0612 19:03:17.713165  2925 recover.cpp:451] Replica is in EMPTY status
>     I0612 19:03:17.713366  2925 replica.cpp:638] Replica in EMPTY status 
> received a broadcasted recover request
>     I0612 19:03:17.713471  2924 master.cpp:280] Master 
> 20140612-190317-3823062160-44846-2910 (chimney.mesosphere.io) started on 
> 144.76.223.227:44846
>     I0612 19:03:17.713497  2924 master.cpp:317] Master only allowing 
> authenticated frameworks to register
>     I0612 19:03:17.713507  2924 master.cpp:322] Master only allowing 
> authenticated slaves to register
>     I0612 19:03:17.713515  2924 credentials.hpp:35] Loading credentials for 
> authentication from 
> '/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V/credentials'
>     I0612 19:03:17.713517  2933 recover.cpp:188] Received a recover response 
> from a replica in EMPTY status
>     I0612 19:03:17.713564  2924 master.cpp:348] Authorization enabled
>     I0612 19:03:17.713625  2928 recover.cpp:542] Updating replica status to 
> STARTING
>     I0612 19:03:17.713819  2933 master.cpp:961] The newly elected leader is 
> master@144.76.223.227:44846 with id 20140612-190317-3823062160-44846-2910
>     I0612 19:03:17.719408  2934 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 5.73482ms
>     I0612 19:03:32.107343  2933 master.cpp:974] Elected as the leading master!
>     I0612 19:03:32.107364  2934 replica.cpp:320] Persisted replica status to 
> STARTING
>     F0612 19:03:27.714102  2910 cluster.hpp:427] Failed to wait for _recover
>     *** Check failure stack trace: ***
>     I0612 19:03:32.107374  2933 master.cpp:792] Recovering from registrar
>     I0612 19:03:32.107522  2934 recover.cpp:451] Replica is in STARTING status
>     I0612 19:03:32.107746  2929 registrar.cpp:313] Recovering registrar
>     I0612 19:03:32.108326  2925 replica.cpp:638] Replica in STARTING status 
> received a broadcasted recover request
>     I0612 19:03:32.108497  2931 recover.cpp:188] Received a recover response 
> from a replica in STARTING status
>     I0612 19:03:32.108778  2929 recover.cpp:542] Updating replica status to 
> VOTING
>         @     0x7f4c0cc3dc3d  google::LogMessage::Fail()
>         @     0x7f4c0cc3fa7d  google::LogMessage::SendToLog()
>         @     0x7f4c0cc3d82c  google::LogMessage::Flush()
>         @     0x7f4c0cc40379  google::LogMessageFatal::~LogMessageFatal()
>         @           0x73b9db  
> mesos::internal::tests::Cluster::Masters::start()
>         @           0x736885  mesos::internal::tests::MesosTest::StartMaster()
>         @           0x826fbf  
> SlaveTest_TerminatingSlaveDoesNotReregister_Test::TestBody()
>         @           0x8cfbb3  
> testing::internal::HandleExceptionsInMethodIfSupported<>()
>         @           0x8c8e87  testing::Test::Run()
>         @           0x8c8f2e  testing::TestInfo::Run()
>         @           0x8c9035  testing::TestCase::Run()
>         @           0x8c92d8  testing::internal::UnitTestImpl::RunAllTests()
>     I0612 19:03:32.117660  2932 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 8.736907ms
>     I0612 19:03:32.117678  2932 replica.cpp:320] Persisted replica status to 
> VOTING
>     I0612 19:03:32.117710  2931 recover.cpp:556] Successfully joined the 
> Paxos group
>         @           0x8c9577  testing::UnitTest::Run()
>     I0612 19:03:32.117769  2931 recover.cpp:440] Recover process terminated
>         @           0x48b01d  main
>     I0612 19:03:32.117884  2928 log.cpp:656] Attempting to start the writer
>         @     0x7f4c0af73de5  (unknown)
>     I0612 19:03:32.118140  2929 replica.cpp:474] Replica received implicit 
> promise request with proposal 1
>         @           0x498144  (unknown)
>     Aborted
>     
>     
>

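Thanks, Yifan. That looks like an orthogonal issue: strangely, the master took
more than 10 seconds to realize it was elected (the leader was detected at
19:03:17, but "Elected as the leading master!" was only logged at 19:03:32).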
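
For reference, the delay can be read straight off the glog timestamps in the
quoted output. The following is only a minimal sketch: the helper name
glog_time is hypothetical, the log lines are abbreviated copies of the ones
above, and it assumes all three lines were logged on the same day.

```python
from datetime import datetime


def glog_time(line: str) -> datetime:
    """Parse the wall-clock field of a glog line (hypothetical helper).

    glog lines look like 'I0612 19:03:17.713819  2933 master.cpp:961] ...':
    field 0 is severity + MMDD, field 1 is HH:MM:SS.microseconds.
    """
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


# Abbreviated copies of three lines from the quoted test output.
detected = glog_time("I0612 19:03:17.713819  2933 master.cpp:961] The newly elected leader is ...")
failed = glog_time("F0612 19:03:27.714102  2910 cluster.hpp:427] Failed to wait for _recover")
elected = glog_time("I0612 19:03:32.107343  2933 master.cpp:974] Elected as the leading master!")

election_delay = (elected - detected).total_seconds()
failure_offset = (failed - detected).total_seconds()

print(f"leader detected -> elected: {election_delay:.3f}s")  # 14.394s
print(f"leader detected -> fatal:   {failure_offset:.3f}s")  # 10.000s
```

So the fatal "Failed to wait for _recover" fired about 10 seconds after the
leader was detected, roughly 4.4 seconds before the master logged that it had
been elected.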

Will get this committed for you.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------


On June 12, 2014, 7:15 p.m., Yifan Gu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22472/
> -----------------------------------------------------------
> 
> (Updated June 12, 2014, 7:15 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Dominic Hamon, and Vinod Kone.
> 
> 
> Bugs: MESOS-1460
>     https://issues.apache.org/jira/browse/MESOS-1460
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Ignored subsequent status updates.
> Muted warnings by catching mock calls.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_tests.cpp 2c8f183 
> 
> Diff: https://reviews.apache.org/r/22472/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Yifan Gu
> 
>
