[
https://issues.apache.org/jira/browse/MESOS-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506708#comment-16506708
]
Chun-Hung Hsiao edited comment on MESOS-4062 at 6/8/18 11:47 PM:
-----------------------------------------------------------------
Observed another type of flakiness on this test: an 1-hr hang:
[^consoleText.txt]. Also see MESOS-6231.
was (Author: chhsia0):
Observed another type of flakiness on this test. See MESOS-6231.
> SlaveTest.MetricsSlaveLaunchErrors is flaky
> -------------------------------------------
>
> Key: MESOS-4062
> URL: https://issues.apache.org/jira/browse/MESOS-4062
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Vinod Kone
> Priority: Major
> Labels: flaky-test, test
> Attachments: consoleText.txt
>
>
> Observed this in internal CI
> {code}
> DEBUG: [ RUN ] SlaveTest.MetricsSlaveLaunchErrors
> DEBUG: I1126 10:07:30.408905 33823 leveldb.cpp:174] Opened db in 412.443316ms
> DEBUG: I1126 10:07:30.559861 33823 leveldb.cpp:181] Compacted db in
> 150.905148ms
> DEBUG: I1126 10:07:30.559921 33823 leveldb.cpp:196] Created db iterator in
> 6415ns
> DEBUG: I1126 10:07:30.559936 33823 leveldb.cpp:202] Seeked to beginning of db
> in 1135ns
> DEBUG: I1126 10:07:30.559944 33823 leveldb.cpp:271] Iterated through 0 keys
> in the db in 320ns
> DEBUG: I1126 10:07:30.559967 33823 replica.cpp:778] Replica recovered with
> log positions 0 -> 0 with 1 holes and 0 unlearned
> DEBUG: I1126 10:07:30.560356 33841 recover.cpp:447] Starting replica recovery
> DEBUG: I1126 10:07:30.561079 33845 recover.cpp:473] Replica is in EMPTY status
> DEBUG: I1126 10:07:30.561751 33842 replica.cpp:674] Replica in EMPTY status
> received a broadcasted recover request from (8555)@a.b.c.d:33492
> DEBUG: I1126 10:07:30.561903 33847 recover.cpp:193] Received a recover
> response from a replica in EMPTY status
> DEBUG: I1126 10:07:30.562270 33857 recover.cpp:564] Updating replica status
> to STARTING
> DEBUG: I1126 10:07:30.563334 33844 master.cpp:365] Master
> dab6e888-2e40-45e4-9595-051febe7dc16 (x.com) started on a.b.c.d:33492
> DEBUG: I1126 10:07:30.563357 33844 master.cpp:367] Flags at startup:
> --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF"
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5"
> --authorizers="local" --credentials="/tmp/2xHAZi/credentials"
> --framework_sorter="drf" --help="false" --hostname_lookup="true"
> --initialize_driver_logging="true" --log_auto_initialize="true"
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5"
> --quiet="false" --recovery_slave_removal_limit="100%"
> --registry="replicated_log" --registry_fetch_timeout="1mins"
> --registry_store_timeout="25secs" --registry_strict="true"
> --root_submissions="true" --slave_ping_timeout="15secs"
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false"
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/2xHAZi/master"
> --zk_session_timeout="10secs"
> DEBUG: I1126 10:07:30.563581 33844 master.cpp:412] Master only allowing
> authenticated frameworks to register
> DEBUG: I1126 10:07:30.563596 33844 master.cpp:417] Master only allowing
> authenticated slaves to register
> DEBUG: I1126 10:07:30.563604 33844 credentials.hpp:35] Loading credentials
> for authentication from '/tmp/2xHAZi/credentials'
> DEBUG: I1126 10:07:30.564956 33844 master.cpp:456] Using default 'crammd5'
> authenticator
> DEBUG: I1126 10:07:30.565018 33844 master.cpp:493] Authorization enabled
> DEBUG: I1126 10:07:30.565817 33849 master.cpp:1625] The newly elected leader
> is [email protected]:33492 with id dab6e888-2e40-45e4-9595-051febe7dc16
> DEBUG: I1126 10:07:30.565834 33849 master.cpp:1638] Elected as the leading
> master!
> DEBUG: I1126 10:07:30.565841 33849 master.cpp:1383] Recovering from registrar
> DEBUG: I1126 10:07:30.565906 33844 registrar.cpp:307] Recovering registrar
> DEBUG: I1126 10:07:30.676245 33859 leveldb.cpp:304] Persisting metadata (8
> bytes) to leveldb took 113.597908ms
> DEBUG: I1126 10:07:30.676311 33859 replica.cpp:321] Persisted replica status
> to STARTING
> DEBUG: I1126 10:07:30.676458 33851 recover.cpp:473] Replica is in STARTING
> status
> DEBUG: I1126 10:07:30.677732 33854 replica.cpp:674] Replica in STARTING
> status received a broadcasted recover request from (8556)@a.b.c.d:33492
> DEBUG: I1126 10:07:30.677914 33846 recover.cpp:193] Received a recover
> response from a replica in STARTING status
> DEBUG: I1126 10:07:30.678324 33862 recover.cpp:564] Updating replica status
> to VOTING
> DEBUG: I1126 10:07:30.777760 33851 leveldb.cpp:304] Persisting metadata (8
> bytes) to leveldb took 98.972607ms
> DEBUG: I1126 10:07:30.777823 33851 replica.cpp:321] Persisted replica status
> to VOTING
> DEBUG: I1126 10:07:30.777905 33851 recover.cpp:578] Successfully joined the
> Paxos group
> DEBUG: I1126 10:07:30.777976 33851 recover.cpp:462] Recover process terminated
> DEBUG: I1126 10:07:30.778120 33846 log.cpp:659] Attempting to start the writer
> DEBUG: I1126 10:07:30.779547 33856 replica.cpp:494] Replica received implicit
> promise request from (8557)@a.b.c.d:33492 with proposal 1
> DEBUG: I1126 10:07:30.879788 33856 leveldb.cpp:304] Persisting metadata (8
> bytes) to leveldb took 100.197892ms
> DEBUG: I1126 10:07:30.879844 33856 replica.cpp:343] Persisted promised to 1
> DEBUG: I1126 10:07:30.880520 33851 coordinator.cpp:238] Coordinator
> attempting to fill missing positions
> DEBUG: I1126 10:07:30.883019 33864 replica.cpp:389] Replica received explicit
> promise request from (8558)@a.b.c.d:33492 for position 0 with proposal 2
> DEBUG: I1126 10:07:31.017720 33864 leveldb.cpp:341] Persisting action (8
> bytes) to leveldb took 134.647958ms
> DEBUG: I1126 10:07:31.017788 33864 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.019820 33854 replica.cpp:538] Replica received write
> request for position 0 from (8559)@a.b.c.d:33492
> DEBUG: I1126 10:07:31.019881 33854 leveldb.cpp:436] Reading position from
> leveldb took 21951ns
> DEBUG: I1126 10:07:31.125607 33854 leveldb.cpp:341] Persisting action (14
> bytes) to leveldb took 105.695296ms
> DEBUG: I1126 10:07:31.125684 33854 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.126045 33855 replica.cpp:692] Replica received learned
> notice for position 0 from @0.0.0.0:0
> DEBUG: I1126 10:07:31.378890 33855 leveldb.cpp:341] Persisting action (16
> bytes) to leveldb took 252.798894ms
> DEBUG: I1126 10:07:31.378969 33855 replica.cpp:713] Persisted action at 0
> DEBUG: I1126 10:07:31.378989 33855 replica.cpp:698] Replica learned NOP
> action at position 0
> DEBUG: I1126 10:07:31.379464 33856 log.cpp:675] Writer started with ending
> position 0
> DEBUG: I1126 10:07:31.380209 33843 leveldb.cpp:436] Reading position from
> leveldb took 29715ns
> DEBUG: I1126 10:07:31.380609 33842 registrar.cpp:340] Successfully fetched
> the registry (0B) in 814.676992ms
> DEBUG: I1126 10:07:31.380666 33842 registrar.cpp:439] Applied 1 operations in
> 4717ns; attempting to update the 'registry'
> DEBUG: I1126 10:07:31.380982 33842 log.cpp:683] Attempting to append 228
> bytes to the log
> DEBUG: I1126 10:07:31.381053 33842 coordinator.cpp:348] Coordinator
> attempting to write APPEND action at position 1
> DEBUG: I1126 10:07:31.381356 33842 replica.cpp:538] Replica received write
> request for position 1 from (8560)@a.b.c.d:33492
> DEBUG: 2015-11-26
> 10:07:31,923:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> DEBUG: I1126 10:07:32.385574 33842 leveldb.cpp:341] Persisting action (247
> bytes) to leveldb took 1.004179569secs
> DEBUG: I1126 10:07:32.385628 33842 replica.cpp:713] Persisted action at 1
> DEBUG: I1126 10:07:32.385962 33842 replica.cpp:692] Replica received learned
> notice for position 1 from @0.0.0.0:0
> DEBUG: I1126 10:07:32.886503 33842 leveldb.cpp:341] Persisting action (249
> bytes) to leveldb took 500.504268ms
> DEBUG: I1126 10:07:32.886557 33842 replica.cpp:713] Persisted action at 1
> DEBUG: I1126 10:07:32.886567 33842 replica.cpp:698] Replica learned APPEND
> action at position 1
> DEBUG: I1126 10:07:32.887076 33844 registrar.cpp:484] Successfully updated
> the 'registry' in 1.506383104secs
> DEBUG: I1126 10:07:32.887090 33842 log.cpp:702] Attempting to truncate the
> log to 1
> DEBUG: I1126 10:07:32.887162 33844 registrar.cpp:370] Successfully recovered
> registrar
> DEBUG: I1126 10:07:32.887202 33848 coordinator.cpp:348] Coordinator
> attempting to write TRUNCATE action at position 2
> DEBUG: I1126 10:07:32.888511 33861 master.cpp:1435] Recovered 0 slaves from
> the Registry (189B) ; allowing 10mins for slaves to re-register
> DEBUG: I1126 10:07:32.888545 33845 replica.cpp:538] Replica received write
> request for position 2 from (8561)@a.b.c.d:33492
> DEBUG: I1126 10:07:32.888587 33850 hierarchical.cpp:174] Allocator recovery
> is not supported yet
> DEBUG: I1126 10:07:34.422766 33845 leveldb.cpp:341] Persisting action (16
> bytes) to leveldb took 1.534182617secs
> DEBUG: I1126 10:07:34.422826 33845 replica.cpp:713] Persisted action at 2
> DEBUG: I1126 10:07:34.423280 33848 replica.cpp:692] Replica received learned
> notice for position 2 from @0.0.0.0:0
> DEBUG: 2015-11-26
> 10:07:35,260:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> DEBUG: I1126 10:07:36.023536 33848 leveldb.cpp:341] Persisting action (18
> bytes) to leveldb took 1.600200148secs
> DEBUG: I1126 10:07:36.023653 33848 leveldb.cpp:399] Deleting ~1 keys from
> leveldb took 49874ns
> DEBUG: I1126 10:07:36.023669 33848 replica.cpp:713] Persisted action at 2
> DEBUG: I1126 10:07:36.023680 33848 replica.cpp:698] Replica learned TRUNCATE
> action at position 2
> DEBUG: I1126 10:07:36.028640 33862 slave.cpp:189] Slave started on
> 313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.028666 33862 slave.cpp:190] Flags at startup:
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
> --cgroups_root="mesos" --container_disk_watch_interval="15secs"
> --containerizers="mesos"
> --credential="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/credential"
> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
> --docker_auth_server="auth.docker.io" --docker_auth_server_port="443"
> --docker_kill_orphans="true"
> --docker_local_archives_dir="/tmp/mesos/images/docker"
> --docker_puller="local" --docker_puller_timeout="60"
> --docker_registry="registry-1.docker.io" --docker_registry_port="443"
> --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
> --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker"
> --egress_unique_flow_per_container="false"
> --enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024"
> --executor_registration_timeout="1mins"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/fetch"
> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
> --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
> --hostname_lookup="true" --image_provisioner_backend="copy"
> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
> --launcher_dir="/builddir/build/BUILD/mesos-0.27.0/src" --logbufsecs="0"
> --logging_level="INFO" --network_enable_socket_statistics_details="false"
> --network_enable_socket_statistics_summary="false"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
> --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="10ms"
> --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true"
> --systemd_runtime_directory="/run/systemd/system" --version="false"
> --work_dir="/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9"
> DEBUG: I1126 10:07:36.029013 33862 credentials.hpp:83] Loading credential for
> authentication from
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/credential'
> DEBUG: I1126 10:07:36.029131 33862 slave.cpp:320] Slave using credential for:
> test-principal
> DEBUG: I1126 10:07:36.029939 33862 slave.cpp:390] Slave resources: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1126 10:07:36.029979 33862 slave.cpp:398] Slave attributes: [ ]
> DEBUG: I1126 10:07:36.029989 33862 slave.cpp:403] Slave hostname: x.com
> DEBUG: I1126 10:07:36.029999 33862 slave.cpp:408] Slave checkpoint: true
> DEBUG: I1126 10:07:36.030452 33852 state.cpp:52] Recovering state from
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/meta'
> DEBUG: I1126 10:07:36.030635 33849 status_update_manager.cpp:200] Recovering
> status update manager
> DEBUG: I1126 10:07:36.030804 33861 slave.cpp:4258] Finished recovery
> DEBUG: I1126 10:07:36.030957 33823 sched.cpp:164] Version: 0.27.0-rc0
> DEBUG: I1126 10:07:36.031323 33860 sched.cpp:262] New master detected at
> [email protected]:33492
> DEBUG: I1126 10:07:36.031355 33860 sched.cpp:318] Authenticating with master
> [email protected]:33492
> DEBUG: I1126 10:07:36.031363 33860 sched.cpp:325] Using default CRAM-MD5
> authenticatee
> DEBUG: I1126 10:07:36.031898 33846 authenticatee.cpp:121] Creating new client
> SASL connection
> DEBUG: I1126 10:07:36.032002 33846 master.cpp:5169] Authenticating
> [email protected]:33492
> DEBUG: I1126 10:07:36.032591 33859 authenticator.cpp:98] Creating new server
> SASL connection
> DEBUG: I1126 10:07:36.032668 33859 authenticatee.cpp:212] Received SASL
> authentication mechanisms: CRAM-MD5
> DEBUG: I1126 10:07:36.032685 33859 authenticatee.cpp:238] Attempting to
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1126 10:07:36.032744 33859 authenticator.cpp:203] Received SASL
> authentication start
> DEBUG: I1126 10:07:36.032814 33859 authenticator.cpp:325] Authentication
> requires more steps
> DEBUG: I1126 10:07:36.032850 33859 authenticatee.cpp:258] Received SASL
> authentication step
> DEBUG: I1126 10:07:36.033195 33848 authenticator.cpp:231] Received SASL
> authentication step
> DEBUG: I1126 10:07:36.033231 33848 authenticator.cpp:317] Authentication
> success
> DEBUG: I1126 10:07:36.033293 33849 authenticatee.cpp:298] Authentication
> success
> DEBUG: I1126 10:07:36.033668 33851 master.cpp:5199] Successfully
> authenticated principal 'test-principal' at
> [email protected]:33492
> DEBUG: I1126 10:07:36.033880 33856 sched.cpp:407] Successfully authenticated
> with master [email protected]:33492
> DEBUG: I1126 10:07:36.034004 33842 master.cpp:2195] Received SUBSCRIBE call
> for framework 'default' at
> [email protected]:33492
> DEBUG: I1126 10:07:36.034039 33842 master.cpp:1664] Authorizing framework
> principal 'test-principal' to receive offers for role '*'
> DEBUG: I1126 10:07:36.034314 33845 master.cpp:2266] Subscribing framework
> default with checkpointing disabled and capabilities [ ]
> DEBUG: I1126 10:07:36.034454 33845 sched.cpp:641] Framework registered with
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:36.034539 33850 hierarchical.cpp:220] Added framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:36.264459 33844 slave.cpp:727] New master detected at
> [email protected]:33492
> DEBUG: I1126 10:07:36.264487 33844 slave.cpp:790] Authenticating with master
> [email protected]:33492
> DEBUG: I1126 10:07:36.264494 33844 slave.cpp:795] Using default CRAM-MD5
> authenticatee
> DEBUG: I1126 10:07:36.264529 33844 slave.cpp:763] Detecting new master
> DEBUG: I1126 10:07:36.264536 33856 status_update_manager.cpp:174] Pausing
> sending status updates
> DEBUG: I1126 10:07:36.264717 33842 authenticatee.cpp:121] Creating new client
> SASL connection
> DEBUG: I1126 10:07:36.265007 33844 master.cpp:5169] Authenticating
> slave(313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.265116 33855 authenticator.cpp:98] Creating new server
> SASL connection
> DEBUG: I1126 10:07:36.265190 33855 authenticatee.cpp:212] Received SASL
> authentication mechanisms: CRAM-MD5
> DEBUG: I1126 10:07:36.265207 33855 authenticatee.cpp:238] Attempting to
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1126 10:07:36.265398 33850 authenticator.cpp:203] Received SASL
> authentication start
> DEBUG: I1126 10:07:36.265527 33850 authenticator.cpp:325] Authentication
> requires more steps
> DEBUG: I1126 10:07:36.265702 33860 authenticatee.cpp:258] Received SASL
> authentication step
> DEBUG: I1126 10:07:36.265918 33860 authenticator.cpp:231] Received SASL
> authentication step
> DEBUG: I1126 10:07:36.266005 33860 authenticator.cpp:317] Authentication
> success
> DEBUG: I1126 10:07:36.266124 33847 authenticatee.cpp:298] Authentication
> success
> DEBUG: I1126 10:07:36.266181 33853 master.cpp:5199] Successfully
> authenticated principal 'test-principal' at slave(313)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.266787 33853 slave.cpp:858] Successfully authenticated
> with master [email protected]:33492
> DEBUG: I1126 10:07:36.266958 33863 master.cpp:3878] Registering slave at
> slave(313)@a.b.c.d:33492 (x.com) with id
> dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:36.267105 33858 registrar.cpp:439] Applied 1 operations in
> 22us; attempting to update the 'registry'
> DEBUG: I1126 10:07:36.267534 33848 log.cpp:683] Attempting to append 419
> bytes to the log
> DEBUG: I1126 10:07:36.267640 33860 coordinator.cpp:348] Coordinator
> attempting to write APPEND action at position 3
> DEBUG: I1126 10:07:36.268745 33857 replica.cpp:538] Replica received write
> request for position 3 from (8566)@a.b.c.d:33492
> DEBUG: I1126 10:07:36.282333 33852 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:36.290721 33851 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:36.315064 33851 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:36.361470 33852 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:36.678863 33856 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:37.008486 33851 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:37.397876 33852 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: 2015-11-26
> 10:07:38,597:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> DEBUG: I1126 10:07:38.727941 33857 leveldb.cpp:341] Persisting action (438
> bytes) to leveldb took 2.459156189secs
> DEBUG: I1126 10:07:38.728011 33857 replica.cpp:713] Persisted action at 3
> DEBUG: I1126 10:07:38.728584 33860 replica.cpp:692] Replica received learned
> notice for position 3 from @0.0.0.0:0
> DEBUG: I1126 10:07:39.243325 33847 master.cpp:3866] Ignoring register slave
> message from slave(313)@a.b.c.d:33492 (x.com) as admission is already in
> progress
> DEBUG: I1126 10:07:40.169939 33860 leveldb.cpp:341] Persisting action (440
> bytes) to leveldb took 1.441303994secs
> DEBUG: I1126 10:07:40.169996 33860 replica.cpp:713] Persisted action at 3
> DEBUG: I1126 10:07:40.170007 33860 replica.cpp:698] Replica learned APPEND
> action at position 3
> DEBUG: I1126 10:07:40.170461 33860 registrar.cpp:484] Successfully updated
> the 'registry' in 3.903270144secs
> DEBUG: I1126 10:07:40.170742 33855 log.cpp:702] Attempting to truncate the
> log to 3
> DEBUG: I1126 10:07:40.170810 33841 coordinator.cpp:348] Coordinator
> attempting to write TRUNCATE action at position 4
> DEBUG: I1126 10:07:40.170925 33844 master.cpp:3946] Registered slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1126 10:07:40.170999 33863 slave.cpp:902] Registered with master
> [email protected]:33492; given slave ID dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:40.171005 33842 hierarchical.cpp:380] Added slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 (x.com) with cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] (allocated: )
> DEBUG: I1126 10:07:40.171119 33845 status_update_manager.cpp:181] Resuming
> sending status updates
> DEBUG: I1126 10:07:40.171352 33844 replica.cpp:538] Replica received write
> request for position 4 from (8567)@a.b.c.d:33492
> DEBUG: I1126 10:07:40.171455 33851 master.cpp:4998] Sending 1 offers to
> framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 (default) at
> [email protected]:33492
> DEBUG: I1126 10:07:40.310534 33863 slave.cpp:961] Forwarding total
> oversubscribed resources
> DEBUG: I1126 10:07:40.310897 33852 master.cpp:4288] Received update of slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> with total oversubscribed resources
> DEBUG: I1126 10:07:40.311044 33852 hierarchical.cpp:436] Slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 (x.com) updated with oversubscribed
> resources (total: cpus(*):2; mem(*):1024; disk(*):1024;
> ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
> ports(*):[31000-32000])
> DEBUG: I1126 10:07:40.314579 33862 master.cpp:2934] Processing ACCEPT call
> for offers: [ dab6e888-2e40-45e4-9595-051febe7dc16-O0 ] on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 (default) at
> [email protected]:33492
> DEBUG: I1126 10:07:40.314610 33862 master.cpp:2730] Authorizing framework
> principal 'test-principal' to launch task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 as user 'mockbuild'
> DEBUG: W1126 10:07:40.315390 33859 validation.cpp:420] Executor default for
> task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 uses less CPUs (None) than the
> minimum required (0.01). Please update your executor, as this will be
> mandatory in future releases.
> DEBUG: W1126 10:07:40.315417 33859 validation.cpp:432] Executor default for
> task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 uses less memory (None) than the
> minimum required (32MB). Please update your executor, as this will be
> mandatory in future releases.
> DEBUG: I1126 10:07:40.315511 33859 master.hpp:176] Adding task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 with resources cpus(*):1; mem(*):32 on
> slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 (x.com)
> DEBUG: I1126 10:07:40.315538 33859 master.cpp:3264] Launching task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 (default) at
> [email protected]:33492 with resources
> cpus(*):1; mem(*):32 on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 at
> slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:40.315716 33842 slave.cpp:1292] Got assigned task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 for framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.315937 33842 slave.cpp:1411] Launching task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 for framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.315933 33850 hierarchical.cpp:793] Recovered cpus(*):1;
> mem(*):992; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):1;
> mem(*):32) on slave dab6e888-2e40-45e4-9595-051febe7dc16-S0 from framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:40.994328 33842 paths.cpp:434] Trying to chown
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
> to user 'mockbuild'
> DEBUG: I1126 10:07:41.002754 33842 slave.cpp:5043] Launching executor default
> of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 with resources in
> work directory
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
> DEBUG: I1126 10:07:41.002931 33842 slave.cpp:1629] Queuing task
> '6669df6d-7a2c-415f-a2c1-f2615ec6d7f6' for executor 'default' of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: E1126 10:07:41.004542 33842 slave.cpp:3468] Container
> '27d66117-c4df-4da2-b1fc-e1f473c96b67' for executor 'default' of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 failed to start: Injected failure
> DEBUG: E1126 10:07:41.004598 33842 slave.cpp:3565] Termination of executor
> 'default' of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 failed:
> Unknown container: 27d66117-c4df-4da2-b1fc-e1f473c96b67
> DEBUG: I1126 10:07:41.004650 33842 slave.cpp:2763] Handling status update
> TASK_FAILED (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 from @0.0.0.0:0
> DEBUG: I1126 10:07:41.004907 33842 master.cpp:4535] Executor 'default' of
> framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com):
> terminated with signal Unknown signal 127
> DEBUG: I1126 10:07:41.004919 33850 status_update_manager.cpp:320] Received
> status update TASK_FAILED (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for
> task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.004937 33842 master.cpp:6180] Removing executor
> 'default' with resources of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.005262 33861 slave.cpp:3115] Forwarding the update
> TASK_FAILED (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 to [email protected]:33492
> DEBUG: I1126 10:07:41.005549 33860 master.cpp:4433] Status update TASK_FAILED
> (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 from slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.005584 33860 master.cpp:4481] Forwarding status update
> TASK_FAILED (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847) for task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.005736 33860 master.cpp:6085] Updating the state of
> task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 (latest state: TASK_FAILED, status
> update state: TASK_FAILED)
> DEBUG: I1126 10:07:41.005898 33846 hierarchical.cpp:793] Recovered cpus(*):1;
> mem(*):32 (total: cpus(*):2; mem(*):1024; disk(*):1024;
> ports(*):[31000-32000], allocated: ) on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 from framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.005977 33860 master.cpp:3590] Processing ACKNOWLEDGE
> call 3873b597-6af7-4b97-9ad2-4e0f285ae847 for task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000 (default) at
> [email protected]:33492 on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:41.006007 33860 master.cpp:6151] Removing task
> 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 with resources cpus(*):1; mem(*):32 of
> framework dab6e888-2e40-45e4-9595-051febe7dc16-0000 on slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0 at slave(313)@a.b.c.d:33492 (x.com)
> DEBUG: I1126 10:07:41.006250 33855 status_update_manager.cpp:392] Received
> status update acknowledgement (UUID: 3873b597-6af7-4b97-9ad2-4e0f285ae847)
> for task 6669df6d-7a2c-415f-a2c1-f2615ec6d7f6 of framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006495 33846 slave.cpp:3685] Cleaning up executor
> 'default' of framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: tests/slave_tests.cpp:1150: Failure
> DEBUG: Value of: snapshot.values["slave/container_launch_errors"]
> DEBUG: Actual: 16-byte object <02-00 00-00 20-74 6F-20 00-49 27-03 00-00
> 00-00>
> DEBUG: Expected: 1
> DEBUG: I1126 10:07:41.006705 33846 slave.cpp:3773] Cleaning up framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006731 33863 gc.cpp:54] Scheduling
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default/runs/27d66117-c4df-4da2-b1fc-e1f473c96b67'
> for gc 6.99999992410963days in the future
> DEBUG: I1126 10:07:41.006810 33863 gc.cpp:54] Scheduling
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000/executors/default'
> for gc 6.99999992252148days in the future
> DEBUG: I1126 10:07:41.006824 33843 status_update_manager.cpp:282] Closing
> status update streams for framework dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.006850 33863 gc.cpp:54] Scheduling
> '/tmp/SlaveTest_MetricsSlaveLaunchErrors_N8N8x9/slaves/dab6e888-2e40-45e4-9595-051febe7dc16-S0/frameworks/dab6e888-2e40-45e4-9595-051febe7dc16-0000'
> for gc 6.99999992122667days in the future
> DEBUG: I1126 10:07:41.007093 33823 sched.cpp:1803] Asked to stop the driver
> DEBUG: I1126 10:07:41.007164 33847 sched.cpp:1041] Stopping framework
> 'dab6e888-2e40-45e4-9595-051febe7dc16-0000'
> DEBUG: I1126 10:07:41.007185 33861 master.cpp:926] Master terminating
> DEBUG: I1126 10:07:41.007392 33845 hierarchical.cpp:410] Removed slave
> dab6e888-2e40-45e4-9595-051febe7dc16-S0
> DEBUG: I1126 10:07:41.007519 33854 hierarchical.cpp:260] Removed framework
> dab6e888-2e40-45e4-9595-051febe7dc16-0000
> DEBUG: I1126 10:07:41.007750 33863 slave.cpp:3243] [email protected]:33492 exited
> DEBUG: W1126 10:07:41.007767 33863 slave.cpp:3246] Master disconnected!
> Waiting for a new master to be elected
> DEBUG: 2015-11-26
> 10:07:41,933:33823(0x7f5e2a7fc700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:57854] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> DEBUG: I1126 10:07:41.959815 33844 leveldb.cpp:341] Persisting action (16
> bytes) to leveldb took 1.788383711secs
> DEBUG: I1126 10:07:41.959870 33844 replica.cpp:713] Persisted action at 4
> DEBUG: I1126 10:07:41.960736 33823 slave.cpp:599] Slave terminating
> DEBUG: [ FAILED ] SlaveTest.MetricsSlaveLaunchErrors (11967 ms)
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)