[
https://issues.apache.org/jira/browse/MESOS-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394409#comment-15394409
]
Joseph Wu commented on MESOS-5907:
----------------------------------
Also observed this infinite loop on the ASF CI because the agent took more than
2 seconds to fail:
{code}
[ RUN ] ExamplesTest.DiskFullFramework
Using temporary directory '/tmp/ExamplesTest_DiskFullFramework_lBhMQ9'
/mesos/mesos-1.1.0/src/tests/disk_full_framework_test.sh: line 8:
/mesos/mesos-1.1.0/_build/src/colors.sh: No such file or directory
/mesos/mesos-1.1.0/src/tests/disk_full_framework_test.sh: line 9:
/mesos/mesos-1.1.0/_build/src/atexit.sh: No such file or directory
Launched master at 451
Launched agent at 469
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0726 14:53:44.130648 451 main.cpp:263] Build: 2016-07-26 14:31:53 by
I0726 14:53:44.130833 451 main.cpp:264] Version: 1.1.0
I0726 14:53:44.134968 451 process.cpp:1071] libprocess is initialized on
127.0.0.1:5432 with 16 worker threads
I0726 14:53:44.135056 451 logging.cpp:199] Logging to STDERR
I0726 14:53:44.136104 451 main.cpp:370] Using 'HierarchicalDRF' allocator
I0726 14:53:44.139860 451 leveldb.cpp:174] Opened db in 3.492501ms
I0726 14:53:44.141086 451 leveldb.cpp:181] Compacted db in 1.171278ms
I0726 14:53:44.141196 451 leveldb.cpp:196] Created db iterator in 74126ns
I0726 14:53:44.141222 451 leveldb.cpp:202] Seeked to beginning of db in 4323ns
I0726 14:53:44.141232 451 leveldb.cpp:271] Iterated through 0 keys in the db
in 356ns
I0726 14:53:44.141439 451 replica.cpp:779] Replica recovered with log
positions 0 -> 0 with 1 holes and 0 unlearned
I0726 14:53:44.144086 496 recover.cpp:451] Starting replica recovery
I0726 14:53:44.144966 496 recover.cpp:477] Replica is in EMPTY status
I0726 14:53:44.145054 451 main.cpp:484] Creating default 'local' authorizer
I0726 14:53:44.145643 451 main.cpp:543] Starting Mesos master
I0726 14:53:44.147657 493 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (3)@127.0.0.1:5432
I0726 14:53:44.148839 493 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0726 14:53:44.149585 494 recover.cpp:568] Updating replica status to STARTING
I0726 14:53:44.150604 491 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 825875ns
I0726 14:53:44.150635 491 replica.cpp:320] Persisted replica status to
STARTING
I0726 14:53:44.150743 451 master.cpp:375] Master
c2149f99-5702-44d7-8e17-040e3e228403 (localhost) started on 127.0.0.1:5432
I0726 14:53:44.151023 490 recover.cpp:477] Replica is in STARTING status
I0726 14:53:44.150774 451 master.cpp:377] Flags at startup:
--acls="permissive: false
register_frameworks {
principals {
type: SOME
values: "test-principal"
}
roles {
type: SOME
values: "*"
}
}
run_tasks {
principals {
type: SOME
values: "test-principal"
}
users {
type: SOME
values: "mesos"
}
}
" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins"
--allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="false" --authenticate_frameworks="true"
--authenticate_http_frameworks="false" --authenticate_http_readonly="false"
--authenticate_http_readwrite="false" --authenticators="crammd5"
--authorizers="local"
--credentials="/tmp/ExamplesTest_DiskFullFramework_lBhMQ9/credentials"
--framework_sorter="drf" --help="false" --hostname_lookup="true"
--http_authenticators="basic" --initialize_driver_logging="true"
--ip="127.0.0.1" --log_auto_initialize="true" --logbufsecs="0"
--logging_level="INFO" --max_agent_ping_timeouts="5"
--max_completed_frameworks="50" --max_completed_tasks_per_framework="1000"
--port="5432" --quiet="false" --recovery_agent_removal_limit="100%"
--registry="replicated_log" --registry_fetch_timeout="1mins"
--registry_store_timeout="20secs" --registry_strict="false"
--root_submissions="true" --user_sorter="drf" --version="false"
--webui_dir="/mesos/mesos-1.1.0/src/webui" --work_dir="/tmp/mesos-bXDn8B"
--zk_session_timeout="10secs"
W0726 14:53:44.151664 451 master.cpp:380]
**************************************************
Master bound to loopback interface! Cannot communicate with remote schedulers
or agents. You might want to set '--ip' flag to a routable IP address.
**************************************************
I0726 14:53:44.151708 451 master.cpp:427] Master only allowing authenticated
frameworks to register
I0726 14:53:44.151717 451 master.cpp:443] Master allowing unauthenticated
agents to register
I0726 14:53:44.151722 451 master.cpp:457] Master allowing HTTP frameworks to
register without authentication
I0726 14:53:44.151731 451 credentials.hpp:37] Loading credentials for
authentication from '/tmp/ExamplesTest_DiskFullFramework_lBhMQ9/credentials'
W0726 14:53:44.151836 451 credentials.hpp:52] Permissions on credentials file
'/tmp/ExamplesTest_DiskFullFramework_lBhMQ9/credentials' are too open; it is
recommended that your credentials file is NOT accessible by others
I0726 14:53:44.152092 451 master.cpp:499] Using default 'crammd5'
authenticator
I0726 14:53:44.152335 451 authenticator.cpp:519] Initializing server SASL
I0726 14:53:44.152325 495 replica.cpp:673] Replica in STARTING status
received a broadcasted recover request from (4)@127.0.0.1:5432
I0726 14:53:44.152746 489 recover.cpp:197] Received a recover response from a
replica in STARTING status
I0726 14:53:44.153059 451 auxprop.cpp:73] Initialized in-memory auxiliary
property plugin
I0726 14:53:44.153189 451 master.cpp:579] Authorization enabled
I0726 14:53:44.153281 499 recover.cpp:568] Updating replica status to VOTING
I0726 14:53:44.153734 497 whitelist_watcher.cpp:77] No whitelist given
I0726 14:53:44.153813 501 hierarchical.cpp:151] Initialized hierarchical
allocator process
I0726 14:53:44.153937 493 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 430286ns
I0726 14:53:44.153977 493 replica.cpp:320] Persisted replica status to VOTING
I0726 14:53:44.154110 503 recover.cpp:582] Successfully joined the Paxos group
I0726 14:53:44.154350 503 recover.cpp:466] Recover process terminated
I0726 14:53:44.171664 492 master.cpp:1847] The newly elected leader is
[email protected]:5432 with id c2149f99-5702-44d7-8e17-040e3e228403
I0726 14:53:44.171751 492 master.cpp:1860] Elected as the leading master!
I0726 14:53:44.171789 492 master.cpp:1547] Recovering from registrar
I0726 14:53:44.172025 498 registrar.cpp:332] Recovering registrar
I0726 14:53:44.173367 499 log.cpp:553] Attempting to start the writer
I0726 14:53:44.176110 489 replica.cpp:493] Replica received implicit promise
request from (5)@127.0.0.1:5432 with proposal 1
I0726 14:53:44.176636 489 leveldb.cpp:304] Persisting metadata (8 bytes) to
leveldb took 472618ns
I0726 14:53:44.176662 489 replica.cpp:342] Persisted promised to 1
I0726 14:53:44.177786 489 coordinator.cpp:238] Coordinator attempting to fill
missing positions
I0726 14:53:44.179344 503 replica.cpp:388] Replica received explicit promise
request from (6)@127.0.0.1:5432 for position 0 with proposal 2
I0726 14:53:44.179770 503 leveldb.cpp:341] Persisting action (8 bytes) to
leveldb took 326955ns
I0726 14:53:44.179795 503 replica.cpp:712] Persisted action at 0
I0726 14:53:44.181021 496 replica.cpp:537] Replica received write request for
position 0 from (7)@127.0.0.1:5432
I0726 14:53:44.181149 496 leveldb.cpp:436] Reading position from leveldb took
89421ns
I0726 14:53:44.181532 496 leveldb.cpp:341] Persisting action (14 bytes) to
leveldb took 308801ns
I0726 14:53:44.181555 496 replica.cpp:712] Persisted action at 0
I0726 14:53:44.182224 490 replica.cpp:691] Replica received learned notice
for position 0 from @0.0.0.0:0
I0726 14:53:44.182595 490 leveldb.cpp:341] Persisting action (16 bytes) to
leveldb took 301273ns
I0726 14:53:44.182616 490 replica.cpp:712] Persisted action at 0
I0726 14:53:44.182636 490 replica.cpp:697] Replica learned NOP action at
position 0
I0726 14:53:44.183385 504 log.cpp:569] Writer started with ending position 0
I0726 14:53:44.185390 503 leveldb.cpp:436] Reading position from leveldb took
30503ns
I0726 14:53:44.188508 503 registrar.cpp:365] Successfully fetched the
registry (0B) in 16.417024ms
I0726 14:53:44.188736 503 registrar.cpp:464] Applied 1 operations in 51155ns;
attempting to update the 'registry'
I0726 14:53:44.191341 503 log.cpp:577] Attempting to append 156 bytes to the
log
I0726 14:53:44.191520 489 coordinator.cpp:348] Coordinator attempting to
write APPEND action at position 1
I0726 14:53:44.192415 491 replica.cpp:537] Replica received write request for
position 1 from (8)@127.0.0.1:5432
I0726 14:53:44.192991 491 leveldb.cpp:341] Persisting action (175 bytes) to
leveldb took 498290ns
I0726 14:53:44.193022 491 replica.cpp:712] Persisted action at 1
I0726 14:53:44.194257 498 replica.cpp:691] Replica received learned notice
for position 1 from @0.0.0.0:0
I0726 14:53:44.194766 498 leveldb.cpp:341] Persisting action (177 bytes) to
leveldb took 437138ns
I0726 14:53:44.194793 498 replica.cpp:712] Persisted action at 1
I0726 14:53:44.194823 498 replica.cpp:697] Replica learned APPEND action at
position 1
I0726 14:53:44.196669 495 registrar.cpp:509] Successfully updated the
'registry' in 7.785984ms
I0726 14:53:44.196846 495 registrar.cpp:395] Successfully recovered registrar
I0726 14:53:44.197120 492 log.cpp:596] Attempting to truncate the log to 1
I0726 14:53:44.197366 500 coordinator.cpp:348] Coordinator attempting to
write TRUNCATE action at position 2
I0726 14:53:44.197446 504 master.cpp:1655] Recovered 0 agents from the
Registry (118B) ; allowing 10mins for agents to re-register
I0726 14:53:44.197480 496 hierarchical.cpp:178] Skipping recovery of
hierarchical allocator: nothing to recover
I0726 14:53:44.198210 504 replica.cpp:537] Replica received write request for
position 2 from (9)@127.0.0.1:5432
I0726 14:53:44.198786 504 leveldb.cpp:341] Persisting action (16 bytes) to
leveldb took 529434ns
I0726 14:53:44.198809 504 replica.cpp:712] Persisted action at 2
I0726 14:53:44.199458 498 replica.cpp:691] Replica received learned notice
for position 2 from @0.0.0.0:0
I0726 14:53:44.199968 498 leveldb.cpp:341] Persisting action (18 bytes) to
leveldb took 467596ns
I0726 14:53:44.200031 498 leveldb.cpp:399] Deleting ~1 keys from leveldb took
39189ns
I0726 14:53:44.200052 498 replica.cpp:712] Persisted action at 2
I0726 14:53:44.200084 498 replica.cpp:697] Replica learned TRUNCATE action at
position 2
I0726 14:53:45.155596 499 hierarchical.cpp:1537] No allocations performed
I0726 14:53:45.155730 499 hierarchical.cpp:1172] Performed allocation for 0
agents in 353342ns
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0726 14:53:46.147034 469 main.cpp:243] Build: 2016-07-26 14:31:53 by
I0726 14:53:46.147214 469 main.cpp:244] Version: 1.1.0
I0726 14:53:46.151357 469 process.cpp:1071] libprocess is initialized on
172.17.0.2:5051 with 16 worker threads
I0726 14:53:46.151448 469 logging.cpp:199] Logging to STDERR
I0726 14:53:46.157204 493 hierarchical.cpp:1537] No allocations performed
I0726 14:53:46.157335 493 hierarchical.cpp:1172] Performed allocation for 0
agents in 215495ns
I0726 14:53:46.418125 469 systemd.cpp:237] systemd version `219` detected
I0726 14:53:46.418192 469 main.cpp:342] Inializing systemd state
Failed to initialize systemd: Failed to locate systemd runtime directory:
/run/systemd/system
I0726 14:53:47.158877 502 hierarchical.cpp:1537] No allocations performed
I0726 14:53:47.159018 502 hierarchical.cpp:1172] Performed allocation for 0
agents in 243814ns
I0726 14:53:48.160492 490 hierarchical.cpp:1537] No allocations performed
I0726 14:53:48.160564 490 hierarchical.cpp:1172] Performed allocation for 0
agents in 123129ns
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0726 14:53:48.266243 507 process.cpp:1071] libprocess is initialized on
172.17.0.2:56935 with 16 worker threads
I0726 14:53:48.266592 507 disk_full_framework.cpp:436] Enabling
authentication for the framework
I0726 14:53:48.268671 507 logging.cpp:199] Logging to STDERR
I0726 14:53:48.270119 507 sched.cpp:226] Version: 1.1.0
I0726 14:53:48.283542 559 sched.cpp:330] New master detected at
[email protected]:5432
I0726 14:53:48.285239 559 sched.cpp:396] Authenticating with master
[email protected]:5432
I0726 14:53:48.285272 559 sched.cpp:403] Using default CRAM-MD5 authenticatee
I0726 14:53:48.286123 555 authenticatee.cpp:97] Initializing client SASL
I0726 14:53:48.286865 555 authenticatee.cpp:121] Creating new client SASL
connection
I0726 14:53:48.288255 494 master.cpp:5883] Authenticating
[email protected]:56935
I0726 14:53:48.288504 492 authenticator.cpp:414] Starting authentication
session for crammd5_authenticatee(1)@172.17.0.2:56935
I0726 14:53:48.289562 496 authenticator.cpp:98] Creating new server SASL
connection
I0726 14:53:48.290738 551 authenticatee.cpp:213] Received SASL authentication
mechanisms: CRAM-MD5
I0726 14:53:48.290882 551 authenticatee.cpp:239] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0726 14:53:48.291503 493 authenticator.cpp:204] Received SASL authentication
start
I0726 14:53:48.291597 493 authenticator.cpp:326] Authentication requires more
steps
I0726 14:53:48.292167 546 authenticatee.cpp:259] Received SASL authentication
step
I0726 14:53:48.292670 491 authenticator.cpp:232] Received SASL authentication
step
I0726 14:53:48.292727 491 auxprop.cpp:109] Request to lookup properties for
user: 'test-principal' realm: '20ee489ad4ce' server FQDN: '20ee489ad4ce'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: false
I0726 14:53:48.292760 491 auxprop.cpp:181] Looking up auxiliary property
'*userPassword'
I0726 14:53:48.292896 491 auxprop.cpp:181] Looking up auxiliary property
'*cmusaslsecretCRAM-MD5'
I0726 14:53:48.292948 491 auxprop.cpp:109] Request to lookup properties for
user: 'test-principal' realm: '20ee489ad4ce' server FQDN: '20ee489ad4ce'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: true
I0726 14:53:48.292985 491 auxprop.cpp:131] Skipping auxiliary property
'*userPassword' since SASL_AUXPROP_AUTHZID == true
I0726 14:53:48.292995 491 auxprop.cpp:131] Skipping auxiliary property
'*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0726 14:53:48.293032 491 authenticator.cpp:318] Authentication success
I0726 14:53:48.293509 549 authenticatee.cpp:299] Authentication success
I0726 14:53:48.293598 500 master.cpp:5913] Successfully authenticated
principal 'test-principal' at
[email protected]:56935
I0726 14:53:48.293686 493 authenticator.cpp:432] Authentication session
cleanup for crammd5_authenticatee(1)@172.17.0.2:56935
I0726 14:53:48.294248 554 sched.cpp:502] Successfully authenticated with
master [email protected]:5432
I0726 14:53:48.294280 554 sched.cpp:820] Sending SUBSCRIBE call to
[email protected]:5432
I0726 14:53:48.294695 554 sched.cpp:853] Will retry registration in
1.314239818secs if necessary
I0726 14:53:48.295074 490 master.cpp:2424] Received SUBSCRIBE call for
framework 'Disk Full Framework (C++)' at
[email protected]:56935
I0726 14:53:48.295230 490 master.cpp:1886] Authorizing framework principal
'test-principal' to receive offers for role '*'
I0726 14:53:48.296437 502 master.cpp:2500] Subscribing framework Disk Full
Framework (C++) with checkpointing enabled and capabilities [ ]
I0726 14:53:48.298159 496 hierarchical.cpp:271] Added framework
c2149f99-5702-44d7-8e17-040e3e228403-0000
I0726 14:53:48.298214 561 sched.cpp:743] Framework registered with
c2149f99-5702-44d7-8e17-040e3e228403-0000
I0726 14:53:48.298264 561 disk_full_framework.cpp:328] Registered with
framework ID: c2149f99-5702-44d7-8e17-040e3e228403-0000
I0726 14:53:48.298279 496 hierarchical.cpp:1537] No allocations performed
I0726 14:53:48.298318 561 sched.cpp:757] Scheduler::registered took 50836ns
I0726 14:53:48.298378 496 hierarchical.cpp:1632] No inverse offers to send
out!
I0726 14:53:48.298486 496 hierarchical.cpp:1172] Performed allocation for 0
agents in 290704ns
I0726 14:53:49.161883 499 hierarchical.cpp:1537] No allocations performed
I0726 14:53:49.161928 499 hierarchical.cpp:1632] No inverse offers to send
out!
I0726 14:53:49.162045 499 hierarchical.cpp:1172] Performed allocation for 0
agents in 203080ns
I0726 14:53:50.163571 495 hierarchical.cpp:1537] No allocations performed
I0726 14:53:50.163611 495 hierarchical.cpp:1632] No inverse offers to send
out!
...
{code}
> ExamplesTest.DiskFullFramework fails on Arch
> --------------------------------------------
>
> Key: MESOS-5907
> URL: https://issues.apache.org/jira/browse/MESOS-5907
> Project: Mesos
> Issue Type: Bug
> Components: tests
> Environment: Linux archlinux.vagrant.vm 4.6.4-1-ARCH #1 SMP PREEMPT
> Mon Jul 11 19:12:32 CEST 2016 x86_64 GNU/Linux
> Reporter: Neil Conway
> Assignee: Joseph Wu
> Labels: mesosphere
> Attachments: disk_full_framework.log
>
>
> This test fails consistently on recent Arch linux, running in a VM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)