[
https://issues.apache.org/jira/browse/MESOS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984970#comment-14984970
]
Matthias Veit commented on MESOS-3793:
--------------------------------------
Starting mesos local with --launcher=posix has no effect.
With the environment variable set (export MESOS_LAUNCHER=posix), I can start mesos local.
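For reference, a minimal sketch of that environment-variable workaround. The export line and the mesos local command are the ones described in this comment. The command -v guard is an added assumption, included only so the snippet does nothing on a machine without the mesos binary.

```shell
# Workaround described in this comment: select the POSIX launcher through
# the environment, because passing --launcher=posix on the command line
# was observed to have no effect with mesos local.
export MESOS_LAUNCHER=posix

# Assumption (not part of the original report): only start the local
# cluster when the mesos binary is actually on the PATH.
if command -v mesos >/dev/null 2>&1; then
  mesos local
fi
```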
Mounting /sys/fs/cgroup and starting mesos local fails with this error:
{noformat}
➔ docker run -v /sys/fs/cgroup:/sys/fs/cgroup:rw -it marathon-buildbase:test sh
# mesos local
I1102 09:35:15.839287 5 leveldb.cpp:176] Opened db in 4.975612ms
I1102 09:35:15.840312 5 leveldb.cpp:183] Compacted db in 981189ns
I1102 09:35:15.840348 5 leveldb.cpp:198] Created db iterator in 9033ns
I1102 09:35:15.840353 5 leveldb.cpp:204] Seeked to beginning of db in 1414ns
I1102 09:35:15.840358 5 leveldb.cpp:273] Iterated through 0 keys in the db
in 1025ns
I1102 09:35:15.840389 5 replica.cpp:744] Replica recovered with log
positions 0 -> 0 with 1 holes and 0 unlearned
I1102 09:35:15.840790 9 recover.cpp:449] Starting replica recovery
I1102 09:35:15.840991 10 recover.cpp:475] Replica is in EMPTY status
I1102 09:35:15.841492 9 replica.cpp:641] Replica in EMPTY status received a
broadcasted recover request
I1102 09:35:15.841908 6 recover.cpp:195] Received a recover response from a
replica in EMPTY status
I1102 09:35:15.842003 6 recover.cpp:566] Updating replica status to STARTING
I1102 09:35:15.843122 7 master.cpp:376] Master
af8c1547-e308-4348-99d4-93879f06d853 (833b280a4c4a) started on 172.17.0.7:5050
I1102 09:35:15.843327 7 master.cpp:378] Flags at startup:
--allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate="false" --authenticate_slaves="false" --authenticators="crammd5"
--authorizers="local" --framework_sorter="drf" --help="false"
--hostname_lookup="true" --initialize_driver_logging="true"
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
--max_slave_ping_timeouts="5" --quiet="false"
--recovery_slave_removal_limit="100%" --registry="replicated_log"
--registry_fetch_timeout="1mins" --registry_store_timeout="5secs"
--registry_strict="false" --root_submissions="true"
--slave_ping_timeout="15secs" --slave_reregister_timeout="10mins"
--user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui"
--work_dir="/tmp/mesos/local/JU6SZj" --zk_session_timeout="10secs"
I1102 09:35:15.843575 7 master.cpp:425] Master allowing unauthenticated
frameworks to register
I1102 09:35:15.843822 7 master.cpp:430] Master allowing unauthenticated
slaves to register
I1102 09:35:15.843950 7 master.cpp:467] Using default 'crammd5'
authenticator
W1102 09:35:15.844105 7 authenticator.cpp:505] No credentials provided,
authentication requests will be refused
I1102 09:35:15.844224 7 authenticator.cpp:512] Initializing server SASL
I1102 09:35:15.843875 5 containerizer.cpp:143] Using isolation:
posix/cpu,posix/mem,filesystem/posix
I1102 09:35:15.843231 11 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 1.186846ms
I1102 09:35:15.844820 11 replica.cpp:323] Persisted replica status to
STARTING
I1102 09:35:15.845212 11 recover.cpp:475] Replica is in STARTING status
I1102 09:35:15.845577 11 replica.cpp:641] Replica in STARTING status
received a broadcasted recover request
I1102 09:35:15.845881 11 recover.cpp:195] Received a recover response from a
replica in STARTING status
I1102 09:35:15.846217 11 recover.cpp:566] Updating replica status to VOTING
I1102 09:35:15.846650 11 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 265224ns
I1102 09:35:15.846683 11 replica.cpp:323] Persisted replica status to VOTING
I1102 09:35:15.846721 11 recover.cpp:580] Successfully joined the Paxos group
I1102 09:35:15.846835 11 recover.cpp:464] Recover process terminated
I1102 09:35:15.849839 7 master.cpp:1603] The newly elected leader is
master@172.17.0.7:5050 with id af8c1547-e308-4348-99d4-93879f06d853
I1102 09:35:15.853528 7 master.cpp:1616] Elected as the leading master!
I1102 09:35:15.853793 7 master.cpp:1376] Recovering from registrar
I1102 09:35:15.854033 13 registrar.cpp:309] Recovering registrar
I1102 09:35:15.854266 9 log.cpp:661] Attempting to start the writer
I1102 09:35:15.854802 9 replica.cpp:477] Replica received implicit promise
request with proposal 1
I1102 09:35:15.853359 5 linux_launcher.cpp:103] Using
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1102 09:35:15.856086 9 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 1.148617ms
I1102 09:35:15.856168 9 replica.cpp:345] Persisted promised to 1
I1102 09:35:15.857818 6 coordinator.cpp:231] Coordinator attemping to fill
missing position
I1102 09:35:15.858723 13 replica.cpp:378] Replica received explicit promise
request for position 0 with proposal 2
I1102 09:35:15.859380 13 leveldb.cpp:343] Persisting action (8 bytes) to
leveldb took 599989ns
I1102 09:35:15.859414 13 replica.cpp:679] Persisted action at 0
I1102 09:35:15.859788 9 replica.cpp:511] Replica received write request for
position 0
I1102 09:35:15.859863 9 leveldb.cpp:438] Reading position from leveldb took
16229ns
I1102 09:35:15.860203 9 leveldb.cpp:343] Persisting action (14 bytes) to
leveldb took 317011ns
I1102 09:35:15.860257 9 replica.cpp:679] Persisted action at 0
I1102 09:35:15.860366 9 replica.cpp:658] Replica received learned notice
for position 0
I1102 09:35:15.861297 9 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 789105ns
I1102 09:35:15.861330 9 replica.cpp:679] Persisted action at 0
I1102 09:35:15.861371 9 replica.cpp:664] Replica learned NOP action at
position 0
I1102 09:35:15.861457 9 log.cpp:677] Writer started with ending position 0
I1102 09:35:15.861711 9 leveldb.cpp:438] Reading position from leveldb took
7791ns
I1102 09:35:15.862535 9 registrar.cpp:342] Successfully fetched the
registry (0B) in 8.40192ms
I1102 09:35:15.862589 9 registrar.cpp:441] Applied 1 operations in 4352ns;
attempting to update the 'registry'
I1102 09:35:15.862763 9 log.cpp:685] Attempting to append 165 bytes to the
log
I1102 09:35:15.862846 9 coordinator.cpp:341] Coordinator attempting to
write APPEND action at position 1
I1102 09:35:15.863004 9 replica.cpp:511] Replica received write request for
position 1
I1102 09:35:15.863351 9 leveldb.cpp:343] Persisting action (184 bytes) to
leveldb took 282975ns
I1102 09:35:15.863426 9 replica.cpp:679] Persisted action at 1
I1102 09:35:15.863567 10 replica.cpp:658] Replica received learned notice
for position 1
I1102 09:35:15.863859 10 leveldb.cpp:343] Persisting action (186 bytes) to
leveldb took 267957ns
I1102 09:35:15.863886 10 replica.cpp:679] Persisted action at 1
I1102 09:35:15.863898 10 replica.cpp:664] Replica learned APPEND action at
position 1
I1102 09:35:15.864140 9 registrar.cpp:486] Successfully updated the
'registry' in 1.516032ms
I1102 09:35:15.864183 10 log.cpp:704] Attempting to truncate the log to 1
I1102 09:35:15.864302 10 coordinator.cpp:341] Coordinator attempting to
write TRUNCATE action at position 2
I1102 09:35:15.864197 9 registrar.cpp:372] Successfully recovered registrar
I1102 09:35:15.864425 9 replica.cpp:511] Replica received write request for
position 2
I1102 09:35:15.864423 10 master.cpp:1413] Recovered 0 slaves from the
Registry (127B) ; allowing 10mins for slaves to re-register
I1102 09:35:15.866138 9 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 1.676671ms
I1102 09:35:15.866181 9 replica.cpp:679] Persisted action at 2
I1102 09:35:15.866294 9 replica.cpp:658] Replica received learned notice
for position 2
I1102 09:35:15.866595 5 systemd.cpp:128] systemd version `215` detected
W1102 09:35:15.866622 5 systemd.cpp:136] Required functionality `Delegate`
was introduced in Version `218`. Your system may not function properly; however
since some distributions have patched systemd packages, your system may still
be functional. This is why we keep running. See MESOS-3352 for more information
I1102 09:35:15.866664 9 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 294030ns
I1102 09:35:15.866722 9 leveldb.cpp:401] Deleting ~1 keys from leveldb took
13730ns
I1102 09:35:15.866750 9 replica.cpp:679] Persisted action at 2
I1102 09:35:15.866780 9 replica.cpp:664] Replica learned TRUNCATE action at
position 2
Failed to create a containerizer: Could not create MesosContainerizer: Failed
to create launcher: Failed to initialize systemd: Failed to locate systemd
runtime directory: /run/systemd/system
{noformat}
> Cannot start mesos local on a Debian GNU/Linux 8 docker machine
> ---------------------------------------------------------------
>
> Key: MESOS-3793
> URL: https://issues.apache.org/jira/browse/MESOS-3793
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.25.0
> Environment: Debian GNU/Linux 8 docker machine
> Reporter: Matthias Veit
> Assignee: Jojy Varghese
> Labels: mesosphere
>
> We updated the mesos version to 0.25.0 in our Marathon docker image, which
> runs our integration tests.
> We use mesos local for those tests. This fails with this message:
> {noformat}
> root@a06e4b4eb776:/marathon# mesos local
> I1022 18:42:26.852485 136 leveldb.cpp:176] Opened db in 6.103258ms
> I1022 18:42:26.853302 136 leveldb.cpp:183] Compacted db in 765740ns
> I1022 18:42:26.853343 136 leveldb.cpp:198] Created db iterator in 9001ns
> I1022 18:42:26.853355 136 leveldb.cpp:204] Seeked to beginning of db in
> 1287ns
> I1022 18:42:26.853366 136 leveldb.cpp:273] Iterated through 0 keys in the
> db in 1111ns
> I1022 18:42:26.853406 136 replica.cpp:744] Replica recovered with log
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1022 18:42:26.853775 141 recover.cpp:449] Starting replica recovery
> I1022 18:42:26.853862 141 recover.cpp:475] Replica is in EMPTY status
> I1022 18:42:26.854751 138 replica.cpp:641] Replica in EMPTY status received
> a broadcasted recover request
> I1022 18:42:26.854856 140 recover.cpp:195] Received a recover response from
> a replica in EMPTY status
> I1022 18:42:26.855002 140 recover.cpp:566] Updating replica status to
> STARTING
> I1022 18:42:26.855655 138 master.cpp:376] Master
> a3f39818-1bda-4710-b96b-2a60ed4d12b8 (a06e4b4eb776) started on
> 172.17.0.14:5050
> I1022 18:42:26.855680 138 master.cpp:378] Flags at startup:
> --allocation_interval="1secs" --allocator="HierarchicalDRF"
> --authenticate="false" --authenticate_slaves="false"
> --authenticators="crammd5" --authorizers="local" --framework_sorter="drf"
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true"
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
> --max_slave_ping_timeouts="5" --quiet="false"
> --recovery_slave_removal_limit="100%" --registry="replicated_log"
> --registry_fetch_timeout="1mins" --registry_store_timeout="5secs"
> --registry_strict="false" --root_submissions="true"
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins"
> --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui"
> --work_dir="/tmp/mesos/local/AK0XpG" --zk_session_timeout="10secs"
> I1022 18:42:26.855790 138 master.cpp:425] Master allowing unauthenticated
> frameworks to register
> I1022 18:42:26.855803 138 master.cpp:430] Master allowing unauthenticated
> slaves to register
> I1022 18:42:26.855815 138 master.cpp:467] Using default 'crammd5'
> authenticator
> W1022 18:42:26.855829 138 authenticator.cpp:505] No credentials provided,
> authentication requests will be refused
> I1022 18:42:26.855840 138 authenticator.cpp:512] Initializing server SASL
> I1022 18:42:26.856442 136 containerizer.cpp:143] Using isolation:
> posix/cpu,posix/mem,filesystem/posix
> I1022 18:42:26.856943 140 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 1.888185ms
> I1022 18:42:26.856987 140 replica.cpp:323] Persisted replica status to
> STARTING
> I1022 18:42:26.857115 140 recover.cpp:475] Replica is in STARTING status
> I1022 18:42:26.857270 140 replica.cpp:641] Replica in STARTING status
> received a broadcasted recover request
> I1022 18:42:26.857312 140 recover.cpp:195] Received a recover response from
> a replica in STARTING status
> I1022 18:42:26.857368 140 recover.cpp:566] Updating replica status to VOTING
> I1022 18:42:26.857781 140 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 371121ns
> I1022 18:42:26.857841 140 replica.cpp:323] Persisted replica status to
> VOTING
> I1022 18:42:26.857895 140 recover.cpp:580] Successfully joined the Paxos
> group
> I1022 18:42:26.857928 140 recover.cpp:464] Recover process terminated
> I1022 18:42:26.862455 137 master.cpp:1603] The newly elected leader is
> master@172.17.0.14:5050 with id a3f39818-1bda-4710-b96b-2a60ed4d12b8
> I1022 18:42:26.862498 137 master.cpp:1616] Elected as the leading master!
> I1022 18:42:26.862511 137 master.cpp:1376] Recovering from registrar
> I1022 18:42:26.862560 137 registrar.cpp:309] Recovering registrar
> Failed to create a containerizer: Could not create MesosContainerizer: Failed
> to create launcher: Failed to create Linux launcher: Failed to mount cgroups
> hierarchy at '/sys/fs/cgroup/freezer': 'freezer' is already attached to
> another hierarchy
> {noformat}
> The setup worked with mesos 0.24.0.
> The Dockerfile is here:
> https://github.com/mesosphere/marathon/blob/mv/mesos_0.25/Dockerfile
> {noformat}
> root@a06e4b4eb776:/marathon# ls /sys/fs/cgroup/
> root@a06e4b4eb776:/marathon#
> {noformat}
> {noformat}
> root@a06e4b4eb776:/marathon# cat /proc/mounts
> none / aufs rw,relatime,si=6e7ac87f36042e03,dio,dirperm1 0 0
> proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
> tmpfs /dev tmpfs rw,nosuid,mode=755 0 0
> devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666
> 0 0
> shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k 0 0
> mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
> sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
> /dev/sda1 /etc/resolv.conf ext4 rw,relatime,data=ordered 0 0
> /dev/sda1 /etc/hostname ext4 rw,relatime,data=ordered 0 0
> /dev/sda1 /etc/hosts ext4 rw,relatime,data=ordered 0 0
> devpts /dev/console devpts rw,relatime,mode=600,ptmxmode=000 0 0
> proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
> proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
> tmpfs /proc/kcore tmpfs rw,nosuid,mode=755 0 0
> tmpfs /proc/timer_stats tmpfs rw,nosuid,mode=755 0 0
> {noformat}
> [~bernd-mesos] Can you please assign this to the correct person?