[jira] [Commented] (MESOS-4160) Log recover tests are slow
[ https://issues.apache.org/jira/browse/MESOS-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140992#comment-15140992 ] Shuai Lin commented on MESOS-4160: -- The slowness is because replica1 and replica2 are in {{EMPTY}} status and retry with a random backoff of {{[0.5 sec, 1 sec]}}. Currently the retry interval is hard-coded and not configurable. https://github.com/apache/mesos/blob/0.27.0/src/log/recover.cpp#L328-L339 > Log recover tests are slow > -- > > Key: MESOS-4160 > URL: https://issues.apache.org/jira/browse/MESOS-4160 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Shuai Lin >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > On Mac OS 10.10.4, some tests take longer than {{1s}} to finish: > {code} > RecoverTest.AutoInitialization (1003 ms) > RecoverTest.AutoInitializationRetry (1000 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
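The MESOS-4160 comment above attributes the slowness to a hard-coded random retry interval. As a rough illustration only, the sketch below shows one way such an interval could be made configurable so that tests can pass smaller bounds. The helper name {{randomBackoff}} and its parameters are hypothetical and are not taken from {{recover.cpp}}.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical helper (not part of Mesos): picks a retry delay uniformly
// from the closed interval [min, max]. The defaults mirror the
// 0.5 sec - 1 sec interval mentioned in the comment; a test could pass
// much smaller bounds to finish faster.
std::chrono::milliseconds randomBackoff(
    std::mt19937& rng,
    std::chrono::milliseconds min = std::chrono::milliseconds(500),
    std::chrono::milliseconds max = std::chrono::milliseconds(1000))
{
  assert(min <= max);

  // uniform_int_distribution includes both endpoints.
  std::uniform_int_distribution<long long> dist(min.count(), max.count());
  return std::chrono::milliseconds(dist(rng));
}
```

A test fixture could then call, for example, {{randomBackoff(rng, std::chrono::milliseconds(5), std::chrono::milliseconds(10))}} while production code keeps the defaults.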
[jira] [Commented] (MESOS-4157) Speed up ZooKeeper-related tests
[ https://issues.apache.org/jira/browse/MESOS-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140773#comment-15140773 ] haosdent commented on MESOS-4157: - Most of the ZooKeeper test cases are slow because of [ZOOKEEPER-770|https://issues.apache.org/jira/browse/ZOOKEEPER-770] > Speed up ZooKeeper-related tests > > > Key: MESOS-4157 > URL: https://issues.apache.org/jira/browse/MESOS-4157 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > ZooKeeperTest.Auth (6688 ms) > ZooKeeperTest.Create (6690 ms) > ZooKeeperTest.LeaderContender (3385 ms) > MasterZooKeeperTest.MasterInfoAddress (11282 ms) > ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors (10053 ms) > ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork (3390 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession (3358 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster > (3359 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4635) CoordinatorTest.AppendDiscarded is flaky
Greg Mann created MESOS-4635: Summary: CoordinatorTest.AppendDiscarded is flaky Key: MESOS-4635 URL: https://issues.apache.org/jira/browse/MESOS-4635 Project: Mesos Issue Type: Bug Affects Versions: 0.27.0 Environment: Ubuntu 14.04 with clang Reporter: Greg Mann Just saw this failure on the ASF Jenkins CI: {code} [ RUN ] CoordinatorTest.AppendDiscarded I0210 09:34:39.188288 31550 leveldb.cpp:174] Opened db in 2.043145ms I0210 09:34:39.189136 31550 leveldb.cpp:181] Compacted db in 811003ns I0210 09:34:39.189182 31550 leveldb.cpp:196] Created db iterator in 27506ns I0210 09:34:39.189208 31550 leveldb.cpp:202] Seeked to beginning of db in 10415ns I0210 09:34:39.189224 31550 leveldb.cpp:271] Iterated through 0 keys in the db in 8230ns I0210 09:34:39.189260 31550 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0210 09:34:39.190004 31577 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 471666ns I0210 09:34:39.190028 31577 replica.cpp:320] Persisted replica status to VOTING I0210 09:34:39.192812 31550 leveldb.cpp:174] Opened db in 2.215995ms I0210 09:34:39.193488 31550 leveldb.cpp:181] Compacted db in 660244ns I0210 09:34:39.193528 31550 leveldb.cpp:196] Created db iterator in 23068ns I0210 09:34:39.193554 31550 leveldb.cpp:202] Seeked to beginning of db in 10451ns I0210 09:34:39.193570 31550 leveldb.cpp:271] Iterated through 0 keys in the db in 7996ns I0210 09:34:39.193603 31550 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0210 09:34:39.194510 31569 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 393072ns I0210 09:34:39.194537 31569 replica.cpp:320] Persisted replica status to VOTING I0210 09:34:39.196895 31550 leveldb.cpp:174] Opened db in 1.804552ms I0210 09:34:39.198554 31550 leveldb.cpp:181] Compacted db in 1.642208ms I0210 09:34:39.198593 31550 leveldb.cpp:196] Created db iterator in 19381ns I0210 09:34:39.198633 31550 leveldb.cpp:202] Seeked to 
beginning of db in 35677ns I0210 09:34:39.198673 31550 leveldb.cpp:271] Iterated through 1 keys in the db in 26460ns I0210 09:34:39.198703 31550 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0210 09:34:39.200898 31550 leveldb.cpp:174] Opened db in 2.09532ms I0210 09:34:39.202641 31550 leveldb.cpp:181] Compacted db in 1.7251ms I0210 09:34:39.202697 31550 leveldb.cpp:196] Created db iterator in 39337ns I0210 09:34:39.202836 31550 leveldb.cpp:202] Seeked to beginning of db in 34194ns I0210 09:34:39.202965 31550 leveldb.cpp:271] Iterated through 1 keys in the db in 39383ns I0210 09:34:39.203088 31550 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0210 09:34:39.204413 31573 replica.cpp:493] Replica received implicit promise request from (2636)@172.17.0.2:58132 with proposal 1 I0210 09:34:39.204494 31572 replica.cpp:493] Replica received implicit promise request from (2637)@172.17.0.2:58132 with proposal 1 I0210 09:34:39.204854 31573 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 417201ns I0210 09:34:39.204880 31573 replica.cpp:342] Persisted promised to 1 I0210 09:34:39.205060 31572 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 471800ns I0210 09:34:39.205087 31572 replica.cpp:342] Persisted promised to 1 I0210 09:34:39.205577 31582 coordinator.cpp:238] Coordinator attempting to fill missing positions I0210 09:34:39.206393 31579 replica.cpp:388] Replica received explicit promise request from (2638)@172.17.0.2:58132 for position 0 with proposal 2 I0210 09:34:39.206569 31578 replica.cpp:388] Replica received explicit promise request from (2639)@172.17.0.2:58132 for position 0 with proposal 2 I0210 09:34:39.206840 31579 leveldb.cpp:341] Persisting action (8 bytes) to leveldb took 335263ns I0210 09:34:39.206881 31579 replica.cpp:712] Persisted action at 0 I0210 09:34:39.207236 31578 leveldb.cpp:341] Persisting action (8 bytes) to leveldb took 442481ns 
I0210 09:34:39.207258 31578 replica.cpp:712] Persisted action at 0 I0210 09:34:39.208065 31577 replica.cpp:537] Replica received write request for position 0 from (2640)@172.17.0.2:58132 I0210 09:34:39.208160 31568 replica.cpp:537] Replica received write request for position 0 from (2641)@172.17.0.2:58132 I0210 09:34:39.208206 31568 leveldb.cpp:436] Reading position from leveldb took 67699ns I0210 09:34:39.208117 31577 leveldb.cpp:436] Reading position from leveldb took 225587ns I0210 09:34:39.208647 31568 leveldb.cpp:341] Persisting action (14 bytes) to leveldb took 374594ns I0210 09:34:39.208652 31577 leveldb.cpp:341] Persisting action (14 bytes) to leveldb took 317146ns I0210 09:34:39.208673 31568 replica.cpp:712] Persisted action at 0 I0210 09:34:39.208683 31577 replica.cpp:712] Persisted action at 0 I0210 09:34:39.209205
[jira] [Commented] (MESOS-3833) /help endpoints do not work for nested paths
[ https://issues.apache.org/jira/browse/MESOS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140768#comment-15140768 ] Guangya Liu commented on MESOS-3833: Thanks [~bmahler] and [~vinodkone], the patch has been updated; could you please take another look? Thanks ;-) > /help endpoints do not work for nested paths > > > Key: MESOS-3833 > URL: https://issues.apache.org/jira/browse/MESOS-3833 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Reporter: Anand Mazumdar >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere, newbie > > Mesos displays the list of all supported endpoints starting at a given path > prefix using the {{/help}} suffix, e.g. {{master:5050/help}}. > It seems that the {{help}} functionality is broken for URLs with nested > paths, e.g. {{master:5050/help/master/machine/down}}. The response returned is: > {quote} > Malformed URL, expecting '/help/id/name/' > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4157) Speed up ZooKeeper-related tests
[ https://issues.apache.org/jira/browse/MESOS-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141099#comment-15141099 ] haosdent commented on MESOS-4157: - After applying that patch: {code} [ OK ] ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors (321 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork (3567 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession (3526 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster (3591 ms) [ OK ] MasterZooKeeperTest.MasterInfoAddress (447 ms) [ OK ] ZooKeeperTest.Auth (233 ms) [ OK ] ZooKeeperTest.Create (275 ms) [ OK ] ZooKeeperTest.LeaderContender (7233 ms) {code} > Speed up ZooKeeper-related tests > > > Key: MESOS-4157 > URL: https://issues.apache.org/jira/browse/MESOS-4157 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > ZooKeeperTest.Auth (6688 ms) > ZooKeeperTest.Create (6690 ms) > ZooKeeperTest.LeaderContender (3385 ms) > MasterZooKeeperTest.MasterInfoAddress (11282 ms) > ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors (10053 ms) > ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork (3390 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession (3358 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster > (3359 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4640) Logrotate container logger can die with agent unit on systemd.
[ https://issues.apache.org/jira/browse/MESOS-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4640: Summary: Logrotate container logger can die with agent unit on systemd. (was: Logrotate container logger is associated with agent unit on systemd.) > Logrotate container logger can die with agent unit on systemd. > -- > > Key: MESOS-4640 > URL: https://issues.apache.org/jira/browse/MESOS-4640 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.0 >Reporter: Joris Van Remoortere >Assignee: Joris Van Remoortere > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141205#comment-15141205 ] Kapil Arya commented on MESOS-4370: --- [~travis.hegner]: Can you take a look at the reviews and address them? It's almost there :-). > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest Docker API deprecates the NetworkSettings.IPAddress field in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result, the Mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > container's IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4638) versioning preprocessor macros
James Peach created MESOS-4638: -- Summary: versioning preprocessor macros Key: MESOS-4638 URL: https://issues.apache.org/jira/browse/MESOS-4638 Project: Mesos Issue Type: Bug Components: c++ api Reporter: James Peach The macros in {{version.hpp}} cannot be used for conditional builds because they are strings, not integers. It would be helpful to have integer versions of these for conditionally building code against different versions of the Mesos API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
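To illustrate the request in MESOS-4638 above: the preprocessor can compare integers in {{#if}} but not string literals. The sketch below shows one possible shape for integer version macros. All macro names and the encoding scheme here are made up for the example and are not taken from {{version.hpp}}.

```cpp
// Hypothetical integer macros; the names are illustrative only and do not
// come from Mesos' version.hpp, which (per the report) defines strings.
#define EXAMPLE_MAJOR_VERSION_NUM 0
#define EXAMPLE_MINOR_VERSION_NUM 27
#define EXAMPLE_PATCH_VERSION_NUM 0

// Pack major/minor/patch into one comparable integer. This assumes minor
// and patch each stay below 100.
#define EXAMPLE_VERSION_ENCODE(major, minor, patch) \
  (((major) * 10000) + ((minor) * 100) + (patch))

#define EXAMPLE_VERSION_NUM \
  EXAMPLE_VERSION_ENCODE(EXAMPLE_MAJOR_VERSION_NUM, EXAMPLE_MINOR_VERSION_NUM, EXAMPLE_PATCH_VERSION_NUM)

// Integers can be evaluated in #if; string literals cannot.
#if EXAMPLE_VERSION_NUM >= EXAMPLE_VERSION_ENCODE(0, 27, 0)
constexpr bool kHasNewApi = true;   // Code path for 0.27.0 and later.
#else
constexpr bool kHasNewApi = false;  // Fallback for older versions.
#endif
```

The decimal packing is only one option; any encoding that preserves version ordering would serve the same purpose.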
[jira] [Commented] (MESOS-1778) Provide an option to validate flag value in stout/flags.
[ https://issues.apache.org/jira/browse/MESOS-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141129#comment-15141129 ] Isabel Jimenez commented on MESOS-1778: --- This was added in: https://github.com/apache/mesos/commit/5596233382844da05b82a7769c726a8cdd1bfa17 > Provide an option to validate flag value in stout/flags. > - > > Key: MESOS-1778 > URL: https://issues.apache.org/jira/browse/MESOS-1778 > Project: Mesos > Issue Type: Improvement > Components: stout >Reporter: Alexander Rukletsov >Assignee: Isabel Jimenez >Priority: Minor > > Currently we can provide the default value for a flag, but cannot check if > the flag is set to a reasonable value and, e.g., issue a warning. Passing an > optional lambda checker to {{FlagBase::add()}} can be a possible solution. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
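The description of MESOS-1778 above suggests passing an optional lambda checker when a flag is added. The following is a minimal, self-contained sketch of that idea. {{CheckedFlag}} is a hypothetical stand-in written for this example; it is not the stout {{FlagBase}} API, and the linked commit may implement validation differently.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a flag with an optional validator (NOT the
// stout API). The validator returns an empty string when the candidate
// value is acceptable and an error message otherwise.
template <typename T>
struct CheckedFlag
{
  std::string name;
  T value;
  std::function<std::string(const T&)> validate; // May be left empty.

  // Stores the candidate and returns "" on success. On a validation
  // failure the old value is kept and an error message is returned.
  std::string set(const T& candidate)
  {
    if (validate) {
      const std::string error = validate(candidate);
      if (!error.empty()) {
        return name + ": " + error;
      }
    }

    value = candidate;
    return "";
  }
};
```

A caller could construct {{CheckedFlag<int>}} with a lambda that rejects non-positive values and then report the returned message as a warning or a hard error, depending on policy.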
[jira] [Created] (MESOS-4639) Posix process executor is associated with agent unit on systemd.
Joris Van Remoortere created MESOS-4639: --- Summary: Posix process executor is associated with agent unit on systemd. Key: MESOS-4639 URL: https://issues.apache.org/jira/browse/MESOS-4639 Project: Mesos Issue Type: Bug Affects Versions: 0.25.0, 0.26.0, 0.27.0 Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4640) Logrotate container logger is associated with agent unit on systemd.
Joris Van Remoortere created MESOS-4640: --- Summary: Logrotate container logger is associated with agent unit on systemd. Key: MESOS-4640 URL: https://issues.apache.org/jira/browse/MESOS-4640 Project: Mesos Issue Type: Bug Affects Versions: 0.27.0 Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4615) ContainerLoggerTest.DefaultToSandbox is flaky
[ https://issues.apache.org/jira/browse/MESOS-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-4615: - Labels: flaky-test logger mesosphere (was: flaky-test mesosphere) > ContainerLoggerTest.DefaultToSandbox is flaky > - > > Key: MESOS-4615 > URL: https://issues.apache.org/jira/browse/MESOS-4615 > Project: Mesos > Issue Type: Bug > Components: tests >Affects Versions: 0.27.0 > Environment: CentOS 7, gcc, libevent & SSL enabled >Reporter: Greg Mann >Assignee: Joseph Wu > Labels: flaky-test, logger, mesosphere > > Just saw this failure on the ASF CI: > {code} > [ RUN ] ContainerLoggerTest.DefaultToSandbox > I0206 01:25:03.766458 2824 leveldb.cpp:174] Opened db in 72.979786ms > I0206 01:25:03.811712 2824 leveldb.cpp:181] Compacted db in 45.162067ms > I0206 01:25:03.811810 2824 leveldb.cpp:196] Created db iterator in 26090ns > I0206 01:25:03.811828 2824 leveldb.cpp:202] Seeked to beginning of db in > 3173ns > I0206 01:25:03.811839 2824 leveldb.cpp:271] Iterated through 0 keys in the > db in 497ns > I0206 01:25:03.811900 2824 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0206 01:25:03.812785 2849 recover.cpp:447] Starting replica recovery > I0206 01:25:03.813043 2849 recover.cpp:473] Replica is in EMPTY status > I0206 01:25:03.814668 2854 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (371)@172.17.0.8:37843 > I0206 01:25:03.815210 2849 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0206 01:25:03.815732 2854 recover.cpp:564] Updating replica status to > STARTING > I0206 01:25:03.819664 2857 master.cpp:376] Master > 914b62f9-95f6-4c57-a7e3-9b06e2c1c8de (74ef606c4063) started on > 172.17.0.8:37843 > I0206 01:25:03.819703 2857 master.cpp:378] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" 
--authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/h5vu5I/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" > --work_dir="/tmp/h5vu5I/master" --zk_session_timeout="10secs" > I0206 01:25:03.820241 2857 master.cpp:423] Master only allowing > authenticated frameworks to register > I0206 01:25:03.820257 2857 master.cpp:428] Master only allowing > authenticated slaves to register > I0206 01:25:03.820269 2857 credentials.hpp:35] Loading credentials for > authentication from '/tmp/h5vu5I/credentials' > I0206 01:25:03.821110 2857 master.cpp:468] Using default 'crammd5' > authenticator > I0206 01:25:03.821311 2857 master.cpp:537] Using default 'basic' HTTP > authenticator > I0206 01:25:03.821636 2857 master.cpp:571] Authorization enabled > I0206 01:25:03.821979 2846 hierarchical.cpp:144] Initialized hierarchical > allocator process > I0206 01:25:03.822057 2846 whitelist_watcher.cpp:77] No whitelist given > I0206 01:25:03.825460 2847 master.cpp:1712] The newly elected leader is > master@172.17.0.8:37843 with id 914b62f9-95f6-4c57-a7e3-9b06e2c1c8de > I0206 01:25:03.825512 2847 master.cpp:1725] Elected as the leading master! 
> I0206 01:25:03.825533 2847 master.cpp:1470] Recovering from registrar > I0206 01:25:03.825835 2847 registrar.cpp:307] Recovering registrar > I0206 01:25:03.848212 2854 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 32.226093ms > I0206 01:25:03.848299 2854 replica.cpp:320] Persisted replica status to > STARTING > I0206 01:25:03.848702 2854 recover.cpp:473] Replica is in STARTING status > I0206 01:25:03.850728 2858 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (373)@172.17.0.8:37843 > I0206 01:25:03.851230 2854 recover.cpp:193] Received a recover response from > a replica in STARTING status > I0206 01:25:03.852018 2854 recover.cpp:564] Updating replica status to VOTING > I0206 01:25:03.881681 2854 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 29.184163ms > I0206 01:25:03.881772 2854 replica.cpp:320] Persisted replica status to
[jira] [Updated] (MESOS-4637) Docker process executor can die with agent unit on systemd.
[ https://issues.apache.org/jira/browse/MESOS-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4637: Summary: Docker process executor can die with agent unit on systemd. (was: Docker process executor is associated with agent unit on systemd.) > Docker process executor can die with agent unit on systemd. > --- > > Key: MESOS-4637 > URL: https://issues.apache.org/jira/browse/MESOS-4637 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 >Reporter: Joris Van Remoortere >Assignee: Joris Van Remoortere > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4639) Posix process executor can die with agent unit on systemd.
[ https://issues.apache.org/jira/browse/MESOS-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4639: Summary: Posix process executor can die with agent unit on systemd. (was: Posix process executor is associated with agent unit on systemd.) > Posix process executor can die with agent unit on systemd. > -- > > Key: MESOS-4639 > URL: https://issues.apache.org/jira/browse/MESOS-4639 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.25.0, 0.26.0, 0.27.0 >Reporter: Joris Van Remoortere >Assignee: Joris Van Remoortere > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4636) Add parent hook to subprocess.
Joris Van Remoortere created MESOS-4636: --- Summary: Add parent hook to subprocess. Key: MESOS-4636 URL: https://issues.apache.org/jira/browse/MESOS-4636 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4637) Docker process executor is associated with agent unit on systemd.
Joris Van Remoortere created MESOS-4637: --- Summary: Docker process executor is associated with agent unit on systemd. Key: MESOS-4637 URL: https://issues.apache.org/jira/browse/MESOS-4637 Project: Mesos Issue Type: Bug Components: docker Affects Versions: 0.25.0, 0.26.0, 0.27.0 Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4358) Expose net_cls network handles in agent's state endpoint
[ https://issues.apache.org/jira/browse/MESOS-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4358: -- Sprint: Mesosphere Sprint 28 > Expose net_cls network handles in agent's state endpoint > > > Key: MESOS-4358 > URL: https://issues.apache.org/jira/browse/MESOS-4358 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > Labels: container, containerizer, mesosphere > Fix For: 0.28.0 > > > We need to expose net_cls network handles, associated with containers, to > operators and network utilities that would use these network handles to > enforce network policy. > In order to achieve the above we need to add a new field in the `NetworkInfo` > protobuf (say NetHandles) and update this field when a container gets > assigned to a net_cls cgroup. The `ContainerStatus` protobuf already has the > `NetworkInfo` protobuf as a nested message, and the `ContainerStatus` itself > is exposed to operators as part of TaskInfo (for tasks associated with the > container) in an agent's state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
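For readers unfamiliar with the handles discussed in MESOS-4358 above: a net_cls handle consists of a 16-bit major part and a 16-bit minor part, and the Linux {{net_cls.classid}} file represents the pair as one 32-bit value of the form {{0xAAAABBBB}} (major in the upper half, minor in the lower half). The helper names below are invented for this sketch and do not come from the Mesos code base.

```cpp
#include <cstdint>

// Combine 16-bit major and minor handles into the 32-bit classid layout
// 0xAAAABBBB (AAAA = major, BBBB = minor). Helper names are hypothetical.
constexpr std::uint32_t toClassid(std::uint16_t major, std::uint16_t minor)
{
  return (static_cast<std::uint32_t>(major) << 16) | minor;
}

// Extract the major handle (upper 16 bits).
constexpr std::uint16_t majorOf(std::uint32_t classid)
{
  return static_cast<std::uint16_t>(classid >> 16);
}

// Extract the minor handle (lower 16 bits).
constexpr std::uint16_t minorOf(std::uint32_t classid)
{
  return static_cast<std::uint16_t>(classid & 0xFFFF);
}
```

With this layout, the handle written as 10:1 in tc notation (hexadecimal major 0x10, minor 0x1) corresponds to the classid {{0x00100001}}.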
[jira] [Created] (MESOS-4641) Support Container Network Interface (CNI).
Jie Yu created MESOS-4641: - Summary: Support Container Network Interface (CNI). Key: MESOS-4641 URL: https://issues.apache.org/jira/browse/MESOS-4641 Project: Mesos Issue Type: Epic Reporter: Jie Yu CoreOS developed the Container Network Interface (CNI), a proposed standard for configuring network interfaces for Linux containers. Many CNI plugins (e.g., calico) have already been developed. https://coreos.com/blog/rkt-cni-networking.html https://github.com/appc/cni/blob/master/SPEC.md Also, Kubernetes claimed that they'll support CNI as well. http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html In the context of Unified Containerizer, it would be nice if we can have a 'network/cni' isolator which will speak the CNI protocol and prepare the network for the container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3007) Support systemd with Mesos containerizer
[ https://issues.apache.org/jira/browse/MESOS-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141386#comment-15141386 ] Joris Van Remoortere commented on MESOS-3007: - [~dennyhina] Can you provide context / logs for why the systemd existence check passes, yet {{systemctl}} is not available or failed? How are you identifying whether systemd is running? > Support systemd with Mesos containerizer > > > Key: MESOS-3007 > URL: https://issues.apache.org/jira/browse/MESOS-3007 > Project: Mesos > Issue Type: Epic >Reporter: Artem Harutyunyan >Assignee: Joris Van Remoortere > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3486) Use DROP_PROTOBUF instead of DROP_MESSAGE in tests
[ https://issues.apache.org/jira/browse/MESOS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141441#comment-15141441 ] Michael Browning commented on MESOS-3486: - This seems pretty straightforward -- `DROP_MESSAGE(S)` and `FUTURE_MESSAGE` are macros defined in `process/gmock.hpp`, so I'll update the definitions and then any further invocations of those macros in the test suite. > Use DROP_PROTOBUF instead of DROP_MESSAGE in tests > -- > > Key: MESOS-3486 > URL: https://issues.apache.org/jira/browse/MESOS-3486 > Project: Mesos > Issue Type: Improvement >Reporter: Neil Conway >Assignee: Michael Browning >Priority: Trivial > Labels: mesosphere, newbie, tests > > The tests use DROP_MESSAGE(), DROP_MESSAGES(), and FUTURE_MESSAGE() in > various places where it would be more clear and concise to use > DROP_PROTOBUF(), DROP_PROTOBUFS(), and FUTURE_PROTOBUF() instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3486) Use DROP_PROTOBUF instead of DROP_MESSAGE in tests
[ https://issues.apache.org/jira/browse/MESOS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Browning reassigned MESOS-3486: --- Assignee: Michael Browning > Use DROP_PROTOBUF instead of DROP_MESSAGE in tests > -- > > Key: MESOS-3486 > URL: https://issues.apache.org/jira/browse/MESOS-3486 > Project: Mesos > Issue Type: Improvement >Reporter: Neil Conway >Assignee: Michael Browning >Priority: Trivial > Labels: mesosphere, newbie, tests > > The tests use DROP_MESSAGE(), DROP_MESSAGES(), and FUTURE_MESSAGE() in > various places where it would be more clear and concise to use > DROP_PROTOBUF(), DROP_PROTOBUFS(), and FUTURE_PROTOBUF() instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON
Steven Schlansker created MESOS-4642: Summary: Mesos Agent Json API can dump binary data from log files out as invalid JSON Key: MESOS-4642 URL: https://issues.apache.org/jira/browse/MESOS-4642 Project: Mesos Issue Type: Bug Components: json api, slave Affects Versions: 0.27.0 Reporter: Steven Schlansker Priority: Critical One of our tasks accidentally started logging binary data to stderr. This was not intentional and generally should not happen -- however, it causes severe problems with the Mesos Agent "files/read.json" API, since it gladly dumps this binary data out as invalid JSON. {code} # hexdump -C /path/to/task/stderr | tail 0003d1f0 6f 6e 6e 65 63 74 69 6f 6e 0a 4e 45 54 3a 20 31 |onnection.NET: 1| 0003d200 20 6f 6e 72 65 61 64 20 45 4e 4f 45 4e 54 20 32 | onread ENOENT 2| 0003d210 39 35 34 35 36 20 32 35 31 20 32 39 35 37 30 37 |95456 251 295707| 0003d220 0a 01 00 00 00 00 00 00 ac 57 65 64 2c 20 31 30 |.Wed, 10| 0003d230 20 55 6e 72 65 63 6f 67 6e 69 7a 65 64 20 69 6e | Unrecognized in| 0003d240 70 75 74 20 68 65 61 64 65 72 0a |put header.| {code} {code} # curl 'http://agent-host:5051/files/read.json?path=/path/to/task/stderr=220443=9=' | hexdump -C 7970 6e 65 63 74 69 6f 6e 5c 6e 4e 45 54 3a 20 31 20 |nection\nNET: 1 | 7980 6f 6e 72 65 61 64 20 45 4e 4f 45 4e 54 20 32 39 |onread ENOENT 29| 7990 35 34 35 36 20 32 35 31 20 32 39 35 37 30 37 5c |5456 251 295707\| 79a0 6e 5c 75 30 30 30 31 5c 75 30 30 30 30 5c 75 30 |n\u0001\u\u0| 79b0 30 30 30 5c 75 30 30 30 30 5c 75 30 30 30 30 5c |000\u\u\| 79c0 75 30 30 30 30 5c 75 30 30 30 30 ac 57 65 64 2c |u\u.Wed,| 79d0 20 31 30 20 55 6e 72 65 63 6f 67 6e 69 7a 65 64 | 10 Unrecognized| 79e0 20 69 6e 70 75 74 20 68 65 61 64 65 72 5c 6e 22 | input header\n"| 79f0 2c 22 6f 66 66 73 65 74 22 3a 32 32 30 34 34 33 |,"offset":220443| 7a00 7d|}| {code} This causes downstream sadness: {code} ERROR [2016-02-10 18:55:12,303] io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 0ee749630f8b26f1 ! 
com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xac ! at [Source: org.jboss.netty.buffer.ChannelBufferInputStream@6d69ee8; line: 1, column: 31181] ! at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3339) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2360) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2287) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:286) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:29) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:12) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:523) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:381) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1073) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.module.afterburner.deser.SuperSonicBeanDeserializer.deserializeFromObject(SuperSonicBeanDeserializer.java:196) ~[singularity-0.4.9.jar:0.4.9] ! 
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:142) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.module.afterburner.deser.SuperSonicBeanDeserializer.deserialize(SuperSonicBeanDeserializer.java:117) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3562) ~[singularity-0.4.9.jar:0.4.9] ! at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2648) ~[singularity-0.4.9.jar:0.4.9] ! at com.hubspot.singularity.data.SandboxManager.read(SandboxManager.java:97) ~[singularity-0.4.9.jar:0.4.9] {code}
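The MESOS-4642 report above shows a raw {{0xac}} byte reaching the client inside a JSON string. One simple way to keep the output parseable is sketched below, as an assumption rather than as the fix Mesos adopted: every byte outside printable ASCII is emitted as a {{\u00XX}} escape. This treats each byte as a Latin-1 code point, so it is lossy for log files that contain valid multi-byte UTF-8 text. The function name is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not the Mesos implementation): escapes a raw byte
// string so it can be placed between double quotes in a JSON document.
// Bytes outside printable ASCII become \u00XX, i.e. each byte is treated
// as a Latin-1 code point. The result is always ASCII, hence valid UTF-8.
std::string escapeForJson(const std::string& raw)
{
  std::string out;
  for (unsigned char c : raw) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n"; break;
      case '\r': out += "\\r"; break;
      case '\t': out += "\\t"; break;
      default:
        if (c < 0x20 || c >= 0x7f) {
          char buf[7]; // Backslash, 'u', four hex digits, terminating NUL.
          std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}
```

An alternative design would validate the input as UTF-8 and replace only the invalid sequences, which preserves legitimate non-ASCII log text at the cost of a more involved decoder.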
[jira] [Commented] (MESOS-4631) Document how to use custom authentication modules
[ https://issues.apache.org/jira/browse/MESOS-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141378#comment-15141378 ] Disha Singh commented on MESOS-4631: I would like to take up this issue. Could anyone be the shepherd for this? Thanks. > Document how to use custom authentication modules > - > > Key: MESOS-4631 > URL: https://issues.apache.org/jira/browse/MESOS-4631 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Neil Conway >Priority: Minor > Labels: authentication, documentation, mesosphere > > The authentication doc page talks about custom authentication modules a bit, > but doesn't give enough information. For example: > * What interface does a custom authentication module need to satisfy? > * Can multiple authentication modules be used? > * How do I implement a framework that authenticates with a master that uses a > non-default authentication module, e.g., one that doesn't use credentials? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4343) Introduce the ability to assign network handles to mesos containers
[ https://issues.apache.org/jira/browse/MESOS-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Sridharan updated MESOS-4343: - Description: Linux provides net_cls as a cgroup subsystem. A net_cls cgroup is associated with a 16-bit major handle and a 16-bit minor handle. When a task is associated with a net_cls cgroup, the kernel tags every packet being generated by the task with the major and minor handle associated with the net_cls cgroup. These tags are then used by network performance shaping and firewall tools such as tc (traffic controller) and iptables. Currently, mesos agents do not provide any isolator that can enable mesos-containers in a net_cls cgroup, or assign network handles to a net_cls cgroup. As part of this epic we plan to achieve the following: a) Implement net_cls cgroup isolator for mesos agents. b) Implement a manager for the net_cls handles. c) Allow operators to set a major network handle when launching an agent. d) Expose the net_cls network handle allocated to a container, to entities such as operators and frameworks. Once the above goals are met operators can learn about network handles allocated to containers and apply them to tools such as tc and iptables to enforce network policies. was: Linux provides net_cls as a cgroup subsystem. A net_cls cgroup is associated with a 16-bit major handle and a 16-bit minor handle. When a task is associated with a net_cls cgroup, the kernel tags every packet being generated by the task with the major and minor handle associated with the net_cls cgroup that the task belongs too. These tags are then used by network performance shaping and firewall tools such as tc (traffic controller) and iptables. Currently, mesos agents do not provide any isolator that can enable mesos-containers in a net_cls cgroup, or assign network handles to a net_cls cgroup. As part of this epic we plan to achieve the following: a) Implement net_cls cgroup isolator for mesos agents. 
b) Implement an net-handles allocator class that can manage. c) Allow operators to set a major network handle when launching an agent. d) Expose the net_cls network handle allocated to a container, to entities such as operators and frameworks. Once the above goals are met operators can learn about network handles allocated to containers and apply them to tools such as tc and iptables to enforce network policies. > Introduce the ability to assign network handles to mesos containers > --- > > Key: MESOS-4343 > URL: https://issues.apache.org/jira/browse/MESOS-4343 > Project: Mesos > Issue Type: Epic > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > Labels: containers, mesosphere > Fix For: 0.28.0 > > > Linux provides net_cls as a cgroup subsystem. A net_cls cgroup is associated > with a 16-bit major handle and a 16-bit minor handle. When a task is > associated with a net_cls cgroup, the kernel tags every packet being > generated by the task with the major and minor handle associated with the > net_cls cgroup. These tags are then used by network performance shaping and > firewall tools such as tc (traffic controller) and iptables. > Currently, mesos agents do not provide any isolator that can enable > mesos-containers in a net_cls cgroup, or assign network handles to a net_cls > cgroup. As part of this epic we plan to achieve the following: > a) Implement net_cls cgroup isolator for mesos agents. > b) Implement a manager for the net_cls handles. > c) Allow operators to set a major network handle when launching an agent. > d) Expose the net_cls network handle allocated to a container, to entities > such as operators and frameworks. > Once the above goals are met operators can learn about network handles > allocated to containers and apply them to tools such as tc and iptables to > enforce network policies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
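The 16-bit major and 16-bit minor handles described above are combined by the kernel into one 32-bit class ID (the value written to `net_cls.classid`), which tools such as tc display as `major:minor` in hexadecimal. A minimal sketch of that packing, assuming the kernel's documented `0xAAAABBBB` layout; the function names are hypothetical and this is not the Mesos handle manager:

```python
def make_classid(major: int, minor: int) -> int:
    """Pack a 16-bit major and a 16-bit minor handle into a 32-bit class ID."""
    for name, value in (("major", major), ("minor", minor)):
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{name} handle must fit in 16 bits, got {value}")
    return (major << 16) | minor


def split_classid(classid: int):
    """Return the (major, minor) pair encoded in a 32-bit class ID."""
    if not 0 <= classid <= 0xFFFFFFFF:
        raise ValueError("class ID must fit in 32 bits")
    return classid >> 16, classid & 0xFFFF


def format_handle(classid: int) -> str:
    """Render a class ID in the hexadecimal 'major:minor' form used by tc."""
    major, minor = split_classid(classid)
    return f"{major:x}:{minor:x}"
```

Under goal c), an operator would fix the major handle per agent, which leaves the 16-bit minor space for a handle manager to assign to individual containers.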
[jira] [Updated] (MESOS-4633) Tests will dereference stack allocated agent objects upon assertion/expectation failure.
[ https://issues.apache.org/jira/browse/MESOS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-4633: - Shepherd: Bernd Mathiske Labels: flaky mesosphere tech-debt test (was: flaky mesosphere test) > Tests will dereference stack allocated agent objects upon > assertion/expectation failure. > > > Key: MESOS-4633 > URL: https://issues.apache.org/jira/browse/MESOS-4633 > Project: Mesos > Issue Type: Bug >Reporter: Joseph Wu >Assignee: Joseph Wu > Labels: flaky, mesosphere, tech-debt, test > > Tests that use the {{StartSlave}} test helper are generally fragile when the > test fails an assert/expect in the middle of the test. This is because the > {{StartSlave}} helper takes raw pointer arguments, which may be > stack-allocated. > In case of an assert failure, the test immediately exits (destroying stack > allocated objects) and proceeds onto test cleanup. The test cleanup may > dereference some of these destroyed objects, leading to a test crash like: > {code} > [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure > virtual method called > [18:27:36][Step 8/8] @ 0x7f7077055e1c google::LogMessage::Fail() > [18:27:36][Step 8/8] @ 0x7f707705ba6f google::RawLog__() > [18:27:36][Step 8/8] @ 0x7f70760f76c9 __cxa_pure_virtual > [18:27:36][Step 8/8] @ 0xa9423c > mesos::internal::tests::Cluster::Slaves::shutdown() > [18:27:36][Step 8/8] @ 0x1074e45 > mesos::internal::tests::MesosTest::ShutdownSlaves() > [18:27:36][Step 8/8] @ 0x1074de4 > mesos::internal::tests::MesosTest::Shutdown() > [18:27:36][Step 8/8] @ 0x1070ec7 > mesos::internal::tests::MesosTest::TearDown() > {code} > The {{StartSlave}} helper should take {{shared_ptr}} arguments instead. > This also means that we can remove the {{Shutdown}} helper from most of these > tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4446) Set Docker labels based on TaskInfo labels.
[ https://issues.apache.org/jira/browse/MESOS-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141647#comment-15141647 ] Martin Evgeniev commented on MESOS-4446: Many thanks for the info [~gyliu]. I'll open a JIRA to request some docs. > Set Docker labels based on TaskInfo labels. > --- > > Key: MESOS-4446 > URL: https://issues.apache.org/jira/browse/MESOS-4446 > Project: Mesos > Issue Type: Story > Components: docker >Reporter: Gennady Feldman >Assignee: Abhishek Dasgupta > > So looks like MESOS-3076 added support for Labels to TaskStatus. Would it be > possible to pass those onto the docker container? > This would really help with doing "docker inspect" on the mesos-slave nodes > as well as allow us to better collect docker metrics about the > tasks/containers that are currently running on the slave. > docker supports labels out of the box. See here: > https://docs.docker.com/engine/userguide/labels-custom-metadata/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4636) Add parent hook to subprocess.
[ https://issues.apache.org/jira/browse/MESOS-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141883#comment-15141883 ] Joseph Wu commented on MESOS-4636: -- Fix for root tests on systemd platforms: https://reviews.apache.org/r/43432/ > Add parent hook to subprocess. > -- > > Key: MESOS-4636 > URL: https://issues.apache.org/jira/browse/MESOS-4636 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Joris Van Remoortere >Assignee: Joris Van Remoortere > Labels: mesosphere > Fix For: 0.27.1, 0.28.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4157) Speed up ZooKeeper-related tests
[ https://issues.apache.org/jira/browse/MESOS-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141298#comment-15141298 ] haosdent commented on MESOS-4157: - The remaining long-running tests are slow because they wait for a session to expire. We could advance the clock here. {code} [ OK ] ZooKeeperTest.LeaderContender (7233 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork (3567 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession (3526 ms) [ OK ] ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster (3591 ms) {code} > Speed up ZooKeeper-related tests > > > Key: MESOS-4157 > URL: https://issues.apache.org/jira/browse/MESOS-4157 > Project: Mesos > Issue Type: Epic > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > Execution times on Mac OS 10.10.4: > {code} > ZooKeeperTest.Auth (6688 ms) > ZooKeeperTest.Create (6690 ms) > ZooKeeperTest.LeaderContender (3385 ms) > MasterZooKeeperTest.MasterInfoAddress (11282 ms) > ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors (10053 ms) > ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork (3390 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession (3358 > ms) > ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster > (3359 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
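The idea in the comment above, letting a test move time forward instead of sleeping until a session expires, can be illustrated with a small fake clock. This is only a language-neutral sketch: `FakeClock` and `Session` are hypothetical names, and it does not model how a real ZooKeeper server decides that a session has expired.

```python
class FakeClock:
    """A clock whose time only moves when the test says so."""

    def __init__(self) -> None:
        self._now = 0

    def now(self) -> int:
        return self._now

    def advance(self, seconds: int) -> None:
        if seconds < 0:
            raise ValueError("cannot move the clock backwards")
        self._now += seconds


class Session:
    """A session that expires when no heartbeat arrives within the timeout."""

    def __init__(self, clock: FakeClock, timeout_seconds: int) -> None:
        self._clock = clock
        self._timeout = timeout_seconds
        self._last_heartbeat = clock.now()

    def heartbeat(self) -> None:
        self._last_heartbeat = self._clock.now()

    def expired(self) -> bool:
        return self._clock.now() - self._last_heartbeat >= self._timeout
```

With this shape a test reaches the expired state through one `advance` call, so its running time no longer depends on the length of the timeout.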
[jira] [Commented] (MESOS-4542) MasterQuotaTest.AvailableResourcesAfterRescinding is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142051#comment-15142051 ] Michael Park commented on MESOS-4542: - {noformat} commit 84c6b714dd5df836a7943493562683ed41f7f396 Author: Alexander Rukletsov Date: Wed Feb 10 13:38:29 2016 -0800 Fixed a flaky test in quota tests. The `AvailableResourcesAfterRescinding` test became flaky after we stopped offering unreserved resources beyond quota in https://reviews.apache.org/r/42835. Hence the allocator offers rescinded resources to `framework1` if an allocation happens before the test finishes, which violates the expectation that `framework1` receives resources only once. Since we do not really care about allocations in this test but rather about rescinded resources, the fix is just to ignore subsequent offers to `framework1`. Review: https://reviews.apache.org/r/42908/ {noformat} {noformat} commit 56f7e011e925a0e96ae4d9f5e3641422d273624e Author: Alexander Rukletsov Date: Wed Feb 10 13:34:02 2016 -0800 Added missing test finalization. Review: https://reviews.apache.org/r/43422/ {noformat} > MasterQuotaTest.AvailableResourcesAfterRescinding is flaky. > --- > > Key: MESOS-4542 > URL: https://issues.apache.org/jira/browse/MESOS-4542 > Project: Mesos > Issue Type: Bug > Components: allocation, master, test >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: flaky-test, mesosphere > Fix For: 0.28.0 > > > Can be reproduced by running {{GLOG_v=1 > GTEST_FILTER="MasterQuotaTest.AvailableResourcesAfterRescinding" > ./bin/mesos-tests.sh --gtest_shuffle --gtest_break_on_failure > --gtest_repeat=1000 --verbose}}. > h5. 
Verbose log from a bad run: > {code} > [ RUN ] MasterQuotaTest.AvailableResourcesAfterRescinding > I0128 12:20:27.568657 2080858880 resources.cpp:564] Parsing resources as JSON > failed: cpus:2;mem:1024;disk:1024;ports:[31000-32000] > Trying semicolon-delimited string format instead > I0128 12:20:27.570142 2080858880 resources.cpp:564] Parsing resources as JSON > failed: cpus:2;mem:1024;disk:1024;ports:[31000-32000] > Trying semicolon-delimited string format instead > I0128 12:20:27.583225 2080858880 leveldb.cpp:174] Opened db in 6241us > I0128 12:20:27.584353 2080858880 leveldb.cpp:181] Compacted db in 1026us > I0128 12:20:27.584429 2080858880 leveldb.cpp:196] Created db iterator in 12us > I0128 12:20:27.584442 2080858880 leveldb.cpp:202] Seeked to beginning of db > in 7us > I0128 12:20:27.584453 2080858880 leveldb.cpp:271] Iterated through 0 keys in > the db in 6us > I0128 12:20:27.584475 2080858880 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0128 12:20:27.584918 300445696 recover.cpp:447] Starting replica recovery > I0128 12:20:27.585113 300445696 recover.cpp:473] Replica is in EMPTY status > I0128 12:20:27.585916 297226240 replica.cpp:673] Replica in EMPTY status > received a broadcasted recover request from (18274)@192.168.178.24:51278 > I0128 12:20:27.586086 297762816 recover.cpp:193] Received a recover response > from a replica in EMPTY status > I0128 12:20:27.586449 297226240 recover.cpp:564] Updating replica status to > STARTING > I0128 12:20:27.587204 300445696 leveldb.cpp:304] Persisting metadata (8 > bytes) to leveldb took 624us > I0128 12:20:27.587242 300445696 replica.cpp:320] Persisted replica status to > STARTING > I0128 12:20:27.587376 299372544 recover.cpp:473] Replica is in STARTING status > I0128 12:20:27.588050 300982272 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (18275)@192.168.178.24:51278 > I0128 12:20:27.588235 300445696 recover.cpp:193] 
Received a recover response > from a replica in STARTING status > I0128 12:20:27.588572 297762816 recover.cpp:564] Updating replica status to > VOTING > I0128 12:20:27.588850 297226240 leveldb.cpp:304] Persisting metadata (8 > bytes) to leveldb took 140us > I0128 12:20:27.588879 297226240 replica.cpp:320] Persisted replica status to > VOTING > I0128 12:20:27.588975 299909120 recover.cpp:578] Successfully joined the > Paxos group > I0128 12:20:27.589154 299909120 recover.cpp:462] Recover process terminated > I0128 12:20:27.599486 298835968 master.cpp:374] Master > 531344bd-56f4-4e4f-8f6f-a6a9d36058c7 (alexr.fritz.box) started on > 192.168.178.24:51278 > I0128 12:20:27.599520 298835968 master.cpp:376] Flags at startup: --acls="" > --allocation_interval="50ms" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/private/tmp/NlzPSo/credentials"
[jira] [Assigned] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.
[ https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cong Wang reassigned MESOS-4646: Assignee: Cong Wang > PortMappingIsolatorTests get kernel stuck. > -- > > Key: MESOS-4646 > URL: https://issues.apache.org/jira/browse/MESOS-4646 > Project: Mesos > Issue Type: Bug > Environment: Linux Kernel 3.19.9-49-generic, > libnl-3.2.27 >Reporter: Till Toenshoff >Assignee: Cong Wang > > {noformat} > $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*" > Source directory: /home/till/scratchpad/mesos > Build directory: /home/till/scratchpad/mesos/build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, > /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, > /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, > /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd > We'll disable the CgroupsNoHierarchyTest test fixture for now. > - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > No 'perf' command found so no 'perf' tests will be run > - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > The 'perf' command wasn't found so tests using it > to sample the 'cycles' hardware event will not be run. 
> - > /bin/nc > /usr/local/bin/curl > Note: Google Test filter = >
[jira] [Commented] (MESOS-4607) Docker image create should not return any error with env var
[ https://issues.apache.org/jira/browse/MESOS-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142292#comment-15142292 ] Guangya Liu commented on MESOS-4607: [~jieyu] can you please help shepherd this? Thanks. > Docker image create should not return any error with env var > > > Key: MESOS-4607 > URL: https://issues.apache.org/jira/browse/MESOS-4607 > Project: Mesos > Issue Type: Bug > Components: docker >Reporter: Gilbert Song >Assignee: Guangya Liu >Priority: Minor > > In the docker image create path, the entrypoint and environment variables are > read from docker inspect. An error should not be returned on finding a > malformed env var, since that may block the docker containerizer. > Specifically, we may want to just `LOG(WARNING)` for those unexpected env vars > (please see > https://github.com/apache/mesos/blob/master/src/docker/docker.cpp#L388~#L395). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
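The behavior requested above, warning about an unexpected environment entry instead of failing the whole operation, might look roughly like the following. This is a sketch only: `parse_env` and the logger name are hypothetical, the `NAME=value` entry format is an assumption about what docker inspect reports, and the real code in `docker.cpp` is C++ and is not reproduced here.

```python
import logging

logger = logging.getLogger("docker_env_sketch")


def parse_env(entries):
    """Turn 'NAME=value' strings into a dict, skipping malformed entries.

    A malformed entry (no '=' separator or an empty name) is logged as a
    warning and ignored rather than reported as an error to the caller.
    """
    env = {}
    for entry in entries:
        name, separator, value = entry.partition("=")
        if not separator or not name:
            logger.warning("Skipping malformed environment entry: %r", entry)
            continue
        # Only the first '=' separates name and value, so the value itself
        # may contain further '=' characters.
        env[name] = value
    return env
```

The trade-off is that a bad entry is dropped silently from the caller's point of view, so the warning has to carry enough detail for an operator to find it.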
[jira] [Commented] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142278#comment-15142278 ] Mike Spreitzer commented on MESOS-4641: --- The Kuryr project in OpenStack is looking at this too. One approach they are considering is creating a CNI plugin that invokes the Docker CLI. I think this is the right approach, for those of us that want to use Neutron, and am prototyping it myself. > Support Container Network Interface (CNI). > -- > > Key: MESOS-4641 > URL: https://issues.apache.org/jira/browse/MESOS-4641 > Project: Mesos > Issue Type: Epic >Reporter: Jie Yu >Assignee: Qian Zhang > > CoreOS developed the Container Network Interface (CNI), a proposed standard > for configuring network interfaces for Linux containers. Many CNI plugins > (e.g., calico) have already been developed. > https://coreos.com/blog/rkt-cni-networking.html > https://github.com/appc/cni/blob/master/SPEC.md > Kubernetes supports CNI as well. > http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html > In the context of Unified Containerizer, it would be nice if we can have a > 'network/cni' isolator which will speak the CNI protocol and prepare the > network for the container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4641: -- Description: CoreOS developed the Container Network Interface (CNI), a proposed standard for configuring network interfaces for Linux containers. Many CNI plugins (e.g., calico) have already been developed. https://coreos.com/blog/rkt-cni-networking.html https://github.com/appc/cni/blob/master/SPEC.md Kubernetes supports CNI as well. http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html In the context of Unified Containerizer, it would be nice if we can have a 'network/cni' isolator which will speak the CNI protocol and prepare the network for the container. was: CoreOS developed the Container Network Interface (CNI), a proposed standard for configuring network interfaces for Linux containers. Many CNI plugins (e.g., calico) have already been developed. https://coreos.com/blog/rkt-cni-networking.html https://github.com/appc/cni/blob/master/SPEC.md Also, Kubernetes claimed that they'll support CNI as well. http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html In the context of Unified Containerizer, it would be nice if we can have a 'network/cni' isolator which will speak the CNI protocol and prepare the network for the container. > Support Container Network Interface (CNI). > -- > > Key: MESOS-4641 > URL: https://issues.apache.org/jira/browse/MESOS-4641 > Project: Mesos > Issue Type: Epic >Reporter: Jie Yu >Assignee: Qian Zhang > > CoreOS developed the Container Network Interface (CNI), a proposed standard > for configuring network interfaces for Linux containers. Many CNI plugins > (e.g., calico) have already been developed. > https://coreos.com/blog/rkt-cni-networking.html > https://github.com/appc/cni/blob/master/SPEC.md > Kubernetes supports CNI as well. 
> http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html > In the context of Unified Containerizer, it would be nice if we can have a > 'network/cni' isolator which will speak the CNI protocol and prepare the > network for the container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.
[ https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142231#comment-15142231 ] Cong Wang commented on MESOS-4646: -- This might be the bug I have already fixed: {noformat} commit 6bd00b850635abb0044e06101761533c8beba79c Author: WANG Cong Date: Thu Oct 1 11:37:42 2015 -0700 act_mirred: fix a race condition on mirred_list {noformat} Could you try to set up kdump to capture the kernel stack trace? Or try a newer kernel, 4.3 or above. > PortMappingIsolatorTests get kernel stuck. > -- > > Key: MESOS-4646 > URL: https://issues.apache.org/jira/browse/MESOS-4646 > Project: Mesos > Issue Type: Bug > Environment: Linux Kernel 3.19.9-49-generic, > libnl-3.2.27 >Reporter: Till Toenshoff > > {noformat} > $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*" > Source directory: /home/till/scratchpad/mesos > Build directory: /home/till/scratchpad/mesos/build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, > /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, > /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, > /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd > We'll disable the CgroupsNoHierarchyTest test fixture for now. 
> - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > No 'perf' command found so no 'perf' tests will be run > - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > The 'perf' command wasn't found so tests using it > to sample the 'cycles' hardware event will not be run. > - > /bin/nc > /usr/local/bin/curl > Note: Google Test filter = >
[jira] [Created] (MESOS-4648) Backport zookeeper slow add_auth patch
haosdent created MESOS-4648: --- Summary: Backport zookeeper slow add_auth patch Key: MESOS-4648 URL: https://issues.apache.org/jira/browse/MESOS-4648 Project: Mesos Issue Type: Improvement Reporter: haosdent Assignee: haosdent Priority: Minor Backport [ZOOKEEPER-770 Slow add_auth calls with multi-threaded client|https://issues.apache.org/jira/browse/ZOOKEEPER-770] to fix the slow add_auth calls in the C client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3833) /help endpoints do not work for nested paths
[ https://issues.apache.org/jira/browse/MESOS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991263#comment-14991263 ] Guangya Liu edited comment on MESOS-3833 at 2/11/16 5:38 AM: - RR: https://reviews.apache.org/r/39968/ https://reviews.apache.org/r/43469/ was (Author: gyliu): RR: https://reviews.apache.org/r/39968/ > /help endpoints do not work for nested paths > > > Key: MESOS-3833 > URL: https://issues.apache.org/jira/browse/MESOS-3833 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Reporter: Anand Mazumdar >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere, newbie > > Mesos displays the list of all supported endpoints starting at a given path > prefix using the {{/help}} suffix, e.g. {{master:5050/help}}. > It seems that the {{help}} functionality is broken for URL's having nested > paths e.g. {{master:5050/help/master/machine/down}}. The response returned is: > {quote} > Malformed URL, expecting '/help/id/name/' > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
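The failure described above comes from treating the help URL as exactly `/help/id/name`, which breaks once the endpoint name itself contains slashes. One way to handle nested paths is to take the first segment after `help` as the id and everything that remains as the name. A minimal sketch of that split, using a hypothetical `parse_help_path` function that is not the libprocess implementation:

```python
def parse_help_path(path: str):
    """Split '/help/<id>/<name...>' into (id, name).

    The name keeps its leading '/' and may span several segments, so
    '/help/master/machine/down' yields ('master', '/machine/down').
    Missing parts are returned as None.
    """
    segments = [segment for segment in path.split("/") if segment]
    if not segments or segments[0] != "help":
        raise ValueError(f"not a help path: {path!r}")
    if len(segments) == 1:
        return None, None
    process_id = segments[1]
    if len(segments) == 2:
        return process_id, None
    return process_id, "/" + "/".join(segments[2:])
```

Filtering out empty segments also makes a trailing slash harmless, so `/help/master/` and `/help/master` resolve to the same id.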
[jira] [Commented] (MESOS-3833) /help endpoints do not work for nested paths
[ https://issues.apache.org/jira/browse/MESOS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140592#comment-15140592 ] Benjamin Mahler commented on MESOS-3833: Yes sorry for the delay, [~gyliu] please email me at bmah...@apache.org when you need reviews :) Just gave you a review, let me know when you've updated! > /help endpoints do not work for nested paths > > > Key: MESOS-3833 > URL: https://issues.apache.org/jira/browse/MESOS-3833 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Reporter: Anand Mazumdar >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere, newbie > > Mesos displays the list of all supported endpoints starting at a given path > prefix using the {{/help}} suffix, e.g. {{master:5050/help}}. > It seems that the {{help}} functionality is broken for URL's having nested > paths e.g. {{master:5050/help/master/machine/down}}. The response returned is: > {quote} > Malformed URL, expecting '/help/id/name/' > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4612) Update vendored ZooKeeper
[ https://issues.apache.org/jira/browse/MESOS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140655#comment-15140655 ] haosdent commented on MESOS-4612: - Thank you very much, I saw that 3.4.8-rc0 is already under [voting|http://search-hadoop.com/m/JhBoa1vFuw116H8BC1]. I think we won't have to wait too long for 3.4.8. > Update vendored ZooKeeper > - > > Key: MESOS-4612 > URL: https://issues.apache.org/jira/browse/MESOS-4612 > Project: Mesos > Issue Type: Improvement >Reporter: Cody Maloney >Assignee: haosdent > Labels: mesosphere, tech-debt, zookeeper > > See: http://zookeeper.apache.org/doc/r3.4.7/releasenotes.html for > improvements / bug fixes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4612) Update vendored ZooKeeper to 3.4.8
[ https://issues.apache.org/jira/browse/MESOS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated MESOS-4612: Summary: Update vendored ZooKeeper to 3.4.8 (was: Update vendored ZooKeeper) > Update vendored ZooKeeper to 3.4.8 > -- > > Key: MESOS-4612 > URL: https://issues.apache.org/jira/browse/MESOS-4612 > Project: Mesos > Issue Type: Improvement >Reporter: Cody Maloney >Assignee: haosdent > Labels: mesosphere, tech-debt, zookeeper > > See: http://zookeeper.apache.org/doc/r3.4.7/releasenotes.html for > improvements / bug fixes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142197#comment-15142197 ] Mike Spreitzer commented on MESOS-4641: --- Kubernetes already does support CNI. > Support Container Network Interface (CNI). > -- > > Key: MESOS-4641 > URL: https://issues.apache.org/jira/browse/MESOS-4641 > Project: Mesos > Issue Type: Epic >Reporter: Jie Yu >Assignee: Qian Zhang > > CoreOS developed the Container Network Interface (CNI), a proposed standard > for configuring network interfaces for Linux containers. Many CNI plugins > (e.g., calico) have already been developed. > https://coreos.com/blog/rkt-cni-networking.html > https://github.com/appc/cni/blob/master/SPEC.md > Also, Kubernetes claimed that they'll support CNI as well. > http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html > In the context of Unified Containerizer, it would be nice if we can have a > 'network/cni' isolator which will speak the CNI protocol and prepare the > network for the container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4643) PortMappingIsolatorTest fail when no namespaces are set.
Till Toenshoff created MESOS-4643: - Summary: PortMappingIsolatorTest fail when no namespaces are set. Key: MESOS-4643 URL: https://issues.apache.org/jira/browse/MESOS-4643 Project: Mesos Issue Type: Bug Environment: Linux Kernel 3.19.0-49-generic, libnl-3.2.27 Reporter: Till Toenshoff Priority: Minor Currently our network isolator tests fail with the following output on an Ubuntu 14.04 VM. {noformat} [02:10:15][Step 8/8] [ RUN ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP [02:10:15][Step 8/8] ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such file or directory [02:10:15][Step 8/8] ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such file or directory [02:10:15][Step 8/8] [ FAILED ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP (4 ms) {noformat} The machine has no network namespaces set, hence {{/var/run/netns}} does not exist. We should help users understand this prerequisite, or maybe even set it up in a test fixture. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
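The prerequisite described above could be checked up front, so that a missing `/var/run/netns` produces a readable explanation instead of an opendir failure in the middle of a test. A minimal sketch, assuming the directory path from the log output; `netns_skip_reason` is a hypothetical helper and not part of the Mesos test suite:

```python
import os


def netns_skip_reason(netns_dir: str = "/var/run/netns"):
    """Return None if the netns directory exists, else a reason to skip."""
    if os.path.isdir(netns_dir):
        return None
    return (
        f"Network namespace directory '{netns_dir}' does not exist; "
        "the port mapping tests need it, so they are being skipped."
    )
```

A fixture could call this once in its setup step and either skip the tests with the returned message or create the directory itself, which is the second option the report mentions.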
[jira] [Created] (MESOS-4644) PortMappingIsolatorTest* crashes when ethtool is not installed.
Till Toenshoff created MESOS-4644: - Summary: PortMappingIsolatorTest* crashes when ethtool is not installed. Key: MESOS-4644 URL: https://issues.apache.org/jira/browse/MESOS-4644 Project: Mesos Issue Type: Improvement Reporter: Till Toenshoff Priority: Minor {noformat} [ RUN ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP sh: 1: ethtool: not found F0210 15:45:23.543251 3956 port_mapping_tests.cpp:441] CHECK_SOME(isolator): Check command 'ethtool' failed: Failed to execute 'ethtool -k lo'; the command was either not found or exited with a non-zero exit status: 127 *** Check failure stack trace: *** @ 0x7fb3b0642a1c google::LogMessage::Fail() @ 0x7fb3b0642968 google::LogMessage::SendToLog() @ 0x7fb3b064236a google::LogMessage::Flush() @ 0x7fb3b064527e google::LogMessageFatal::~LogMessageFatal() @ 0x939020 _CheckFatal::~_CheckFatal() @ 0x1524fc4 mesos::internal::tests::PortMappingIsolatorTest_ROOT_NC_ContainerToContainerTCP_Test::TestBody() @ 0x15ad006 testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x15a7f26 testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x158963b testing::Test::Run() @ 0x1589dbe testing::TestInfo::Run() @ 0x158a404 testing::TestCase::Run() @ 0x1590b4c testing::internal::UnitTestImpl::RunAllTests() @ 0x15adc2b testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x15a8ad2 testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x158f8e8 testing::UnitTest::Run() @ 0xd6e65c RUN_ALL_TESTS() @ 0xd6e281 main @ 0x7fb3aae13ec5 (unknown) @ 0x937669 (unknown) {noformat} We might want to consider adding a test for the availability of {{ethtool}} into {{src/tests/containerizer/port_mapping_tests.cpp -- PortMappingIsolatorTest::SetUpTestCase}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
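The availability test suggested above amounts to looking the tool up on the PATH before any test that depends on it runs. A minimal sketch using Python's standard `shutil.which`; the `missing_tools` helper is hypothetical, and its injectable `which` parameter exists only so the check can be exercised without depending on the local machine:

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def describe_missing(tools, which=shutil.which):
    """Return a readable message for missing tools, or None if all exist."""
    missing = missing_tools(tools, which)
    if not missing:
        return None
    return "Required command(s) not found: " + ", ".join(missing)
```

Calling `describe_missing(["ethtool"])` in a fixture's setup step would let the suite disable the affected tests with a clear message rather than abort on a fatal check.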
[jira] [Created] (MESOS-4646) PortMappingIsolatorTests get stuck.
Till Toenshoff created MESOS-4646: - Summary: PortMappingIsolatorTests get stuck. Key: MESOS-4646 URL: https://issues.apache.org/jira/browse/MESOS-4646 Project: Mesos Issue Type: Bug Environment: Linux Kernel 3.19.9-49-generic, libnl-3.2.27 Reporter: Till Toenshoff {noformat} $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*" Source directory: /home/till/scratchpad/mesos Build directory: /home/till/scratchpad/mesos/build - We cannot run any cgroups tests that require mounting hierarchies because you have the following hierarchies mounted: /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd We'll disable the CgroupsNoHierarchyTest test fixture for now. - WARNING: perf not found for kernel 3.19.0-49 You may need to install the following packages for this specific kernel: linux-tools-3.19.0-49-generic linux-cloud-tools-3.19.0-49-generic You may also want to install one of the following packages to keep up to date: linux-tools-generic-lts- linux-cloud-tools-generic-lts- - No 'perf' command found so no 'perf' tests will be run - WARNING: perf not found for kernel 3.19.0-49 You may need to install the following packages for this specific kernel: linux-tools-3.19.0-49-generic linux-cloud-tools-3.19.0-49-generic You may also want to install one of the following packages to keep up to date: linux-tools-generic-lts- linux-cloud-tools-generic-lts- - The 'perf' command wasn't found so tests using it to sample the 'cycles' hardware event will not be run. - /bin/nc /usr/local/bin/curl Note: Google Test filter =
[jira] [Assigned] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qian Zhang reassigned MESOS-4641: - Assignee: Qian Zhang > Support Container Network Interface (CNI). > -- > > Key: MESOS-4641 > URL: https://issues.apache.org/jira/browse/MESOS-4641 > Project: Mesos > Issue Type: Epic >Reporter: Jie Yu >Assignee: Qian Zhang > > CoreOS developed the Container Network Interface (CNI), a proposed standard > for configuring network interfaces for Linux containers. Many CNI plugins > (e.g., calico) have already been developed. > https://coreos.com/blog/rkt-cni-networking.html > https://github.com/appc/cni/blob/master/SPEC.md > Also, Kubernetes claimed that they'll support CNI as well. > http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html > In the context of Unified Containerizer, it would be nice if we can have a > 'network/cni' isolator which will speak the CNI protocol and prepare the > network for the container.
[jira] [Updated] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qian Zhang updated MESOS-4641: -- Shepherd: Jie Yu > Support Container Network Interface (CNI). > -- > > Key: MESOS-4641 > URL: https://issues.apache.org/jira/browse/MESOS-4641 > Project: Mesos > Issue Type: Epic >Reporter: Jie Yu >Assignee: Qian Zhang > > CoreOS developed the Container Network Interface (CNI), a proposed standard > for configuring network interfaces for Linux containers. Many CNI plugins > (e.g., calico) have already been developed. > https://coreos.com/blog/rkt-cni-networking.html > https://github.com/appc/cni/blob/master/SPEC.md > Also, Kubernetes claimed that they'll support CNI as well. > http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html > In the context of Unified Containerizer, it would be nice if we can have a > 'network/cni' isolator which will speak the CNI protocol and prepare the > network for the container.
[jira] [Updated] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.
[ https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4646: -- Summary: PortMappingIsolatorTests get kernel stuck. (was: PortMappingIsolatorTests get stuck.) > PortMappingIsolatorTests get kernel stuck. > -- > > Key: MESOS-4646 > URL: https://issues.apache.org/jira/browse/MESOS-4646 > Project: Mesos > Issue Type: Bug > Environment: Linux Kernel 3.19.9-49-generic, > libnl-3.2.27 >Reporter: Till Toenshoff > > {noformat} > $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*" > Source directory: /home/till/scratchpad/mesos > Build directory: /home/till/scratchpad/mesos/build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, > /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, > /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, > /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd > We'll disable the CgroupsNoHierarchyTest test fixture for now. 
> - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > No 'perf' command found so no 'perf' tests will be run > - > WARNING: perf not found for kernel 3.19.0-49 > You may need to install the following packages for this specific kernel: > linux-tools-3.19.0-49-generic > linux-cloud-tools-3.19.0-49-generic > You may also want to install one of the following packages to keep up to > date: > linux-tools-generic-lts- > linux-cloud-tools-generic-lts- > - > The 'perf' command wasn't found so tests using it > to sample the 'cycles' hardware event will not be run. > - > /bin/nc > /usr/local/bin/curl > Note: Google Test filter = >
[jira] [Commented] (MESOS-4636) Add parent hook to subprocess.
[ https://issues.apache.org/jira/browse/MESOS-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142074#comment-15142074 ] Michael Park commented on MESOS-4636: - {noformat} commit f2a71af11eb3af6d8d329742962f37a907d9967e Author: Joseph Wu Date: Wed Feb 10 16:40:02 2016 -0800 Fix CGROUPS_ROOT_* tests on systemd platforms. Tests do not run with systemd configured, so any dependency on systemd will fail some checks. This fixes the `LinuxLauncher` to use the correct systemd-guard function. Review: https://reviews.apache.org/r/43432/ {noformat} > Add parent hook to subprocess. > -- > > Key: MESOS-4636 > URL: https://issues.apache.org/jira/browse/MESOS-4636 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Joris Van Remoortere >Assignee: Joris Van Remoortere > Labels: mesosphere > Fix For: 0.27.1, 0.28.0 > >
[jira] [Created] (MESOS-4645) Mesos agent shutdown on healtcheck timeout rather than lost and recovered
Cody Maloney created MESOS-4645: --- Summary: Mesos agent shutdown on healtcheck timeout rather than lost and recovered Key: MESOS-4645 URL: https://issues.apache.org/jira/browse/MESOS-4645 Project: Mesos Issue Type: Bug Affects Versions: 0.27.1 Reporter: Cody Maloney I expected slaves to have to be gone for the full re-registration timeout before they would be lost to the cluster, not merely to fail 5 health checks. (Failing the health checks indicates a network partition, not that the agent is gone for good and will never come back.) Is there some flag I'm missing here that I should be setting? From my perspective, I expect frameworks not to get offers for resources on agents which haven't been contacted recently (the framework wouldn't be able to launch anything on the agent). Once the re-registration period times out, the slave would be assumed completely lost and its tasks assumed terminated and able to be re-launched if desired. If an agent recovers between the health check timeout and the re-registration timeout, it should be able to re-join the cluster with its running tasks kept running. Note: Some log lines have their start or tail truncated. 
Critical stuff should all be there Master flags {noformat} Feb 11 00:22:19 ip-10-0-4-187.us-west-2.compute.internal mesos-master[1362]: I0211 00:22:19.690507 1362 master.cpp:369] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --cluster="cody-cm52sd-2" --framework_sorter="drf" --help="false" --hostname_lookup="false" --initialize_driver_logging="true" --ip_discovery_command="/opt/mesosphere/bin/detect_ip" --log_auto_initialize="true" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="1" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --roles="slave_public" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/opt/mesosphere/packages/mesos--4dd59ec6bde2052f6f2a0a0da415b6c92c3c418a/share/mesos/webui" --weights="slave_public=1" --work_dir="/var/lib/mesos/master" --zk="zk://127.0.0.1:2181/mesos" --zk_session_timeout="10secs" {noformat} Slave flags {noformat} Feb 11 00:34:13 ip-10-0-0-52.us-west-2.compute.internal mesos-slave[3914]: I0211 00:34:13.334395 3914 slave.cpp:192] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_auth_server="auth.docker.io" --docker_auth_server_port="443" --docker_kill_orphans="true" --docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local" 
--docker_puller_timeout="60" --docker_registry="registry-1.docker.io" --docker_registry_port="443" --docker_remove_delay="1hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --enforce_container_disk_quota="false" --executor_environment_variables="{"LD_LIBRARY_PATH":"\/opt\/mesosphere\/lib","PATH":"\/usr\/bin:\/bin","SASL_PATH":"\/opt\/mesosphere\/lib\/sasl2","SHELL":"\/usr\/bin\/bash"}" --executor_registration_timeout="5mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="2days" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="false" --image_provisioner_backend="copy" --initialize_driver_logging="true" --ip_discovery_command="/opt/mesosphere/bin/detect_ip" --isolation="cgroups/cpu,cgroups/mem" --launcher_dir="/opt/mesosphere/packages/mesos--4dd59ec6bde2052f6f2a0a0da415b6c92c3c418a/libexec/mesos" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://leader.mesos:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --resources="ports:[1025-2180,2182-3887,3889-5049,5052-8079,8082-8180,8182-32000]" --re Feb 11 00:34:13 ip-10-0-0-52.us-west-2.compute.internal mesos-slave[3914]: vocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --slave_subsystems="cpu,memory" --strict="true" --switch_user="true"
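For reference, the gap described in MESOS-4645 can be worked out from the master flags quoted above ({{--max_slave_ping_timeouts="5"}}, {{--slave_ping_timeout="15secs"}}, {{--slave_reregister_timeout="10mins"}}). The sketch below is plain arithmetic, not Mesos code, and it assumes the master gives up on an agent after {{max_slave_ping_timeouts}} consecutive missed pings, each waiting {{slave_ping_timeout}}. The constant and function names are illustrative only.

```cpp
#include <chrono>

// Values copied from the master flags in the report.
constexpr std::chrono::seconds pingTimeout{15};        // --slave_ping_timeout
constexpr int maxPingTimeouts = 5;                     // --max_slave_ping_timeouts
constexpr std::chrono::minutes reregisterTimeout{10};  // --slave_reregister_timeout

// Time after which the agent has missed all allowed pings
// (assumption: consecutive misses, one ping timeout each).
constexpr std::chrono::seconds healthCheckWindow()
{
  return pingTimeout * maxPingTimeouts;
}

// Difference between the re-registration timeout the reporter expected
// to apply and the health check window that actually triggered shutdown.
constexpr std::chrono::seconds gapToReregisterTimeout()
{
  return std::chrono::seconds(reregisterTimeout) - healthCheckWindow();
}
```

Under that assumption the agent is shut down after 75 seconds of missed pings, 525 seconds short of the 10 minute re-registration timeout.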
[jira] [Commented] (MESOS-4431) Sharing of persistent volumes via reference counting
[ https://issues.apache.org/jira/browse/MESOS-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142117#comment-15142117 ] Klaus Ma commented on MESOS-4431: - [~anindya.sinha], would you split those RRs into smaller tasks and patches? Please refer to other epics for how to split them, e.g. MESOS-1719 :). > Sharing of persistent volumes via reference counting > > > Key: MESOS-4431 > URL: https://issues.apache.org/jira/browse/MESOS-4431 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha > Labels: external-volumes, persistent-volumes > > Add capability for specific resources to be shared amongst tasks within or > across frameworks/roles. Enable this functionality for persistent volumes.
[jira] [Created] (MESOS-4647) Use in_memory as default registry when testing
haosdent created MESOS-4647: --- Summary: Use in_memory as default registry when testing Key: MESOS-4647 URL: https://issues.apache.org/jira/browse/MESOS-4647 Project: Mesos Issue Type: Improvement Reporter: haosdent Assignee: haosdent Currently, we use {{replicated_log}} as the default registry when testing. This causes I/O operations during tests and slows down test cases. We should change it to use {{in_memory}} when testing and only use {{replicated_log}} when necessary.
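The change proposed in MESOS-4647 amounts to flipping a default and letting individual tests opt back in. A minimal sketch follows, assuming a simplified flags struct: {{TestMasterFlags}} and {{createTestMasterFlags}} are hypothetical names for illustration and do not reproduce the actual Mesos test helpers. Only the registry values {{in_memory}} and {{replicated_log}} come from the issue.

```cpp
#include <string>

// Hypothetical stand-in for the master flags used by the test harness.
struct TestMasterFlags
{
  std::string registry;
};

// Defaults to the in-memory registry to avoid disk I/O; tests that
// exercise the replicated log itself can request it explicitly.
TestMasterFlags createTestMasterFlags(bool needsReplicatedLog = false)
{
  TestMasterFlags flags;
  flags.registry = needsReplicatedLog ? "replicated_log" : "in_memory";
  return flags;
}
```

Keeping the opt-in explicit at the call site makes it easy to find the tests that still depend on {{replicated_log}}.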