[jira] [Commented] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184618#comment-15184618 ] Tomasz Janiszewski commented on MESOS-3243: --- I can work on this > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Bug >Reporter: Michael Park > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4897) Update test cases to support PowerPC LE
Chen Zhiwei created MESOS-4897: -- Summary: Update test cases to support PowerPC LE Key: MESOS-4897 URL: https://issues.apache.org/jira/browse/MESOS-4897 Project: Mesos Issue Type: Improvement Reporter: Chen Zhiwei Assignee: Chen Zhiwei Some Docker-related test cases will fail on PowerPC LE, since the Docker image 'alpine' cannot run on the PowerPC LE platform. On PowerPC LE, the test cases can use the Docker image 'ppc64le/busybox' instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4848) Agent Authn Research Spike
[ https://issues.apache.org/jira/browse/MESOS-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184458#comment-15184458 ] Adam B commented on MESOS-4848: --- Looks great! I had a couple of questions that I left as comments in the doc, most importantly about the integration of authenticator modules. > Agent Authn Research Spike > -- > > Key: MESOS-4848 > URL: https://issues.apache.org/jira/browse/MESOS-4848 > Project: Mesos > Issue Type: Task > Components: security, slave >Reporter: Adam B >Assignee: Greg Mann > Labels: mesosphere, security > > Research the master authentication flags to see what changes will be > necessary for agent http authentication. > Write up a 1-2 page summary/design doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4739) libprocess CHECK failure in SlaveRecoveryTest/0.ReconnectHTTPExecutor
[ https://issues.apache.org/jira/browse/MESOS-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184410#comment-15184410 ] Neil Conway commented on MESOS-4739: Stress is http://people.seas.harvard.edu/~apw/stress/ -- i.e., just a workload generator that consumes a lot of CPU. > libprocess CHECK failure in SlaveRecoveryTest/0.ReconnectHTTPExecutor > - > > Key: MESOS-4739 > URL: https://issues.apache.org/jira/browse/MESOS-4739 > Project: Mesos > Issue Type: Bug > Components: HTTP API, libprocess >Reporter: Neil Conway > Labels: flaky-test, libprocess, mesosphere > > Showed up on ASF CI: > https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/1704/consoleFull > {code} > [ RUN ] SlaveRecoveryTest/0.ReconnectHTTPExecutor > I0223 04:54:28.547051 786 leveldb.cpp:174] Opened db in 124.456584ms > I0223 04:54:28.597709 786 leveldb.cpp:181] Compacted db in 50.603402ms > I0223 04:54:28.597779 786 leveldb.cpp:196] Created db iterator in 22429ns > I0223 04:54:28.597797 786 leveldb.cpp:202] Seeked to beginning of db in > 2279ns > I0223 04:54:28.597810 786 leveldb.cpp:271] Iterated through 0 keys in the > db in 265ns > I0223 04:54:28.597859 786 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0223 04:54:28.598731 807 recover.cpp:447] Starting replica recovery > I0223 04:54:28.599493 807 recover.cpp:473] Replica is in EMPTY status > I0223 04:54:28.601400 815 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (9593)@172.17.0.2:44225 > I0223 04:54:28.601776 818 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0223 04:54:28.602247 809 recover.cpp:564] Updating replica status to > STARTING > I0223 04:54:28.603353 811 master.cpp:376] Master > 81a295fc-fe1b-4ff8-9291-cd54f5c6f303 (5847d87ad902) started on > 172.17.0.2:44225 > I0223 
04:54:28.603376 811 master.cpp:378] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/f6d1qA/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" > --work_dir="/tmp/f6d1qA/master" --zk_session_timeout="10secs" > I0223 04:54:28.603906 811 master.cpp:423] Master only allowing > authenticated frameworks to register > I0223 04:54:28.603920 811 master.cpp:428] Master only allowing > authenticated slaves to register > I0223 04:54:28.603930 811 credentials.hpp:35] Loading credentials for > authentication from '/tmp/f6d1qA/credentials' > I0223 04:54:28.604317 811 master.cpp:468] Using default 'crammd5' > authenticator > I0223 04:54:28.604506 811 master.cpp:537] Using default 'basic' HTTP > authenticator > I0223 04:54:28.604635 811 master.cpp:571] Authorization enabled > I0223 04:54:28.604918 808 whitelist_watcher.cpp:77] No whitelist given > I0223 04:54:28.605023 819 hierarchical.cpp:144] Initialized hierarchical > allocator process > I0223 04:54:28.608273 812 master.cpp:1712] The newly elected leader is > master@172.17.0.2:44225 with id 81a295fc-fe1b-4ff8-9291-cd54f5c6f303 > I0223 04:54:28.608314 812 master.cpp:1725] Elected as the leading master! 
> I0223 04:54:28.608333 812 master.cpp:1470] Recovering from registrar > I0223 04:54:28.608610 812 registrar.cpp:307] Recovering registrar > I0223 04:54:28.631079 817 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 28.524027ms > I0223 04:54:28.631156 817 replica.cpp:320] Persisted replica status to > STARTING > I0223 04:54:28.631431 810 recover.cpp:473] Replica is in STARTING status > I0223 04:54:28.632550 819 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (9595)@172.17.0.2:44225 > I0223 04:54:28.632968 816 recover.cpp:193] Received a recover response from > a replica in STARTING status > I0223 04:54:28.633414 807
[jira] [Commented] (MESOS-4739) libprocess CHECK failure in SlaveRecoveryTest/0.ReconnectHTTPExecutor
[ https://issues.apache.org/jira/browse/MESOS-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184366#comment-15184366 ] haosdent commented on MESOS-4739: - Hi [~neilc], what's {{stress --cpu 4}}? gtest doesn't seem to have a parameter like this. > libprocess CHECK failure in SlaveRecoveryTest/0.ReconnectHTTPExecutor > - > > Key: MESOS-4739 > URL: https://issues.apache.org/jira/browse/MESOS-4739 > Project: Mesos > Issue Type: Bug > Components: HTTP API, libprocess >Reporter: Neil Conway > Labels: flaky-test, libprocess, mesosphere > > Showed up on ASF CI: > https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/1704/consoleFull > {code} > [ RUN ] SlaveRecoveryTest/0.ReconnectHTTPExecutor > I0223 04:54:28.547051 786 leveldb.cpp:174] Opened db in 124.456584ms > I0223 04:54:28.597709 786 leveldb.cpp:181] Compacted db in 50.603402ms > I0223 04:54:28.597779 786 leveldb.cpp:196] Created db iterator in 22429ns > I0223 04:54:28.597797 786 leveldb.cpp:202] Seeked to beginning of db in > 2279ns > I0223 04:54:28.597810 786 leveldb.cpp:271] Iterated through 0 keys in the > db in 265ns > I0223 04:54:28.597859 786 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0223 04:54:28.598731 807 recover.cpp:447] Starting replica recovery > I0223 04:54:28.599493 807 recover.cpp:473] Replica is in EMPTY status > I0223 04:54:28.601400 815 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (9593)@172.17.0.2:44225 > I0223 04:54:28.601776 818 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0223 04:54:28.602247 809 recover.cpp:564] Updating replica status to > STARTING > I0223 04:54:28.603353 811 master.cpp:376] Master > 81a295fc-fe1b-4ff8-9291-cd54f5c6f303 (5847d87ad902) started on > 172.17.0.2:44225 > I0223 04:54:28.603376 811 
master.cpp:378] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/f6d1qA/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" > --work_dir="/tmp/f6d1qA/master" --zk_session_timeout="10secs" > I0223 04:54:28.603906 811 master.cpp:423] Master only allowing > authenticated frameworks to register > I0223 04:54:28.603920 811 master.cpp:428] Master only allowing > authenticated slaves to register > I0223 04:54:28.603930 811 credentials.hpp:35] Loading credentials for > authentication from '/tmp/f6d1qA/credentials' > I0223 04:54:28.604317 811 master.cpp:468] Using default 'crammd5' > authenticator > I0223 04:54:28.604506 811 master.cpp:537] Using default 'basic' HTTP > authenticator > I0223 04:54:28.604635 811 master.cpp:571] Authorization enabled > I0223 04:54:28.604918 808 whitelist_watcher.cpp:77] No whitelist given > I0223 04:54:28.605023 819 hierarchical.cpp:144] Initialized hierarchical > allocator process > I0223 04:54:28.608273 812 master.cpp:1712] The newly elected leader is > master@172.17.0.2:44225 with id 81a295fc-fe1b-4ff8-9291-cd54f5c6f303 > I0223 04:54:28.608314 812 master.cpp:1725] Elected as the leading master! 
> I0223 04:54:28.608333 812 master.cpp:1470] Recovering from registrar > I0223 04:54:28.608610 812 registrar.cpp:307] Recovering registrar > I0223 04:54:28.631079 817 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 28.524027ms > I0223 04:54:28.631156 817 replica.cpp:320] Persisted replica status to > STARTING > I0223 04:54:28.631431 810 recover.cpp:473] Replica is in STARTING status > I0223 04:54:28.632550 819 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (9595)@172.17.0.2:44225 > I0223 04:54:28.632968 816 recover.cpp:193] Received a recover response from > a replica in STARTING status > I0223 04:54:28.633414 807 recover.cpp:564] Updating replica status to VOTING
[jira] [Commented] (MESOS-4800) SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky
[ https://issues.apache.org/jira/browse/MESOS-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184329#comment-15184329 ] haosdent commented on MESOS-4800: - We have two approaches to fix this. One is to add a check in {code} void StatusUpdateManagerProcess::resume() { LOG(INFO) << "Resuming sending status updates"; paused = false; {code} to avoid resuming a StatusUpdateManagerProcess that is already running. The other is to allow the test cases to receive the status update multiple times: {code} EXPECT_CALL(sched, statusUpdate(_, _)) .WillOnce(FutureArg<1>()); {code} > SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky > -- > > Key: MESOS-4800 > URL: https://issues.apache.org/jira/browse/MESOS-4800 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: flaky, flaky-test, mesosphere > > Showed up on ASF CI: > https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/1743/changes > {code} > [ RUN ] SlaveRecoveryTest/0.RecoverTerminatedExecutor > I0229 02:11:01.321990 2124 leveldb.cpp:174] Opened db in 121.848194ms > I0229 02:11:01.363880 2124 leveldb.cpp:181] Compacted db in 41.823665ms > I0229 02:11:01.363965 2124 leveldb.cpp:196] Created db iterator in 27127ns > I0229 02:11:01.363984 2124 leveldb.cpp:202] Seeked to beginning of db in > 3446ns > I0229 02:11:01.363996 2124 leveldb.cpp:271] Iterated through 0 keys in the > db in 332ns > I0229 02:11:01.364050 2124 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0229 02:11:01.365196 2158 recover.cpp:447] Starting replica recovery > I0229 02:11:01.365492 2158 recover.cpp:473] Replica is in EMPTY status > I0229 02:11:01.366982 2151 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (9830)@172.17.0.3:36786 > I0229 02:11:01.367451 2149 recover.cpp:193] Received a 
recover response from > a replica in EMPTY status > I0229 02:11:01.368335 2149 recover.cpp:564] Updating replica status to > STARTING > I0229 02:11:01.372730 2158 master.cpp:375] Master > d551df7b-0c69-4bc9-b113-eca605384c49 (3036a6611147) started on > 172.17.0.3:36786 > I0229 02:11:01.372764 2158 master.cpp:377] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/e9RAjp/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" > --work_dir="/tmp/e9RAjp/master" --zk_session_timeout="10secs" > I0229 02:11:01.373164 2158 master.cpp:422] Master only allowing > authenticated frameworks to register > I0229 02:11:01.373178 2158 master.cpp:427] Master only allowing > authenticated slaves to register > I0229 02:11:01.373188 2158 credentials.hpp:35] Loading credentials for > authentication from '/tmp/e9RAjp/credentials' > I0229 02:11:01.373612 2158 master.cpp:467] Using default 'crammd5' > authenticator > I0229 02:11:01.373793 2158 master.cpp:536] Using default 'basic' HTTP > authenticator > I0229 02:11:01.373919 2158 master.cpp:570] Authorization enabled > I0229 02:11:01.376322 2153 whitelist_watcher.cpp:77] No whitelist given > I0229 02:11:01.376456 2158 
hierarchical.cpp:144] Initialized hierarchical > allocator process > I0229 02:11:01.378609 2144 master.cpp:1711] The newly elected leader is > master@172.17.0.3:36786 with id d551df7b-0c69-4bc9-b113-eca605384c49 > I0229 02:11:01.378674 2144 master.cpp:1724] Elected as the leading master! > I0229 02:11:01.378700 2144 master.cpp:1469] Recovering from registrar > I0229 02:11:01.378880 2154 registrar.cpp:307] Recovering registrar > I0229 02:11:01.413949 2149 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 45.305096ms > I0229 02:11:01.414049 2149 replica.cpp:320] Persisted replica status to > STARTING > I0229 02:11:01.414481 2154 recover.cpp:473] Replica is in STARTING status > I0229 02:11:01.416136 2154 replica.cpp:673] Replica in
[jira] [Commented] (MESOS-4800) SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky
[ https://issues.apache.org/jira/browse/MESOS-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184318#comment-15184318 ] haosdent commented on MESOS-4800: - The problem in the failing case is that it receives {{TASK_LOST}} before {{StatusUpdateManagerProcess::pause()}}. Then, after resume, it resends {{TASK_LOST}} twice. {code} // First time I0229 02:11:02.602648 2154 status_update_manager.cpp:181] Resuming sending status updates W0229 02:11:02.602721 2154 status_update_manager.cpp:188] Resending status update TASK_LOST (UUID: 9514b5e3-4a43-4593-b93f-d886e3791c84) for task 6f4f1f8c-2649-4c70-9767-2ea122a79101 of framework d551df7b-0c69-4bc9-b113-eca605384c49- I0229 02:11:02.602764 2154 status_update_manager.cpp:374] Forwarding update TASK_LOST (UUID: 9514b5e3-4a43-4593-b93f-d886e3791c84) for task 6f4f1f8c-2649-4c70-9767-2ea122a79101 of framework d551df7b-0c69-4bc9-b113-eca605384c49- to the slave // Second time. I0229 02:11:02.602999 2154 status_update_manager.cpp:181] Resuming sending status updates W0229 02:11:02.603032 2154 status_update_manager.cpp:188] Resending status update TASK_LOST (UUID: 9514b5e3-4a43-4593-b93f-d886e3791c84) for task 6f4f1f8c-2649-4c70-9767-2ea122a79101 of framework d551df7b-0c69-4bc9-b113-eca605384c49- I0229 02:11:02.603058 2154 status_update_manager.cpp:374] Forwarding update TASK_LOST (UUID: 9514b5e3-4a43-4593-b93f-d886e3791c84) for task 6f4f1f8c-2649-4c70-9767-2ea122a79101 of framework d551df7b-0c69-4bc9-b113-eca605384c49- to the slave {code} > SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky > -- > > Key: MESOS-4800 > URL: https://issues.apache.org/jira/browse/MESOS-4800 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: flaky, flaky-test, mesosphere > > Showed up on ASF CI: > 
https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/1743/changes > {code} > [ RUN ] SlaveRecoveryTest/0.RecoverTerminatedExecutor > I0229 02:11:01.321990 2124 leveldb.cpp:174] Opened db in 121.848194ms > I0229 02:11:01.363880 2124 leveldb.cpp:181] Compacted db in 41.823665ms > I0229 02:11:01.363965 2124 leveldb.cpp:196] Created db iterator in 27127ns > I0229 02:11:01.363984 2124 leveldb.cpp:202] Seeked to beginning of db in > 3446ns > I0229 02:11:01.363996 2124 leveldb.cpp:271] Iterated through 0 keys in the > db in 332ns > I0229 02:11:01.364050 2124 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0229 02:11:01.365196 2158 recover.cpp:447] Starting replica recovery > I0229 02:11:01.365492 2158 recover.cpp:473] Replica is in EMPTY status > I0229 02:11:01.366982 2151 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (9830)@172.17.0.3:36786 > I0229 02:11:01.367451 2149 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0229 02:11:01.368335 2149 recover.cpp:564] Updating replica status to > STARTING > I0229 02:11:01.372730 2158 master.cpp:375] Master > d551df7b-0c69-4bc9-b113-eca605384c49 (3036a6611147) started on > 172.17.0.3:36786 > I0229 02:11:01.372764 2158 master.cpp:377] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/e9RAjp/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" > --work_dir="/tmp/e9RAjp/master" --zk_session_timeout="10secs" > I0229 02:11:01.373164 2158 master.cpp:422] Master only allowing > authenticated frameworks to register > I0229 02:11:01.373178 2158 master.cpp:427] Master only allowing > authenticated slaves to register > I0229 02:11:01.373188 2158 credentials.hpp:35] Loading credentials for > authentication from '/tmp/e9RAjp/credentials' > I0229 02:11:01.373612 2158 master.cpp:467] Using default 'crammd5' > authenticator > I0229 02:11:01.373793 2158 master.cpp:536] Using default
[jira] [Created] (MESOS-4896) Update isolators dynamically
Guangya Liu created MESOS-4896: -- Summary: Update isolators dynamically Key: MESOS-4896 URL: https://issues.apache.org/jira/browse/MESOS-4896 Project: Mesos Issue Type: Bug Reporter: Guangya Liu Assignee: Guangya Liu Currently, when using DOCKER as image provider but not enabling docker/runtime, the agent will exit with a message: {code} EXIT(1) << "Docker runtime isolator has to be specified if 'DOCKER' is included " << "in 'image_providers'. Please add 'docker/runtime' to '--isolation' " << "flags"; {code} This creates extra work for operators, who must adjust the flags manually; it would be better for the agent to add this isolator dynamically based on the image provider. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4868) PersistentVolumeTests do not need to set up ACLs.
[ https://issues.apache.org/jira/browse/MESOS-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4868: -- Shepherd: Adam B > PersistentVolumeTests do not need to set up ACLs. > - > > Key: MESOS-4868 > URL: https://issues.apache.org/jira/browse/MESOS-4868 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Joseph Wu >Assignee: Yong Tang > Labels: mesosphere, newbie, test > > The {{PersistentVolumeTest}} s have a custom helper for setting up ACLs in > the {{master::Flags}}: > {code} > ACLs acls; > hashset<string> roles; > foreach (const FrameworkInfo& framework, frameworks) { > mesos::ACL::RegisterFramework* acl = acls.add_register_frameworks(); > acl->mutable_principals()->add_values(framework.principal()); > acl->mutable_roles()->add_values(framework.role()); > roles.insert(framework.role()); > } > flags.acls = acls; > flags.roles = strings::join(",", roles); > {code} > This is no longer necessary with implicit roles. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4709) Enable compiler optimization by default
[ https://issues.apache.org/jira/browse/MESOS-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184256#comment-15184256 ] Benjamin Mahler commented on MESOS-4709: Linked in MESOS-1985 for some context on why this was changed originally. > Enable compiler optimization by default > --- > > Key: MESOS-4709 > URL: https://issues.apache.org/jira/browse/MESOS-4709 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Neil Conway >Assignee: Neil Conway > Labels: autoconf, configure, mesosphere > > At present, Mesos defaults to compiling with "-O0"; to enable compiler > optimizations, the user needs to specify "--enable-optimize" when running > {{configure}}. > We should change the default for the following reasons: > (1) The autoconf default for CFLAGS/CXXFLAGS is "-O2 -g". Anecdotally, > I think most software packages compile with a reasonable level of > optimizations enabled by default. > (2) I think we should make the default configure flags appropriate for > end-users (rather than Mesos developers): developers will be familiar > enough with Mesos to tune the configure flags according to their own > preferences. > (3) The performance consequences of not enabling compiler > optimizations can be pretty severe: 5x in a benchmark I just ran, and > we've seen between 2x and 30x (!) performance differences for some > real-world workloads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4895) Add more test cases to CommandExecutorTest
Guangya Liu created MESOS-4895: -- Summary: Add more test cases to CommandExecutorTest Key: MESOS-4895 URL: https://issues.apache.org/jira/browse/MESOS-4895 Project: Mesos Issue Type: Bug Reporter: Guangya Liu A new file, https://github.com/apache/mesos/blob/master/src/tests/command_executor_tests.cpp, was introduced for command executor test cases, but it only covers some cases of the task-killing capability; it would be better to add more test cases to this file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4890) FetcherCacheTest.LocalUncachedExtract and FetcherCacheHttpTest.HttpMixed fail as root on OSX
[ https://issues.apache.org/jira/browse/MESOS-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184237#comment-15184237 ] haosdent commented on MESOS-4890: - Hi [~greggomann], are the {{/tmp}} directory permissions correct on your machine? I tried on OSX and the tests pass. > FetcherCacheTest.LocalUncachedExtract and FetcherCacheHttpTest.HttpMixed fail > as root on OSX > > > Key: MESOS-4890 > URL: https://issues.apache.org/jira/browse/MESOS-4890 > Project: Mesos > Issue Type: Bug > Components: tests >Affects Versions: 0.27.1 > Environment: OSX 10.10.5 >Reporter: Greg Mann > Labels: mesosphere, tests > > These two tests are failing as root on OSX due to the same error: > {code} > [ RUN ] FetcherCacheTest.LocalUncachedExtract > I0307 13:18:53.177228 1928930048 leveldb.cpp:174] Opened db in 1694us > I0307 13:18:53.177587 1928930048 leveldb.cpp:181] Compacted db in 332us > I0307 13:18:53.177618 1928930048 leveldb.cpp:196] Created db iterator in 15us > I0307 13:18:53.177633 1928930048 leveldb.cpp:202] Seeked to beginning of db > in 8us > I0307 13:18:53.177644 1928930048 leveldb.cpp:271] Iterated through 0 keys in > the db in 6us > I0307 13:18:53.177690 1928930048 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0307 13:18:53.178393 218832896 recover.cpp:447] Starting replica recovery > I0307 13:18:53.178628 218832896 recover.cpp:473] Replica is in EMPTY status > I0307 13:18:53.179527 216686592 replica.cpp:673] Replica in EMPTY status > received a broadcasted recover request from (4)@127.0.0.1:49563 > I0307 13:18:53.179769 218832896 recover.cpp:193] Received a recover response > from a replica in EMPTY status > I0307 13:18:53.179975 219906048 recover.cpp:564] Updating replica status to > STARTING > I0307 13:18:53.180225 220442624 leveldb.cpp:304] Persisting metadata (8 > bytes) to leveldb took 192us > I0307 13:18:53.180249 220442624 replica.cpp:320] Persisted replica status to > STARTING > I0307 
13:18:53.180340 217223168 recover.cpp:473] Replica is in STARTING status > I0307 13:18:53.180753 216686592 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (5)@127.0.0.1:49563 > I0307 13:18:53.180891 218832896 recover.cpp:193] Received a recover response > from a replica in STARTING status > I0307 13:18:53.181082 216686592 recover.cpp:564] Updating replica status to > VOTING > I0307 13:18:53.181246 218296320 leveldb.cpp:304] Persisting metadata (8 > bytes) to leveldb took 100us > I0307 13:18:53.181268 218296320 replica.cpp:320] Persisted replica status to > VOTING > I0307 13:18:53.181325 217223168 recover.cpp:578] Successfully joined the > Paxos group > I0307 13:18:53.181427 217223168 recover.cpp:462] Recover process terminated > I0307 13:18:53.185133 218296320 master.cpp:375] Master > af5d4df4-703d-46f9-b5f7-95826f86abcd (localhost) started on 127.0.0.1:49563 > I0307 13:18:53.185169 218296320 master.cpp:377] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/private/tmp/BdBCVb/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" > --work_dir="/private/tmp/BdBCVb/master" --zk_session_timeout="10secs" > W0307 
13:18:53.185725 218296320 master.cpp:380] > ** > Master bound to loopback interface! Cannot communicate with remote schedulers > or slaves. You might want to set '--ip' flag to a routable IP address. > ** > I0307 13:18:53.185766 218296320 master.cpp:422] Master only allowing > authenticated frameworks to register > I0307 13:18:53.185778 218296320 master.cpp:427] Master only allowing > authenticated slaves to register > I0307 13:18:53.185784 218296320 credentials.hpp:35] Loading credentials for > authentication from '/private/tmp/BdBCVb/credentials' > I0307 13:18:53.186089 218296320 master.cpp:467] Using default 'crammd5' > authenticator >
[jira] [Updated] (MESOS-4795) mesos agent not recovering after ZK init failure
[ https://issues.apache.org/jira/browse/MESOS-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated MESOS-4795: Description: Here's the sequence of events that happened: -Agent running fine with 0.24.1 -Transient ZK issues, slave flapping with zookeeper_init failure -ZK issue resolved -Most agents stop flapping and function correctly -Some agents continue flapping, but silently exit after printing the detector.cpp:481 log line. -The agents that continue to flap were repaired with manual removal of contents in mesos-slave's working dir Here's the contents of the various log files on the agent: The .INFO logfile for one of the restarts before the mesos-slave process exited with no other error messages: {code} Log file created at: 2016/02/09 02:12:48 Running on machine: titusagent-main-i-7697a9c5 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg I0209 02:12:48.502403 97255 logging.cpp:172] INFO level logging started! I0209 02:12:48.502938 97255 main.cpp:185] Build: 2015-09-30 16:12:07 by builds I0209 02:12:48.502974 97255 main.cpp:187] Version: 0.24.1 I0209 02:12:48.503288 97255 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix I0209 02:12:48.507961 97255 main.cpp:272] Starting Mesos slave I0209 02:12:48.509827 97296 slave.cpp:190] Slave started on 1)@10.138.146.230:7101 I0209 02:12:48.510074 97296 slave.cpp:191] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --attributes="region:us-east-1;" --authenticatee="" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" " I0209 02:12:48.511706 97296 slave.cpp:354] Slave resources: ports(*):[7150-7200]; mem(*):240135; cpus(*):32; disk(*):586104 I0209 02:12:48.512320 97296 slave.cpp:384] Slave hostname: I0209 02:12:48.512368 97296 slave.cpp:389] Slave checkpoint: true I0209 
02:12:48.516139 97299 group.cpp:331] Group process (group(1)@10.138.146.230:7101) connected to ZooKeeper I0209 02:12:48.516216 97299 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0209 02:12:48.516253 97299 group.cpp:403] Trying to create path '/titus/main/mesos' in ZooKeeper I0209 02:12:48.520268 97275 detector.cpp:156] Detected a new leader: (id='209') I0209 02:12:48.520803 97284 group.cpp:674] Trying to get '/titus/main/mesos/json.info_000209' in ZooKeeper I0209 02:12:48.520874 97278 state.cpp:54] Recovering state from '/mnt/data/mesos/meta' I0209 02:12:48.520961 97278 state.cpp:690] Failed to find resources file '/mnt/data/mesos/meta/resources/resources.info' I0209 02:12:48.523680 97283 detector.cpp:481] A new leading master (UPID=master@10.230.95.110:7103) is detected {code} The .FATAL log file when the original transient ZK error occurred: {code} Log file created at: 2016/02/05 17:21:37 Running on machine: titusagent-main-i-7697a9c5 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg F0205 17:21:37.395644 53841 zookeeper.cpp:110] Failed to create ZooKeeper, zookeeper_init: No such file or directory [2] {code} The .ERROR log file: {code} Log file created at: 2016/02/05 17:21:37 Running on machine: titusagent-main-i-7697a9c5 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg F0205 17:21:37.395644 53841 zookeeper.cpp:110] Failed to create ZooKeeper, zookeeper_init: No such file or directory [2] {code} The .WARNING file had the same content. was: Here's the sequence of events that happened: -Agent running fine with 0.24.1 -Transient ZK issues, slave flapping with zookeeper_init failure -ZK issue resolved -Most agents stop flapping and function correctly -Some agents continue flapping, but silent exit after printing the detector.cpp:481 log line. 
-The agents that continue to flap repaired with manual removal of contents in mesos-slave's working dir Here's the contents of the various log files on the agent: The .INFO logfile for one of the restarts before mesos-slave process exited with no other error messages: Log file created at: 2016/02/09 02:12:48 Running on machine: titusagent-main-i-7697a9c5 Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg I0209 02:12:48.502403 97255 logging.cpp:172] INFO level logging started! I0209 02:12:48.502938 97255 main.cpp:185] Build: 2015-09-30 16:12:07 by builds I0209 02:12:48.502974 97255 main.cpp:187] Version: 0.24.1 I0209 02:12:48.503288 97255 containerizer.cpp:143] Using isolation: posix/cpu,posix/mem,filesystem/posix I0209 02:12:48.507961 97255 main.cpp:272] Starting Mesos slave I0209 02:12:48.509827 97296 slave.cpp:190] Slave started on 1)@10.138.146.230:7101 I0209 02:12:48.510074 97296 slave.cpp:191] Flags at startup: --appc_store_dir="/tmp/mesos/store/appc" --attributes="region:us-east-1;" --authenticatee=""
[jira] [Commented] (MESOS-4189) Dynamic weights
[ https://issues.apache.org/jira/browse/MESOS-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184205#comment-15184205 ] Yongqiao Wang commented on MESOS-4189: -- OK, thanks Adam. I will follow up on those tasks ASAP. Could you help review RRs #41681, #41790 and #43863? In order to reduce conflicts, let us commit them first; then it will be easy to do the following tasks based on that code base. > Dynamic weights > --- > > Key: MESOS-4189 > URL: https://issues.apache.org/jira/browse/MESOS-4189 > Project: Mesos > Issue Type: Epic >Reporter: Yongqiao Wang >Assignee: Yongqiao Wang > > Mesos currently uses a static list of weights that are configured at > master startup (via the --weights flag); this places some limitations on > changing the resource allocation priority for a role/framework (changing the > set of weights requires restarting all the masters). > This JIRA will add a new endpoint /weight to update/show the weight of a role > for authorized principals, and the non-default weights will be persisted > in the registry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4705: --- Shepherd: Benjamin Mahler Sorry for the delay, thanks for looking into this! I left some comments on the review. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling a container with perf event on CentOS 7 with kernel > 3.10.0-123.el7.x86_64, the slave complained with the error spew below: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > It's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > for kernel versions below 3.12. > On the 3.10.0-123.el7.x86_64 kernel, the format has 6 tokens, as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed; please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4840) Remove internal usage of deprecated ShutdownFramework ACL
[ https://issues.apache.org/jira/browse/MESOS-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4840: -- Summary: Remove internal usage of deprecated ShutdownFramework ACL (was: Remove ShutdownFramework from the ACLs messages and references) > Remove internal usage of deprecated ShutdownFramework ACL > - > > Key: MESOS-4840 > URL: https://issues.apache.org/jira/browse/MESOS-4840 > Project: Mesos > Issue Type: Task > Components: master, security, technical debt >Affects Versions: 0.28.0 >Reporter: Alexander Rojas >Assignee: Alexander Rojas >Priority: Minor > Labels: deprecation, mesosphere > > {{ShutdownFramework}} acl was deprecated a couple of versions ago in favor of > the {{TeardownFramework}} message. Its deprecation cycle came with 0.27. That > means we should remove the message and its references in the code base. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4840) Remove ShutdownFramework from the ACLs messages and references
[ https://issues.apache.org/jira/browse/MESOS-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184144#comment-15184144 ] Adam B commented on MESOS-4840: --- Done. I'm running this through CI, then I'm ready to commit it. > Remove ShutdownFramework from the ACLs messages and references > -- > > Key: MESOS-4840 > URL: https://issues.apache.org/jira/browse/MESOS-4840 > Project: Mesos > Issue Type: Task > Components: master, security, technical debt >Affects Versions: 0.28.0 >Reporter: Alexander Rojas >Assignee: Alexander Rojas >Priority: Minor > Labels: deprecation, mesosphere > > {{ShutdownFramework}} acl was deprecated a couple of versions ago in favor of > the {{TeardownFramework}} message. Its deprecation cycle came with 0.27. That > means we should remove the message and its references in the code base. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4840) Remove ShutdownFramework from the ACLs messages and references
[ https://issues.apache.org/jira/browse/MESOS-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184133#comment-15184133 ] Vinod Kone commented on MESOS-4840: --- Can you add it to sprint/add shepherd/add story points? > Remove ShutdownFramework from the ACLs messages and references > -- > > Key: MESOS-4840 > URL: https://issues.apache.org/jira/browse/MESOS-4840 > Project: Mesos > Issue Type: Task > Components: master, security, technical debt >Affects Versions: 0.28.0 >Reporter: Alexander Rojas >Assignee: Alexander Rojas >Priority: Minor > Labels: deprecation, mesosphere > > {{ShutdownFramework}} acl was deprecated a couple of versions ago in favor of > the {{TeardownFramework}} message. Its deprecation cycle came with 0.27. That > means we should remove the message and its references in the code base. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4893) Allow setting permissions and access control on persistent volumes
Anindya Sinha created MESOS-4893: Summary: Allow setting permissions and access control on persistent volumes Key: MESOS-4893 URL: https://issues.apache.org/jira/browse/MESOS-4893 Project: Mesos Issue Type: Improvement Components: general Reporter: Anindya Sinha Assignee: Anindya Sinha Currently, persistent volumes are exclusive, i.e. if a persistent volume is used by one task or executor, it cannot be concurrently used by another task or executor. With the introduction of shared volumes, persistent volumes can be used simultaneously by multiple tasks or executors. As a result, we need to support setting the ownership of persistent volumes at volume creation, which tasks then need to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4892) Support arithmetic operations for shared resources with consumer counts
Anindya Sinha created MESOS-4892: Summary: Support arithmetic operations for shared resources with consumer counts Key: MESOS-4892 URL: https://issues.apache.org/jira/browse/MESOS-4892 Project: Mesos Issue Type: Improvement Components: general Reporter: Anindya Sinha Assignee: Anindya Sinha With the introduction of shared resources, we need to add support for arithmetic operations on Resources that handle shared resources correctly. Shared resources need to be handled differently so as to account for incrementing/decrementing consumer counts maintained by each Resources object. Case 1: Resources total += shared_resource; If shared_resource exists in total, this would imply that the consumer count is incremented. If shared_resource does not exist in total, this would imply we start tracking consumers for this shared resource, initialized to 0 consumers. Case 2: Resources total -= shared_resource; If shared_resource exists in total, this would imply that the consumer count is decremented. However, shared_resource is removed from total if its consumer count is originally 0 in total. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.
[ https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184017#comment-15184017 ] Sargun Dhillon commented on MESOS-4891: --- Can we also have a place to list all executor PIDs that are associated with those containers? > Add a '/containers' endpoint to the agent to list all the active containers. > > > Key: MESOS-4891 > URL: https://issues.apache.org/jira/browse/MESOS-4891 > Project: Mesos > Issue Type: Improvement >Reporter: Jie Yu > > This endpoint will be similar to /monitor/statistics.json endpoint, but it'll > also contain the 'container_status' about the container (see ContainerStatus > in mesos.proto). We'll eventually deprecate the /monitor/statistics.json > endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4740) Improve master metrics/snapshot performance
[ https://issues.apache.org/jira/browse/MESOS-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cong Wang updated MESOS-4740: - Description: [~drobinson] noticed retrieving metrics/snapshot statistics could be very inefficient. {noformat} [user@server ~]$ time curl -s localhost:5050/metrics/snapshot real 0m35.654s user 0m0.019s sys 0m0.011s {noformat} MESOS-1287 introduces a timeout parameter for this query, but metric collectors like ours are not aware of such a URL-specific parameter, so we need to: 1) always have a timeout and set some default value for it; 2) investigate why master metrics/snapshot can take such a long time to complete under load. was: [~drobinson] noticed retrieving metrics/snapshot statistics could be very inefficient. {noformat} [user@server ~]$ time curl -s localhost:5050/metrics/snapshot real 0m35.654s user 0m0.019s sys 0m0.011s {noformat} MESOS-1287 introduces a timeout parameter for this query, but metric collectors like ours are not aware of such a URL-specific parameter, so we need to: 1) always have a timeout and set some default value for it; 2) investigate why metrics/snapshot can take such a long time to complete under load, since we don't use history for these statistics and the values are just some atomic read. > Improve master metrics/snapshot performance > -- > > Key: MESOS-4740 > URL: https://issues.apache.org/jira/browse/MESOS-4740 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang > > [~drobinson] noticed retrieving metrics/snapshot statistics could be very > inefficient. 
> {noformat} > [user@server ~]$ time curl -s localhost:5050/metrics/snapshot > real 0m35.654s > user 0m0.019s > sys 0m0.011s > {noformat} > MESOS-1287 introduces a timeout parameter for this query, but metric > collectors like ours are not aware of such a URL-specific parameter, so we > need to: > 1) always have a timeout and set some default value for it > 2) investigate why master metrics/snapshot can take such a long time to > complete under load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4740) Improve master metrics/snapshot performance
[ https://issues.apache.org/jira/browse/MESOS-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cong Wang updated MESOS-4740: - Summary: Improve master metrics/snapshot performance (was: Improve metrics/snapshot performance) > Improve master metrics/snapshot performance > -- > > Key: MESOS-4740 > URL: https://issues.apache.org/jira/browse/MESOS-4740 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang > > [~drobinson] noticed retrieving metrics/snapshot statistics could be very > inefficient. > {noformat} > [user@server ~]$ time curl -s localhost:5050/metrics/snapshot > real 0m35.654s > user 0m0.019s > sys 0m0.011s > {noformat} > MESOS-1287 introduces a timeout parameter for this query, but metric > collectors like ours are not aware of such a URL-specific parameter, so we > need to: > 1) always have a timeout and set some default value for it > 2) investigate why metrics/snapshot can take such a long time to complete > under load, since we don't use history for these statistics and the values > are just some atomic read. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners
[ https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Downes updated MESOS-2840: -- Description: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing -[[original document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] requires permission]- was: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing [[original document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] requires permission] > MesosContainerizer support multiple image provisioners > -- > > Key: MESOS-2840 > URL: https://issues.apache.org/jira/browse/MESOS-2840 > Project: Mesos > Issue Type: Epic > Components: containerization, docker >Affects Versions: 0.23.0 >Reporter: Marco Massenzio >Assignee: Timothy Chen > Labels: mesosphere, twitter > > We want to utilize the Appc integration interfaces to further make > MesosContainerizers to support multiple image formats. > This allows our future work on isolators to support any container image > format. > Design > [open to public comments] < please use this document! 
> https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing > -[[original > document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] > requires permission]- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners
[ https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Downes updated MESOS-2840: -- Description: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing -[[original document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] requires permission]- was: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing -[[original document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] requires permission]- > MesosContainerizer support multiple image provisioners > -- > > Key: MESOS-2840 > URL: https://issues.apache.org/jira/browse/MESOS-2840 > Project: Mesos > Issue Type: Epic > Components: containerization, docker >Affects Versions: 0.23.0 >Reporter: Marco Massenzio >Assignee: Timothy Chen > Labels: mesosphere, twitter > > We want to utilize the Appc integration interfaces to further make > MesosContainerizers to support multiple image formats. > This allows our future work on isolators to support any container image > format. > Design > [open to public comments] please use this document! 
> https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing > -[[original > document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] > requires permission]- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.
Jie Yu created MESOS-4891: - Summary: Add a '/containers' endpoint to the agent to list all the active containers. Key: MESOS-4891 URL: https://issues.apache.org/jira/browse/MESOS-4891 Project: Mesos Issue Type: Improvement Reporter: Jie Yu This endpoint will be similar to /monitor/statistics.json endpoint, but it'll also contain the 'container_status' about the container (see ContainerStatus in mesos.proto). We'll eventually deprecate the /monitor/statistics.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners
[ https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Downes updated MESOS-2840: -- Description: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing [original document, requires permission] -https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing- was: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing [original document, requires permission] https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing > MesosContainerizer support multiple image provisioners > -- > > Key: MESOS-2840 > URL: https://issues.apache.org/jira/browse/MESOS-2840 > Project: Mesos > Issue Type: Epic > Components: containerization, docker >Affects Versions: 0.23.0 >Reporter: Marco Massenzio >Assignee: Timothy Chen > Labels: mesosphere, twitter > > We want to utilize the Appc integration interfaces to further make > MesosContainerizers to support multiple image formats. > This allows our future work on isolators to support any container image > format. > Design > [open to public comments] < please use this document! 
> https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing > [original document, requires permission] > -https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners
[ https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Downes updated MESOS-2840: -- Description: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing [[original document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] requires permission] was: We want to utilize the Appc integration interfaces to further make MesosContainerizers to support multiple image formats. This allows our future work on isolators to support any container image format. Design [open to public comments] < please use this document! https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing [original document, requires permission] -https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing- > MesosContainerizer support multiple image provisioners > -- > > Key: MESOS-2840 > URL: https://issues.apache.org/jira/browse/MESOS-2840 > Project: Mesos > Issue Type: Epic > Components: containerization, docker >Affects Versions: 0.23.0 >Reporter: Marco Massenzio >Assignee: Timothy Chen > Labels: mesosphere, twitter > > We want to utilize the Appc integration interfaces to further make > MesosContainerizers to support multiple image formats. > This allows our future work on isolators to support any container image > format. > Design > [open to public comments] < please use this document! 
> https://docs.google.com/document/d/1oUpJNjJ0l51fxaYut21mKPwJUiAcAdgbdF7SAdAW2PA/edit?usp=sharing > [[original > document|https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing] > requires permission] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4883) Add agent ID to agent state endpoint
[ https://issues.apache.org/jira/browse/MESOS-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183895#comment-15183895 ] Sargun Dhillon commented on MESOS-4883: --- We have a tool here that looks at the agent state.json and assembles a complete cluster view based on the sum of the agent JSONs. This system is soft-state. If the last state we have in memory has a bunch of tasks associated with this slave, and this slave comes back and runs for a while (10m) without us seeing any tasks, we do not know to remove those old tasks from the system. > Add agent ID to agent state endpoint > > > Key: MESOS-4883 > URL: https://issues.apache.org/jira/browse/MESOS-4883 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > I would like to have the slave ID exposed on the slave before any tasks are > running on the slave on the state.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4492) Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation
[ https://issues.apache.org/jira/browse/MESOS-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-4492: - Shepherd: Jie Yu > Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation > -- > > Key: MESOS-4492 > URL: https://issues.apache.org/jira/browse/MESOS-4492 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Fan Du >Assignee: Fan Du >Priority: Minor > > This ticket aims to enable user or operator to inspect operation statistics > such as RESERVE, UNRESERVE, CREATE and DESTROY, current implementation only > supports LAUNCH. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4849) Add agent flags for HTTP authentication
[ https://issues.apache.org/jira/browse/MESOS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-4849: - Shepherd: Adam B > Add agent flags for HTTP authentication > --- > > Key: MESOS-4849 > URL: https://issues.apache.org/jira/browse/MESOS-4849 > Project: Mesos > Issue Type: Task > Components: security, slave >Reporter: Adam B >Assignee: Greg Mann > Labels: mesosphere, security > > Flags should be added to the agent to: > 1. Enable HTTP authentication ({{--authenticate_http}}) > 2. Specify credentials ({{--http_credentials}}) > 3. Specify HTTP authenticators ({{--authenticators}}) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4848) Agent Authn Research Spike
[ https://issues.apache.org/jira/browse/MESOS-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-4848: - Shepherd: Adam B > Agent Authn Research Spike > -- > > Key: MESOS-4848 > URL: https://issues.apache.org/jira/browse/MESOS-4848 > Project: Mesos > Issue Type: Task > Components: security, slave >Reporter: Adam B >Assignee: Greg Mann > Labels: mesosphere, security > > Research the master authentication flags to see what changes will be > necessary for agent http authentication. > Write up a 1-2 page summary/design doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3302) Scheduler API v1 improvements
[ https://issues.apache.org/jira/browse/MESOS-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3302: -- Assignee: (was: Marco Massenzio) > Scheduler API v1 improvements > - > > Key: MESOS-3302 > URL: https://issues.apache.org/jira/browse/MESOS-3302 > Project: Mesos > Issue Type: Epic >Reporter: Marco Massenzio > Labels: mesosphere, twitter > > This Epic covers all the refinements that we may want to build on top of the > {{HTTP API}} MVP epic (MESOS-2288) which was released initially with Mesos > {{0.24.0}}. > The tasks/stories here cover the necessary work to bring the API v1 to what > we would regard as "Production-ready" state in preparation for the {{1.0.0}} > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4890) FetcherCacheTest.LocalUncachedExtract and FetcherCacheHttpTest.HttpMixed fail as root on OSX
Greg Mann created MESOS-4890: Summary: FetcherCacheTest.LocalUncachedExtract and FetcherCacheHttpTest.HttpMixed fail as root on OSX Key: MESOS-4890 URL: https://issues.apache.org/jira/browse/MESOS-4890 Project: Mesos Issue Type: Bug Components: tests Affects Versions: 0.27.1 Environment: OSX 10.10.5 Reporter: Greg Mann These two tests are failing as root on OSX due to the same error: {code} [ RUN ] FetcherCacheTest.LocalUncachedExtract I0307 13:18:53.177228 1928930048 leveldb.cpp:174] Opened db in 1694us I0307 13:18:53.177587 1928930048 leveldb.cpp:181] Compacted db in 332us I0307 13:18:53.177618 1928930048 leveldb.cpp:196] Created db iterator in 15us I0307 13:18:53.177633 1928930048 leveldb.cpp:202] Seeked to beginning of db in 8us I0307 13:18:53.177644 1928930048 leveldb.cpp:271] Iterated through 0 keys in the db in 6us I0307 13:18:53.177690 1928930048 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I0307 13:18:53.178393 218832896 recover.cpp:447] Starting replica recovery I0307 13:18:53.178628 218832896 recover.cpp:473] Replica is in EMPTY status I0307 13:18:53.179527 216686592 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (4)@127.0.0.1:49563 I0307 13:18:53.179769 218832896 recover.cpp:193] Received a recover response from a replica in EMPTY status I0307 13:18:53.179975 219906048 recover.cpp:564] Updating replica status to STARTING I0307 13:18:53.180225 220442624 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 192us I0307 13:18:53.180249 220442624 replica.cpp:320] Persisted replica status to STARTING I0307 13:18:53.180340 217223168 recover.cpp:473] Replica is in STARTING status I0307 13:18:53.180753 216686592 replica.cpp:673] Replica in STARTING status received a broadcasted recover request from (5)@127.0.0.1:49563 I0307 13:18:53.180891 218832896 recover.cpp:193] Received a recover response from a replica in STARTING status I0307 13:18:53.181082 216686592 
recover.cpp:564] Updating replica status to VOTING I0307 13:18:53.181246 218296320 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 100us I0307 13:18:53.181268 218296320 replica.cpp:320] Persisted replica status to VOTING I0307 13:18:53.181325 217223168 recover.cpp:578] Successfully joined the Paxos group I0307 13:18:53.181427 217223168 recover.cpp:462] Recover process terminated I0307 13:18:53.185133 218296320 master.cpp:375] Master af5d4df4-703d-46f9-b5f7-95826f86abcd (localhost) started on 127.0.0.1:49563 I0307 13:18:53.185169 218296320 master.cpp:377] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/private/tmp/BdBCVb/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/private/tmp/BdBCVb/master" --zk_session_timeout="10secs" W0307 13:18:53.185725 218296320 master.cpp:380] ** Master bound to loopback interface! Cannot communicate with remote schedulers or slaves. You might want to set '--ip' flag to a routable IP address. 
** I0307 13:18:53.185766 218296320 master.cpp:422] Master only allowing authenticated frameworks to register I0307 13:18:53.185778 218296320 master.cpp:427] Master only allowing authenticated slaves to register I0307 13:18:53.185784 218296320 credentials.hpp:35] Loading credentials for authentication from '/private/tmp/BdBCVb/credentials' I0307 13:18:53.186089 218296320 master.cpp:467] Using default 'crammd5' authenticator I0307 13:18:53.186130 218296320 authenticator.cpp:518] Initializing server SASL I0307 13:18:53.204093 218296320 master.cpp:536] Using default 'basic' HTTP authenticator I0307 13:18:53.204290 218296320 master.cpp:570] Authorization enabled I0307 13:18:53.207252 216686592 master.cpp:1711] The newly elected leader is master@127.0.0.1:49563 with id af5d4df4-703d-46f9-b5f7-95826f86abcd I0307 13:18:53.207278 216686592 master.cpp:1724] Elected as the leading master! I0307 13:18:53.207285 216686592 master.cpp:1469] Recovering from registrar
[jira] [Created] (MESOS-4889) Implement runtime isolator tests.
Gilbert Song created MESOS-4889: --- Summary: Implement runtime isolator tests. Key: MESOS-4889 URL: https://issues.apache.org/jira/browse/MESOS-4889 Project: Mesos Issue Type: Task Components: containerization Reporter: Gilbert Song Assignee: Gilbert Song There are different cases in the docker runtime isolator. Some special cases should be covered by dedicated test cases, to verify that the docker runtime isolator logic is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4813) Implement base tests for unified container using local puller.
[ https://issues.apache.org/jira/browse/MESOS-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gilbert Song updated MESOS-4813: Summary: Implement base tests for unified container using local puller. (was: Implement base tests for unified container using local registry.) > Implement base tests for unified container using local puller. > -- > > Key: MESOS-4813 > URL: https://issues.apache.org/jira/browse/MESOS-4813 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Gilbert Song >Assignee: Gilbert Song > Labels: containerizer > > Using command line executor to test shell commands with local docker images. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4888) Default cmd is executed as an incorrect command.
Gilbert Song created MESOS-4888: --- Summary: Default cmd is executed as an incorrect command. Key: MESOS-4888 URL: https://issues.apache.org/jira/browse/MESOS-4888 Project: Mesos Issue Type: Bug Components: containerization Reporter: Gilbert Song Assignee: Gilbert Song When the mesos containerizer launches a container from a docker image that only contains a default Cmd, the executable command is built in an incorrect sequence. For example: if an image's default entrypoint is null, its cmd is "sh", and the user defines shell=false, value as none, and arguments as [-c, echo 'hello world'], the executable command is `[sh, -c, echo 'hello world', sh]`, which is incorrect. It should be `[sh, sh, -c, echo 'hello world']` instead. This problem is only exposed for the case: sh=0, value=0, argv=1, entrypoint=0, cmd=1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
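The expected ordering for the reported case can be sketched as a small helper. This is an illustration, not Mesos code; it only covers the case called out above, under the assumption that argv[0] should repeat the executable name, as in POSIX execlp:

```python
def effective_command(entrypoint, cmd, shell, value, arguments):
    """Sketch of the expected command sequence for the reported case:
    shell=false, value=none, entrypoint=null, default Cmd present."""
    if shell:
        # shell mode: the value is run via the shell, arguments ignored.
        return ["/bin/sh", "-c", value]
    # The executable comes from 'value' if set, else the image default.
    executable = value or (entrypoint or cmd)[0]
    # argv[0] repeats the executable; user arguments follow it,
    # rather than preceding it as in the buggy sequence.
    return [executable, executable] + list(arguments or [])
```

For the example above this yields `['sh', 'sh', '-c', "echo 'hello world'"]`, matching the sequence the description says is correct.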
[jira] [Updated] (MESOS-4126) Construct the error string in `MethodNotAllowed`.
[ https://issues.apache.org/jira/browse/MESOS-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4126: --- Shepherd: Alexander Rukletsov > Construct the error string in `MethodNotAllowed`. > - > > Key: MESOS-4126 > URL: https://issues.apache.org/jira/browse/MESOS-4126 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Jacob Janco > Labels: http, mesosphere, newbie++ > > Consider constructing the error string in {{MethodNotAllowed}} rather than at > the invocation site. Currently we want all error messages to follow the same > pattern, so instead of writing > {code} > return MethodNotAllowed({"POST"}, "Expecting 'POST', received '" + > request.method + "'"); > {code} > we can write something like > {code} > MethodNotAllowed({"POST"}, request.method) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
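The proposal above (build the message inside the helper so every call site produces the same pattern) can be sketched in Python; the names and return shape here are hypothetical, since the real helper is a libprocess `http::Response`:

```python
def method_not_allowed(allowed, received):
    # Build the canonical error string inside the response helper, so
    # call sites only pass the allowed methods and what actually arrived.
    expected = ", ".join(sorted("'%s'" % m for m in allowed))
    return {
        "code": 405,
        "headers": {"Allow": ", ".join(sorted(allowed))},
        "body": "Expecting one of %s, received '%s'" % (expected, received),
    }
```

A call site then shrinks to `method_not_allowed({"POST"}, request_method)`, mirroring the shorter invocation proposed in the ticket.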
[jira] [Commented] (MESOS-4883) Add agent ID to agent state endpoint
[ https://issues.apache.org/jira/browse/MESOS-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183676#comment-15183676 ] Vinod Kone commented on MESOS-4883: --- Can you provide more context/motivation? > Add agent ID to agent state endpoint > > > Key: MESOS-4883 > URL: https://issues.apache.org/jira/browse/MESOS-4883 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > I would like to have the slave ID exposed on the slave before any tasks are > running on the slave on the state.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4887) Design doc for Slave/Agent rename
Vinod Kone created MESOS-4887: - Summary: Design doc for Slave/Agent rename Key: MESOS-4887 URL: https://issues.apache.org/jira/browse/MESOS-4887 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Diana Arroyo Design doc: https://docs.google.com/document/d/1P8_4wdk29I6NoVTjbFkRl05-tfxV9PY4WLoRNvExupM/edit#heading=h.9g7fqjh6652v -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4886) Support mesos containerizer force_pull_image option.
Gilbert Song created MESOS-4886: --- Summary: Support mesos containerizer force_pull_image option. Key: MESOS-4886 URL: https://issues.apache.org/jira/browse/MESOS-4886 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Gilbert Song Assignee: Gilbert Song Currently for the unified containerizer, images that are already cached by the metadata manager cannot be updated. The user has to delete the corresponding images in the store if an update is needed. We should support a `force_pull_image` option for the unified containerizer, to allow overriding a cached image if one exists. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
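The intended semantics can be modeled in a few lines. Class and method names below are hypothetical, not the actual store API:

```python
class ImageStore:
    """Toy cache illustrating force_pull_image: a cached entry is
    reused unless the caller explicitly asks to re-pull."""

    def __init__(self, puller):
        self.puller = puller  # callable: image name -> pulled image
        self.cache = {}       # stands in for the metadata manager

    def get(self, name, force_pull=False):
        # force_pull bypasses the cache and refreshes the stored entry.
        if force_pull or name not in self.cache:
            self.cache[name] = self.puller(name)
        return self.cache[name]
```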
[jira] [Created] (MESOS-4885) Unzip should force overwrite
Tomasz Janiszewski created MESOS-4885: - Summary: Unzip should force overwrite Key: MESOS-4885 URL: https://issues.apache.org/jira/browse/MESOS-4885 Project: Mesos Issue Type: Bug Components: fetcher Reporter: Tomasz Janiszewski Priority: Trivial When the fetcher downloads a malformed zip file that contains duplicated files (e.g., dist zips generated by gradle can have duplicated files in the libs dir) and tries to uncompress it, the deployment hangs in the staged phase because unzip prompts whether the file should be replaced. unzip should overwrite the file or fail with an error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
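The duplicate-entry situation can be reproduced with Python's zipfile module; unlike interactive `unzip`, `zipfile.extractall` silently lets later entries overwrite earlier ones, which is the non-blocking behavior the report asks for (on the command line, `unzip -o` has the same effect):

```python
import io
import os
import tempfile
import warnings
import zipfile

# Build a zip containing the same path twice, as a malformed dist zip
# generated by gradle might.
buf = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # zipfile warns about duplicate names
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("libs/a.txt", "first copy")
        zf.writestr("libs/a.txt", "second copy")

# Extract without any prompt: the last duplicate wins.
dest = tempfile.mkdtemp()
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    zf.extractall(dest)

with open(os.path.join(dest, "libs", "a.txt")) as f:
    content = f.read()
```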
[jira] [Comment Edited] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183425#comment-15183425 ] Robert Brockbank edited comment on MESOS-4370 at 3/7/16 6:41 PM: - I think getting better support for the --net option is a separate issue. At the moment, service discovery/DNS does not work with Docker 1.10 because of the relocation of the IP field in the inspection data. This issue *does* resolve that and I don't think we should be holding off on getting this into a release. It is not necessary to wait until we have improved --net support. As it stands today, specifying an additional --net option does work as a mechanism for using user-defined networks, and people are using it. I agree that the UX isn't ideal and we should aim to improve that, but DNS is actually broken and we do have a simple fix for that which could go in. was (Author: robbrockb...@gmail.com): I think getting better support for the --net option is a separate issue. At the moment, service discovery/DNS does not work with Docker 1.10 because of the relocation of the IP field in the inspection data. This issue *does* resolve that and I don't think we should be holding off on getting this into a release. It is not necessary to wait until we have improved --net support. As it stands the --net option does work today as a mechanism for using user-defined networks, and people are using it. I agree that the UX isn't ideal and we should aim to improve that, but DNS is actually broken and we do have a simple fix for that which could go in. 
> NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
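A sketch of reading the container IP from `docker inspect` output in a way that tolerates both layouts described above. The field names follow the Docker remote API; the helper itself is illustrative, not Mesos code:

```python
def container_ip(inspect_data):
    # Pre-1.21 daemons populate NetworkSettings.IPAddress; newer ones
    # leave it empty and put per-network addresses under
    # NetworkSettings.Networks.<name>.IPAddress.
    settings = inspect_data.get("NetworkSettings", {})
    ip = settings.get("IPAddress")
    if ip:
        return ip
    for network in settings.get("Networks", {}).values():
        ip = network.get("IPAddress")
        if ip:
            return ip
    return None
```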
[jira] [Created] (MESOS-4884) Ensure task_status timestamp is monotonic
Sargun Dhillon created MESOS-4884: - Summary: Ensure task_status timestamp is monotonic Key: MESOS-4884 URL: https://issues.apache.org/jira/browse/MESOS-4884 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Critical In state.json the task status has a timestamp associated with it. From my understanding, the timestamp is when the task status update was generated. The slave guarantees that the list is sorted and that the first item of the list is the newest status. This becomes a problem if someone is independently consuming the task status updates -- without the logic in the slave, we cannot determine the current state of the task. There exists a timestamp on the task. I would like the executor (API) to ensure that this timestamp is strictly monotonic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
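One way to provide the requested guarantee is to clamp the wall clock against the last emitted value, so timestamps keep strictly increasing even if the system clock stalls or steps backwards between status updates. A sketch under that assumption (not executor API):

```python
import time

class MonotonicStamper:
    """Emit strictly increasing timestamps for successive status
    updates, regardless of wall-clock behavior."""

    def __init__(self, epsilon=1e-6):
        self.last = float("-inf")
        self.epsilon = epsilon

    def next(self):
        # Take the wall clock, but never go backwards or stand still.
        now = time.time()
        self.last = max(now, self.last + self.epsilon)
        return self.last
```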
[jira] [Commented] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183423#comment-15183423 ] Dan Osborne commented on MESOS-4370: I don't believe this is entirely true. Regardless of whether the launched container is using a user-defined network or a regular docker-networked container, the place where Mesos expects to find the IP has been moved in the docker api. Your fix addresses that, and restores Mesos' ability to get the IP. Though r/42516 also concerns networking, I don't think it should prevent this from getting merged, as this will restore DNS and service discovery in Mesos for Docker 1.10+ > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4883) Add agent ID to agent state endpoint
Sargun Dhillon created MESOS-4883: - Summary: Add agent ID to agent state endpoint Key: MESOS-4883 URL: https://issues.apache.org/jira/browse/MESOS-4883 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Minor I would like to have the slave ID exposed on the slave before any tasks are running on the slave on the state.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183395#comment-15183395 ] Travis Hegner commented on MESOS-4370: -- Thank you [~robbrockb...@gmail.com] for your testing and interest in this patch. I've discovered that this patch only works out of pure luck in the way docker interprets multiple "--net" parameters. I have been stalling this patch as it will have to be reworked to account for official user-defined network support in mesos, via https://reviews.apache.org/r/42516/. I'd be happy to get a working fix merged in myself, but would prefer it be based on the patch linked above. > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183358#comment-15183358 ] haosdent commented on MESOS-4370: - +1 for this. Docker 1.10.0 has been released for more than a month. > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183319#comment-15183319 ] Robert Brockbank edited comment on MESOS-4370 at 3/7/16 5:34 PM: - Is it possible to get this fix in the latest patch (0.28?). At the moment MesosDNS is not working when using the Docker Containerizer with Docker 1.10.1. We've tested a patch containing this fix and IP discovery and MesosDNS both then work as expected. Really keen to get this in a patch as soon as possible. was (Author: robbrockb...@gmail.com): Is it possible to get this fix in the latest patch (0.28?). At the moment MesosDNS is not working when using the Docker Containerizer with Docker 1.10.1 w(with user defined networks). We've tested a patch containing this fix and IP discovery and MesosDNS both then work as expected. Really keen to get this in a patch as soon as possible. > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4370) NetworkSettings.IPAddress field is deprecated in Docker
[ https://issues.apache.org/jira/browse/MESOS-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183319#comment-15183319 ] Robert Brockbank commented on MESOS-4370: - Is it possible to get this fix in the latest patch (0.28?). At the moment MesosDNS is not working when using the Docker Containerizer with Docker 1.10.1 (with user-defined networks). We've tested a patch containing this fix and IP discovery and MesosDNS both then work as expected. Really keen to get this in a patch as soon as possible. > NetworkSettings.IPAddress field is deprecated in Docker > --- > > Key: MESOS-4370 > URL: https://issues.apache.org/jira/browse/MESOS-4370 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0, 0.26.0, 0.27.0 > Environment: Ubuntu 14.04 > Docker 1.9.1 >Reporter: Clint Armstrong >Assignee: Travis Hegner > > The latest docker API deprecates the NetworkSettings.IPAddress field, in > favor of the NetworkSettings.Networks field. > https://docs.docker.com/engine/reference/api/docker_remote_api/#v1-21-api-changes > With this deprecation, NetworkSettings.IPAddress is not populated for > containers running with networks that use new network plugins. > As a result the mesos API has no data in > container_status.network_infos.ip_address or > container_status.network_infos.ipaddresses. > The immediate impact of this is that mesos-dns is unable to retrieve a > containers IP from the netinfo interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4279) Graceful restart of docker task
[ https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183205#comment-15183205 ] Martin Bydzovsky commented on MESOS-4279: - Are you sure, [~qianzhang], that you tried exactly {{vagrant up}} and then restarted the app (marathon api/ui)? Because now I started digging, adding custom logs in the mesos codebase and recompiling it over and over. And to me, the code seems like it has never worked. https://github.com/apache/mesos/blob/0.26.0/src/docker/executor.cpp#L219 - immediately after calling docker->stop (with the correct value btw - as I've inspected) you set {{killed=true}} and then, in the {{reaped}} method (which gets called immediately), you check the {{killed}} flag and send a wrong TASK_KILLED status update: https://github.com/apache/mesos/blob/0.26.0/src/docker/executor.cpp#L281. Finally, https://github.com/apache/mesos/blob/0.26.0/src/docker/executor.cpp#L308 stops the whole driver - I'm not sure yet what that really means - but if that's the parent process of the docker executor, then it will kill the {{docker run}} process in a cascade. > Graceful restart of docker task > --- > > Key: MESOS-4279 > URL: https://issues.apache.org/jira/browse/MESOS-4279 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0 >Reporter: Martin Bydzovsky >Assignee: Qian Zhang > > I'm implementing a graceful restarts of our mesos-marathon-docker setup and I > came to a following issue: > (it was already discussed on > https://github.com/mesosphere/marathon/issues/2876 and guys form mesosphere > got to a point that its probably a docker containerizer problem...)
> To sum it up: > When i deploy simple python script to all mesos-slaves: > {code} > #!/usr/bin/python > from time import sleep > import signal > import sys > import datetime > def sigterm_handler(_signo, _stack_frame): > print "got %i" % _signo > print datetime.datetime.now().time() > sys.stdout.flush() > sleep(2) > print datetime.datetime.now().time() > print "ending" > sys.stdout.flush() > sys.exit(0) > signal.signal(signal.SIGTERM, sigterm_handler) > signal.signal(signal.SIGINT, sigterm_handler) > try: > print "Hello" > i = 0 > while True: > i += 1 > print datetime.datetime.now().time() > print "Iteration #%i" % i > sys.stdout.flush() > sleep(1) > finally: > print "Goodbye" > {code} > and I run it through Marathon like > {code:javascript} > data = { > args: ["/tmp/script.py"], > instances: 1, > cpus: 0.1, > mem: 256, > id: "marathon-test-api" > } > {code} > During the app restart I get expected result - the task receives sigterm and > dies peacefully (during my script-specified 2 seconds period) > But when i wrap this python script in a docker: > {code} > FROM node:4.2 > RUN mkdir /app > ADD . /app > WORKDIR /app > ENTRYPOINT [] > {code} > and run appropriate application by Marathon: > {code:javascript} > data = { > args: ["./script.py"], > container: { > type: "DOCKER", > docker: { > image: "bydga/marathon-test-api" > }, > forcePullImage: yes > }, > cpus: 0.1, > mem: 256, > instances: 1, > id: "marathon-test-api" > } > {code} > The task during restart (issued from marathon) dies immediately without > having a chance to do any cleanup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
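The flow described in the comment above can be modeled in a few lines. This is a simplified illustration of the reported behavior; the names mirror executor.cpp but this is not Mesos code:

```python
class DockerExecutorSketch:
    """Models the reported behavior: killTask sets `killed` before the
    container exits, so reaped() reports TASK_KILLED even when the task
    handled SIGTERM and exited cleanly within its grace period."""

    def __init__(self):
        self.killed = False
        self.updates = []

    def kill_task(self):
        # docker->stop(...) would be issued here; the flag is set
        # immediately, not when the stop actually completes.
        self.killed = True

    def reaped(self, exit_status):
        if self.killed:
            state = "TASK_KILLED"
        elif exit_status == 0:
            state = "TASK_FINISHED"
        else:
            state = "TASK_FAILED"
        self.updates.append(state)
```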
[jira] [Updated] (MESOS-4818) Add end to end testing for Appc images.
[ https://issues.apache.org/jira/browse/MESOS-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jojy Varghese updated MESOS-4818: - Sprint: Mesosphere Sprint 30 > Add end to end testing for Appc images. > --- > > Key: MESOS-4818 > URL: https://issues.apache.org/jira/browse/MESOS-4818 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jojy Varghese >Assignee: Jojy Varghese > Labels: mesosphere, unified-containerizer-mvp > > Add tests that covers integration test of the Appc provisioner feature with > mesos containerizer. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3815) docker executor not works when SSL enable
[ https://issues.apache.org/jira/browse/MESOS-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182920#comment-15182920 ] Kevin Cox commented on MESOS-3815: -- It's also worth noting that this affects non-docker executors as well. > docker executor not works when SSL enable > - > > Key: MESOS-3815 > URL: https://issues.apache.org/jira/browse/MESOS-3815 > Project: Mesos > Issue Type: Bug >Reporter: haosdent >Assignee: haosdent > Labels: docker, encryption, mesosphere, security, ssl > > Because the docker executor does not pass SSL-related environment variables, > mesos-docker-executor does not work normally when SSL is enabled. More details > can be found in http://search-hadoop.com/m/0Vlr6DsslDSvVs72 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4279) Graceful restart of docker task
[ https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182864#comment-15182864 ] AoJ commented on MESOS-4279: hi [~qianzhang], I ran into the same problem. I have a clean ubuntu 14.04. Do you have any idea why this is happening and where the problem could be? I tried to use the attached vagrantfile uploaded by [~bydga] and it didn't work either. Tomas > Graceful restart of docker task > --- > > Key: MESOS-4279 > URL: https://issues.apache.org/jira/browse/MESOS-4279 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0 >Reporter: Martin Bydzovsky >Assignee: Qian Zhang > > I'm implementing a graceful restarts of our mesos-marathon-docker setup and I > came to a following issue: > (it was already discussed on > https://github.com/mesosphere/marathon/issues/2876 and guys form mesosphere > got to a point that its probably a docker containerizer problem...) > To sum it up: > When i deploy simple python script to all mesos-slaves: > {code} > #!/usr/bin/python > from time import sleep > import signal > import sys > import datetime > def sigterm_handler(_signo, _stack_frame): > print "got %i" % _signo > print datetime.datetime.now().time() > sys.stdout.flush() > sleep(2) > print datetime.datetime.now().time() > print "ending" > sys.stdout.flush() > sys.exit(0) > signal.signal(signal.SIGTERM, sigterm_handler) > signal.signal(signal.SIGINT, sigterm_handler) > try: > print "Hello" > i = 0 > while True: > i += 1 > print datetime.datetime.now().time() > print "Iteration #%i" % i > sys.stdout.flush() > sleep(1) > finally: > print "Goodbye" > {code} > and I run it through Marathon like > {code:javascript} > data = { > args: ["/tmp/script.py"], > instances: 1, > cpus: 0.1, > mem: 256, > id: "marathon-test-api" > } > {code} > During the app restart I get expected result - the task receives sigterm and > dies peacefully (during my script-specified 2 seconds period) > But when i 
wrap this python script in a docker: > {code} > FROM node:4.2 > RUN mkdir /app > ADD . /app > WORKDIR /app > ENTRYPOINT [] > {code} > and run appropriate application by Marathon: > {code:javascript} > data = { > args: ["./script.py"], > container: { > type: "DOCKER", > docker: { > image: "bydga/marathon-test-api" > }, > forcePullImage: yes > }, > cpus: 0.1, > mem: 256, > instances: 1, > id: "marathon-test-api" > } > {code} > The task during restart (issued from marathon) dies immediately without > having a chance to do any cleanup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3072) Unify initialization of modularized components
[ https://issues.apache.org/jira/browse/MESOS-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3072: -- Fix Version/s: 0.27.0 > Unify initialization of modularized components > -- > > Key: MESOS-3072 > URL: https://issues.apache.org/jira/browse/MESOS-3072 > Project: Mesos > Issue Type: Improvement > Components: modules >Affects Versions: 0.22.0, 0.22.1, 0.23.0 >Reporter: Alexander Rojas >Assignee: Alexander Rojas > Labels: mesosphere > Fix For: 0.27.0 > > > h1.Introduction > As it stands right now, default implementations of modularized components are > required to have a non parametrized {{create()}} static method. This allows > to write tests which can cover default implementations and modules based on > these default implementations on a uniform way. > For example, with the interface {{Foo}}: > {code} > class Foo { > public: > virtual ~Foo() {} > virtual Future hello() = 0; > protected: > Foo() {} > }; > {code} > With a default implementation: > {code} > class LocalFoo { > public: > Trycreate() { > return new Foo; > } > virtual Future hello() { > return 1; > } > }; > {code} > This allows to create typed tests which look as following: > {code} > typedef ::testing::Types tests::Module > > FooTestTypes; > TYPED_TEST_CASE(FooTest, FooTestTypes); > TYPED_TEST(FooTest, ATest) > { > Try foo = TypeParam::create(); > ASSERT_SOME(foo); > AWAIT_CHECK_EQUAL(foo.get()->hello(), 1); > } > {code} > The test will be applied to each of types in the template parameters of > {{FooTestTypes}}. This allows to test different implementation of an > interface. In our code, it tests default implementations and a module which > uses the same default implementation. > The class {{tests::Module}} needs a little > explanation, it is a wrapper around {{ModuleManager}} which allows the tests > to encode information about the requested module in the type itself instead > of passing a string to the factory method. 
The wrapper around create, the > real important method looks as follows: > {code} > template > static Try test::Module ::create() > { > Try moduleName = getModuleName(N); > if (moduleName.isError()) { > return Error(moduleName.error()); > } > return mesos::modules::ModuleManager::create(moduleName.get()); > } > {code} > h1.The Problem > Consider the following implementation of {{Foo}}: > {code} > class ParameterFoo { > public: > Try create(int i) { > return new ParameterFoo(i); > } > ParameterFoo(int i) : i_(i) {} > virtual Future hello() { > return i; > } > private: > int i_; > }; > {code} > As it can be seen, this implementation cannot be used as a default > implementation since its create API does not match the one of > {{test::Module<>}}: {{create()}} has a different signature for both types. It > is still a common situation to require initialization parameters for objects, > however this constraint (keeping both interfaces alike) forces default > implementations of modularized components to have default constructors, > therefore the tests are forcing the design of the interfaces. > Implementations which are supposed to be used as modules only, i.e. non > default implementations are allowed to have constructor parameters, since the > actual signature of their factory method is, this factory method's function > is to decode the parameters and call the appropriate constructor: > {code} > template > T* Module::create(const Parameters& params); > {code} > where parameters is just an array of key-value string pairs whose > interpretation is left to the specific module. Sadly, this call is wrapped by > {{ModuleManager}} which only allows module parameters to be passed from the > command line and does not offer a programmatic way to feed construction > parameters to modules. 
> h1.The Ugly Workaround > With the requirement of a default constructor and parameters devoid > {{create()}} factory function, a common pattern (see > [Authenticator|https://github.com/apache/mesos/blob/9d4ac11ed757aa5869da440dfe5343a61b07199a/include/mesos/authentication/authenticator.hpp]) > has been introduced to feed construction parameters into default > implementation, this leads to adding an {{initialize()}} call to the public > interface, which will have {{Foo}} become: > {code} > class Foo { > public: > virtual ~Foo() {} > virtual Try initialize(Option i) = 0; > virtual Future hello() = 0; > protected: > Foo() {} > }; > {code} > {{ParameterFoo}} will thus look as follows: > {code} > class ParameterFoo { > public: > Try create() { > return new ParameterFoo; > } > ParameterFoo() : i_(None()) {} > virtual Try
[jira] [Commented] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182781#comment-15182781 ] Adam B commented on MESOS-4772: --- 2b. Mesos authenticates the user accessing Mesos http endpoints, which may or may not be the same user accessing the framework's http UI to request that the framework launch a task on behalf of the user. Mesos authenticates the framework prior to its registration, but has no way (unless the framework tells it) to know which user launches a particular task. 4. Only an individual framework can authenticate and authorize users of its own UI. Mesos cannot intercept at this point, especially not without the framework's assistance. This ticket is about enabling frameworks to provide this information to Mesos on task launch, so that Mesos can later make authorization decisions based on this information (separate tickets). 5. `FrameworkInfo.user` is not necessarily related to user of the framework's UI (or Mesos' UI). It is the linux user which the framework's tasks will run as (see `RunTask` ACL) by default, if no `CommandInfo.user` is specified for the task/executor. Consider that Alice and Bob may both want to use the Hadoop framework to run tasks as the `hadoop` user. > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B >Assignee: Jan Schlicht > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). 
Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4882) Enabled mesos-execute treat command as executable value and arguments.
Guangya Liu created MESOS-4882: -- Summary: Enabled mesos-execute treat command as executable value and arguments. Key: MESOS-4882 URL: https://issues.apache.org/jira/browse/MESOS-4882 Project: Mesos Issue Type: Bug Reporter: Guangya Liu Assignee: Guangya Liu The CommandInfo supports two kinds of commands: {code} // There are two ways to specify the command: // 1) If 'shell == true', the command will be launched via shell // (i.e., /bin/sh -c 'value'). The 'value' specified will be // treated as the shell command. The 'arguments' will be ignored. // 2) If 'shell == false', the command will be launched by passing // arguments to an executable. The 'value' specified will be // treated as the filename of the executable. The 'arguments' // will be treated as the arguments to the executable. This is // similar to how POSIX exec families launch processes (i.e., // execlp(value, arguments(0), arguments(1), ...)). {code} mesos-execute cannot handle 2) now; enabling 2) can help some unit tests with isolators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
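The two CommandInfo modes map directly onto how a process would be launched. A sketch using Python's subprocess rather than mesos-execute itself (POSIX assumed; the dict shape is illustrative):

```python
import subprocess

def launch(command):
    """Launch per the CommandInfo semantics quoted above.
    Mode 1 (shell=true): run 'value' via /bin/sh -c; ignore arguments.
    Mode 2 (shell=false): exec 'value' with 'arguments' as argv, where
    arguments[0] is conventionally the program name itself."""
    if command.get("shell", True):
        return subprocess.run(["/bin/sh", "-c", command["value"]],
                              capture_output=True, text=True)
    return subprocess.run(command["arguments"],
                          executable=command["value"],
                          capture_output=True, text=True)
```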