[jira] [Updated] (MESOS-6569) MesosContainerizer/DefaultExecutorTest.KillTask/0 failing on ASF CI
[ https://issues.apache.org/jira/browse/MESOS-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-6569: Sprint: Mesosphere Sprint 47 Labels: flaky mesosphere newbie (was: flaky newbie) > MesosContainerizer/DefaultExecutorTest.KillTask/0 failing on ASF CI > --- > > Key: MESOS-6569 > URL: https://issues.apache.org/jira/browse/MESOS-6569 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.1.0 > Environment: > https://builds.apache.org/job/Mesos/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-6)&&(!ubuntu-eu2)/ >Reporter: Yan Xu >Assignee: Benjamin Bannier > Labels: flaky, mesosphere, newbie > Fix For: 1.2.0 > > > {noformat:title=} > [ RUN ] MesosContainerizer/DefaultExecutorTest.KillTask/0 > I1110 01:20:11.482097 29700 cluster.cpp:158] Creating default 'local' > authorizer > I1110 01:20:11.485241 29700 leveldb.cpp:174] Opened db in 2.774513ms > I1110 01:20:11.486237 29700 leveldb.cpp:181] Compacted db in 953614ns > I1110 01:20:11.486299 29700 leveldb.cpp:196] Created db iterator in 24739ns > I1110 01:20:11.486325 29700 leveldb.cpp:202] Seeked to beginning of db in > 2300ns > I1110 01:20:11.486344 29700 leveldb.cpp:271] Iterated through 0 keys in the > db in 378ns > I1110 01:20:11.486399 29700 replica.cpp:776] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1110 01:20:11.486933 29733 recover.cpp:451] Starting replica recovery > I1110 01:20:11.487289 29733 recover.cpp:477] Replica is in EMPTY status > I1110 01:20:11.488503 29721 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from __req_res__(7318)@172.17.0.3:52462 > I1110 01:20:11.488855 29727 recover.cpp:197] Received a recover response from > a replica in EMPTY status > I1110 01:20:11.489398 29729 recover.cpp:568] Updating replica status to > STARTING > I1110 01:20:11.490223 29723 
leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 575135ns > I1110 01:20:11.490284 29732 master.cpp:380] Master > d28fbae1-c3dc-45fa-8384-32ab9395a975 (3a31be8bf679) started on > 172.17.0.3:52462 > I1110 01:20:11.490317 29732 master.cpp:382] Flags at startup: --acls="" > --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate_agents="true" --authenticate_frameworks="true" > --authenticate_http_frameworks="true" --authenticate_http_readonly="true" > --authenticate_http_readwrite="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/k50x7x/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --http_authenticators="basic" --http_framework_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" > --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" > --quiet="false" --recovery_agent_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" > --registry_max_agent_count="102400" --registry_store_timeout="100secs" > --registry_strict="false" --root_submissions="true" --user_sorter="drf" > --version="false" --webui_dir="/mesos/mesos-1.2.0/_inst/share/mesos/webui" > --work_dir="/tmp/k50x7x/master" --zk_session_timeout="10secs" > I1110 01:20:11.490696 29732 master.cpp:432] Master only allowing > authenticated frameworks to register > I1110 01:20:11.490712 29732 master.cpp:446] Master only allowing > authenticated agents to register > I1110 01:20:11.490720 29732 master.cpp:459] Master only allowing > authenticated HTTP frameworks to register > I1110 01:20:11.490730 29732 credentials.hpp:37] Loading credentials for > authentication from '/tmp/k50x7x/credentials' > I1110 01:20:11.490281 29723 
replica.cpp:320] Persisted replica status to > STARTING > I1110 01:20:11.491210 29732 master.cpp:504] Using default 'crammd5' > authenticator > I1110 01:20:11.491225 29720 recover.cpp:477] Replica is in STARTING status > I1110 01:20:11.491394 29732 http.cpp:895] Using default 'basic' HTTP > authenticator for realm 'mesos-master-readonly' > I1110 01:20:11.491621 29732 http.cpp:895] Using default 'basic' HTTP > authenticator for realm 'mesos-master-readwrite' > I1110 01:20:11.491770 29732 http.cpp:895] Using default 'basic' HTTP > authenticator for realm 'mesos-master-scheduler' > I1110 01:20:11.491937 29732 master.cpp:584] Authorization enabled > I1110 01:20:11.492276 29725 whitelist_watcher.cpp:77] No whitelist given > I1110
[jira] [Updated] (MESOS-6672) Class DynamicLibrary's default copy constructor can lead to inconsistent state
[ https://issues.apache.org/jira/browse/MESOS-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-6672: Sprint: Mesosphere Sprint 47 > Class DynamicLibrary's default copy constructor can lead to inconsistent state > -- > > Key: MESOS-6672 > URL: https://issues.apache.org/jira/browse/MESOS-6672 > Project: Mesos > Issue Type: Bug > Components: stout >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: mesosphere, tech-debt > Fix For: 1.2.0 > > > The class {{DynamicLibrary}} provides an RAII wrapper around a low-level > handle to a loaded library. Currently it supports copy and move construction, > which can leave two wrappers holding handles to the same library. This > can, e.g., lead to a library being unloaded while other wrappers still hold > handles to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
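The hazard described in MESOS-6672 can be illustrated with a minimal sketch. This is not the stout implementation: `FakeLoader` and `LibraryHandle` are hypothetical names, and the loader is a counting stand-in for `dlopen`/`dlclose` so the example runs without linking against a real library. The sketch deletes the copy operations and makes the move operations transfer ownership explicitly; deleting the move operations as well would be another valid choice.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Stand-in for a low-level loader API (hypothetical; a real wrapper would
// call dlopen/dlclose). It only counts how many handles are currently open.
struct FakeLoader
{
  static int& openCount() { static int count = 0; return count; }
  static void* open() { ++openCount(); return &openCount(); }
  static void close(void*) { assert(openCount() > 0); --openCount(); }
};

// RAII wrapper that owns exactly one handle. Copying is disabled, so two
// wrappers can never hold (and later close) the same handle.
class LibraryHandle
{
public:
  LibraryHandle() : handle_(FakeLoader::open()) {}

  LibraryHandle(const LibraryHandle&) = delete;
  LibraryHandle& operator=(const LibraryHandle&) = delete;

  // Moving transfers ownership and leaves the source empty.
  LibraryHandle(LibraryHandle&& that) noexcept : handle_(that.handle_)
  {
    that.handle_ = nullptr;
  }

  LibraryHandle& operator=(LibraryHandle&& that) noexcept
  {
    if (this != &that) {
      reset();
      handle_ = that.handle_;
      that.handle_ = nullptr;
    }
    return *this;
  }

  ~LibraryHandle() { reset(); }

  bool loaded() const { return handle_ != nullptr; }

private:
  // Closes the handle if this wrapper still owns one.
  void reset()
  {
    if (handle_ != nullptr) {
      FakeLoader::close(handle_);
      handle_ = nullptr;
    }
  }

  void* handle_;
};

static_assert(!std::is_copy_constructible<LibraryHandle>::value,
              "LibraryHandle must not be copyable");

// Usage: after a move only one wrapper owns the handle, and the handle is
// closed exactly once when that wrapper goes out of scope.
inline void exampleLibraryHandle()
{
  {
    LibraryHandle first;
    LibraryHandle second(std::move(first));
    assert(!first.loaded());
    assert(second.loaded());
    assert(FakeLoader::openCount() == 1);
  }
  assert(FakeLoader::openCount() == 0);
}
```

With a compiler-generated copy constructor, both wrappers would keep the same raw pointer and each destructor would close it, which is the inconsistent state the ticket title refers to.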
[jira] [Updated] (MESOS-6646) StreamingRequestDecoder incompletely initializes its http_parser_settings
[ https://issues.apache.org/jira/browse/MESOS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-6646: Sprint: Mesosphere Sprint 47 Labels: coverity mesosphere (was: coverity) > StreamingRequestDecoder incompletely initializes its http_parser_settings > - > > Key: MESOS-6646 > URL: https://issues.apache.org/jira/browse/MESOS-6646 > Project: Mesos > Issue Type: Bug > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: coverity, mesosphere > Fix For: 1.2.0 > > > Coverity reports in CID1394703 at {{3rdparty/libprocess/src/decoder.hpp:767}}: > {code} > CID 1394703 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR) > 2. uninit_member: Non-static class member field settings.on_status is not > initialized in this constructor nor in any functions that it calls. > {code} > It seems like {{StreamingRequestDecoder}} should properly initialize its > member {{settings}}, e.g., with {{http_parser_settings_init}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
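The Coverity finding in MESOS-6646 comes down to a struct of callback pointers in which some members are never assigned. The sketch below shows one way to avoid that, assuming a hypothetical `ParserSettings` struct as a stand-in for `http_parser_settings` (the real struct has more members) and a hypothetical `Decoder` class. It uses C++ value-initialization instead of the `http_parser_settings_init` call suggested in the ticket; both approaches aim to leave unassigned callbacks null instead of indeterminate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a C-style callback table such as
// `http_parser_settings`; the real struct has more members.
struct ParserSettings
{
  int (*on_message_begin)(void*);
  int (*on_status)(void*, const char*, std::size_t);
  int (*on_message_complete)(void*);
};

inline int onMessageBegin(void*) { return 0; }
inline int onMessageComplete(void*) { return 0; }

class Decoder
{
public:
  // `settings()` value-initializes the struct, so every callback that the
  // constructor body does not assign is a null pointer rather than an
  // indeterminate value.
  Decoder() : settings()
  {
    settings.on_message_begin = &onMessageBegin;
    settings.on_message_complete = &onMessageComplete;
    // `on_status` is deliberately left unassigned: it stays nullptr.
  }

  bool handlesBegin() const { return settings.on_message_begin != nullptr; }
  bool handlesStatus() const { return settings.on_status != nullptr; }

private:
  ParserSettings settings;
};

// Usage: a freshly constructed decoder has no status callback installed.
inline void exampleDecoder()
{
  Decoder decoder;
  assert(decoder.handlesBegin());
  assert(!decoder.handlesStatus());
}
```

Without the `settings()` initializer, reading `settings.on_status` would be undefined behavior, which is what the `UNINIT_CTOR` report warns about.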
[jira] [Assigned] (MESOS-6685) Update Role::Resources to correctly account for multi-role frameworks
[ https://issues.apache.org/jira/browse/MESOS-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Guo reassigned MESOS-6685: -- Assignee: Jay Guo > Update Role::Resources to correctly account for multi-role frameworks > - > > Key: MESOS-6685 > URL: https://issues.apache.org/jira/browse/MESOS-6685 > Project: Mesos > Issue Type: Bug >Reporter: Guangya Liu >Assignee: Jay Guo > > With a single-role framework, when the get role endpoint is called, the master > returns for this role all of the resources of every framework that > uses the role. But with a multi-role framework, the get role endpoint > should only return the resources used under that one role, not those held > under the framework's other roles. > {code} > Resources resources() const > { > Resources resources; > foreachvalue (Framework* framework, frameworks) { > resources += framework->totalUsedResources; > resources += framework->totalOfferedResources; > } > return resources; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
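The accounting change requested in MESOS-6685 can be sketched as follows. This is only an illustration under simplifying assumptions, not the eventual patch: a single `double` replaces Mesos' `Resources` type, and the `usedByRole` and `offeredByRole` maps are hypothetical fields that track usage per role instead of per framework.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in (hypothetical): usage is keyed by role name, and a
// scalar replaces the real `Resources` type.
struct Framework
{
  std::map<std::string, double> usedByRole;
  std::map<std::string, double> offeredByRole;
};

struct Role
{
  std::string name;
  std::vector<const Framework*> frameworks;

  // Sums only what each framework holds under this role, instead of the
  // framework-wide totals.
  double resources() const
  {
    double total = 0.0;
    for (const Framework* framework : frameworks) {
      auto used = framework->usedByRole.find(name);
      if (used != framework->usedByRole.end()) {
        total += used->second;
      }
      auto offered = framework->offeredByRole.find(name);
      if (offered != framework->offeredByRole.end()) {
        total += offered->second;
      }
    }
    return total;
  }
};

// Usage: a framework subscribed to two roles contributes to each role only
// the share it holds under that role.
inline void exampleRoleResources()
{
  Framework multi;
  multi.usedByRole["analytics"] = 2.0;
  multi.usedByRole["web"] = 4.0;
  multi.offeredByRole["web"] = 1.0;

  Role web{"web", {&multi}};
  Role analytics{"analytics", {&multi}};

  assert(web.resources() == 5.0);
  assert(analytics.resources() == 2.0);
}
```

Summing the framework-wide totals, as the quoted snippet does, would report 7.0 for both roles in this example.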
[jira] [Updated] (MESOS-6379) Updated the navbar in webui
[ https://issues.apache.org/jira/browse/MESOS-6379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated MESOS-6379: Summary: Updated the navbar in webui (was: Updated webui to material style) > Updated the navbar in webui > --- > > Key: MESOS-6379 > URL: https://issues.apache.org/jira/browse/MESOS-6379 > Project: Mesos > Issue Type: Improvement > Components: webui >Reporter: haosdent >Assignee: haosdent > Labels: web > Attachments: material_style.gif > > > Refer to the [material style guideline | https://material.google.com/]. After > some simple hacks, I found it should not be too hard to update the current WebUI to > material style. > We could use this library > https://github.com/FezVrasta/bootstrap-material-design . It is MIT-licensed, > which is compatible with Apache 2.0 and is the same license as the Bootstrap library > already used in Mesos. > Document: > https://docs.google.com/document/d/12JVUq-_zAUwvzz46KJVYgpfhRCx-TiH_ajfZG5KrutE/edit?usp=sharing > The following screenshot shows the result after some changes to the current WebUI. > !material_style.gif|width=800! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6646) StreamingRequestDecoder incompletely initializes its http_parser_settings
[ https://issues.apache.org/jira/browse/MESOS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-6646: -- Shepherd: Anand Mazumdar > StreamingRequestDecoder incompletely initializes its http_parser_settings > - > > Key: MESOS-6646 > URL: https://issues.apache.org/jira/browse/MESOS-6646 > Project: Mesos > Issue Type: Bug > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: coverity > Fix For: 1.2.0 > > > Coverity reports in CID1394703 at {{3rdparty/libprocess/src/decoder.hpp:767}}: > {code} > CID 1394703 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR) > 2. uninit_member: Non-static class member field settings.on_status is not > initialized in this constructor nor in any functions that it calls. > {code} > It seems like {{StreamingRequestDecoder}} should properly initialize its > member {{settings}}, e.g., with {{http_parser_settings_init}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6688) IOSwitchboard should recover spawned server pid on agent restarts
[ https://issues.apache.org/jira/browse/MESOS-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724089#comment-15724089 ] Jie Yu commented on MESOS-6688: --- commit 4c80eaec0098c9664588fdd9b21cbc53e0822611 Author: Kevin Klues Date: Mon Dec 5 17:39:45 2016 -0800 Reorganized location of checkpointed files for the 'IOSwitchboard'. Review: https://reviews.apache.org/r/54401/ commit a1db860ac5a60887d7241da552d925158362bf8b Author: Kevin Klues Date: Mon Dec 5 17:39:42 2016 -0800 Added path helpers for checkpointing the io switchboard pid. Review: https://reviews.apache.org/r/54354/ commit 543533f09814b6ce7a7822acc3accb47379577a2 Author: Kevin Klues Date: Mon Dec 5 17:39:39 2016 -0800 Added short circuit for `local` mode in `IOSwitchboard::connect()'. Review: https://reviews.apache.org/r/54353/ > IOSwitchboard should recover spawned server pid on agent restarts > - > > Key: MESOS-6688 > URL: https://issues.apache.org/jira/browse/MESOS-6688 > Project: Mesos > Issue Type: Bug >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: debugging, mesosphere > > We need to do proper recovery of the io switchboard server pid across agent > restarts. As of now, if the agent restarts there is no way to recover this > pid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6724) The test "HTTPCommandExecutorTest.TerminateWithACK" is flaky
[ https://issues.apache.org/jira/browse/MESOS-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qian Zhang updated MESOS-6724: -- Component/s: test > The test "HTTPCommandExecutorTest.TerminateWithACK" is flaky > > > Key: MESOS-6724 > URL: https://issues.apache.org/jira/browse/MESOS-6724 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Qian Zhang > > It seems the test "HTTPCommandExecutorTest.TerminateWithACK" may fail when > the machine is under heavy load (e.g., using the “stress” utility to generate > load on the machine, like {{stress --cpu 4 --io 4 --timeout 120}}). > {code} > I1104 21:43:47.768609 31812 authenticator.cpp:98] Creating new server SASL > connection > I1104 21:43:47.768844 31812 authenticatee.cpp:213] Received SASL > authentication mechanisms: CRAM-MD5 > I1104 21:43:47.768874 31812 authenticatee.cpp:239] Attempting to authenticate > with mechanism 'CRAM-MD5' > I1104 21:43:47.768960 31812 authenticator.cpp:204] Received SASL > authentication start > I1104 21:43:47.769021 31812 authenticator.cpp:326] Authentication requires > more steps > I1104 21:43:47.769300 31812 authenticatee.cpp:259] Received SASL > authentication step > I1104 21:43:47.770079 31821 authenticator.cpp:232] Received SASL > authentication step > I1104 21:43:47.770108 31817 state.cpp:57] Recovering state from > '/tmp/HTTPCommandExecutorTest_TerminateWithACK_zuye4X/meta' > I1104 21:43:47.770146 31821 auxprop.cpp:109] Request to lookup properties for > user: 'test-principal' realm: 'b7fb1902101b' server FQDN: 'b7fb1902101b' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: false > I1104 21:43:47.770166 31821 auxprop.cpp:181] Looking up auxiliary property > '*userPassword' > I1104 21:43:47.770213 31821 auxprop.cpp:181] Looking up auxiliary property > '*cmusaslsecretCRAM-MD5' > I1104 21:43:47.770244 31821 auxprop.cpp:109] Request to lookup properties for > user: 'test-principal' realm: 'b7fb1902101b' 
server FQDN: 'b7fb1902101b' > SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false > SASL_AUXPROP_AUTHZID: true > I1104 21:43:47.770258 31821 auxprop.cpp:131] Skipping auxiliary property > '*userPassword' since SASL_AUXPROP_AUTHZID == true > I1104 21:43:47.770268 31821 auxprop.cpp:131] Skipping auxiliary property > '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true > I1104 21:43:47.770290 31821 authenticator.cpp:318] Authentication success > I1104 21:43:47.770388 31826 authenticatee.cpp:299] Authentication success > I1104 21:43:47.770429 31823 status_update_manager.cpp:203] Recovering status > update manager > I1104 21:43:47.770494 31820 master.cpp:6775] Successfully authenticated > principal 'test-principal' at > scheduler-3a4e1df5-e0b3-4373-937e-daf6914ba47d@172.17.0.3:50719 > I1104 21:43:47.770536 31815 authenticator.cpp:432] Authentication session > cleanup for crammd5-authenticatee(32)@172.17.0.3:50719 > I1104 21:43:47.771021 31814 sched.cpp:502] Successfully authenticated with > master master@172.17.0.3:50719 > I1104 21:43:47.771066 31814 sched.cpp:820] Sending SUBSCRIBE call to > master@172.17.0.3:50719 > I1104 21:43:47.771214 31814 sched.cpp:853] Will retry registration in > 384.083867ms if necessary > I1104 21:43:47.771524 31827 master.cpp:2612] Received SUBSCRIBE call for > framework 'default' at > scheduler-3a4e1df5-e0b3-4373-937e-daf6914ba47d@172.17.0.3:50719 > I1104 21:43:47.771775 31827 master.cpp:2069] Authorizing framework principal > 'test-principal' to receive offers for role '*' > I1104 21:43:47.772073 31822 containerizer.cpp:557] Recovering containerizer > I1104 21:43:47.772267 31827 master.cpp:2688] Subscribing framework default > with checkpointing disabled and capabilities [ ] > I1104 21:43:47.772873 31818 sched.cpp:743] Framework registered with > 266cf569-3b26-40cb-be6f-080699ef02a1- > I1104 21:43:47.772953 31824 hierarchical.cpp:275] Added framework > 266cf569-3b26-40cb-be6f-080699ef02a1- > I1104 21:43:47.773032 
31818 sched.cpp:757] Scheduler::registered took 22098ns > I1104 21:43:47.773084 31824 hierarchical.cpp:1694] No allocations performed > I1104 21:43:47.773125 31824 hierarchical.cpp:1789] No inverse offers to send > out! > I1104 21:43:47.773246 31824 hierarchical.cpp:1286] Performed allocation for 0 > agents in 226006ns > I1104 21:43:47.773977 31821 provisioner.cpp:253] Provisioner recovery complete > I1104 21:43:47.774346 31814 slave.cpp:5399] Finished recovery > I1104 21:43:47.788095 31814 slave.cpp:5573] Querying resource estimator for > oversubscribable resources > I1104 21:43:47.788635 31821 status_update_manager.cpp:177] Pausing sending > status updates > I1104 21:43:47.788642 31814 slave.cpp:915] New master detected at > master@172.17.0.3:50719 > I1104
[jira] [Created] (MESOS-6724) The test "HTTPCommandExecutorTest.TerminateWithACK" is flaky
Qian Zhang created MESOS-6724: - Summary: The test "HTTPCommandExecutorTest.TerminateWithACK" is flaky Key: MESOS-6724 URL: https://issues.apache.org/jira/browse/MESOS-6724 Project: Mesos Issue Type: Bug Reporter: Qian Zhang It seems the test "HTTPCommandExecutorTest.TerminateWithACK" may fail when the machine is under heavy load (e.g., using the “stress” utility to generate load on the machine, like {{stress --cpu 4 --io 4 --timeout 120}}). {code} I1104 21:43:47.768609 31812 authenticator.cpp:98] Creating new server SASL connection I1104 21:43:47.768844 31812 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I1104 21:43:47.768874 31812 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I1104 21:43:47.768960 31812 authenticator.cpp:204] Received SASL authentication start I1104 21:43:47.769021 31812 authenticator.cpp:326] Authentication requires more steps I1104 21:43:47.769300 31812 authenticatee.cpp:259] Received SASL authentication step I1104 21:43:47.770079 31821 authenticator.cpp:232] Received SASL authentication step I1104 21:43:47.770108 31817 state.cpp:57] Recovering state from '/tmp/HTTPCommandExecutorTest_TerminateWithACK_zuye4X/meta' I1104 21:43:47.770146 31821 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b7fb1902101b' server FQDN: 'b7fb1902101b' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I1104 21:43:47.770166 31821 auxprop.cpp:181] Looking up auxiliary property '*userPassword' I1104 21:43:47.770213 31821 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I1104 21:43:47.770244 31821 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b7fb1902101b' server FQDN: 'b7fb1902101b' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I1104 21:43:47.770258 31821 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since 
SASL_AUXPROP_AUTHZID == true I1104 21:43:47.770268 31821 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I1104 21:43:47.770290 31821 authenticator.cpp:318] Authentication success I1104 21:43:47.770388 31826 authenticatee.cpp:299] Authentication success I1104 21:43:47.770429 31823 status_update_manager.cpp:203] Recovering status update manager I1104 21:43:47.770494 31820 master.cpp:6775] Successfully authenticated principal 'test-principal' at scheduler-3a4e1df5-e0b3-4373-937e-daf6914ba47d@172.17.0.3:50719 I1104 21:43:47.770536 31815 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(32)@172.17.0.3:50719 I1104 21:43:47.771021 31814 sched.cpp:502] Successfully authenticated with master master@172.17.0.3:50719 I1104 21:43:47.771066 31814 sched.cpp:820] Sending SUBSCRIBE call to master@172.17.0.3:50719 I1104 21:43:47.771214 31814 sched.cpp:853] Will retry registration in 384.083867ms if necessary I1104 21:43:47.771524 31827 master.cpp:2612] Received SUBSCRIBE call for framework 'default' at scheduler-3a4e1df5-e0b3-4373-937e-daf6914ba47d@172.17.0.3:50719 I1104 21:43:47.771775 31827 master.cpp:2069] Authorizing framework principal 'test-principal' to receive offers for role '*' I1104 21:43:47.772073 31822 containerizer.cpp:557] Recovering containerizer I1104 21:43:47.772267 31827 master.cpp:2688] Subscribing framework default with checkpointing disabled and capabilities [ ] I1104 21:43:47.772873 31818 sched.cpp:743] Framework registered with 266cf569-3b26-40cb-be6f-080699ef02a1- I1104 21:43:47.772953 31824 hierarchical.cpp:275] Added framework 266cf569-3b26-40cb-be6f-080699ef02a1- I1104 21:43:47.773032 31818 sched.cpp:757] Scheduler::registered took 22098ns I1104 21:43:47.773084 31824 hierarchical.cpp:1694] No allocations performed I1104 21:43:47.773125 31824 hierarchical.cpp:1789] No inverse offers to send out! 
I1104 21:43:47.773246 31824 hierarchical.cpp:1286] Performed allocation for 0 agents in 226006ns I1104 21:43:47.773977 31821 provisioner.cpp:253] Provisioner recovery complete I1104 21:43:47.774346 31814 slave.cpp:5399] Finished recovery I1104 21:43:47.788095 31814 slave.cpp:5573] Querying resource estimator for oversubscribable resources I1104 21:43:47.788635 31821 status_update_manager.cpp:177] Pausing sending status updates I1104 21:43:47.788642 31814 slave.cpp:915] New master detected at master@172.17.0.3:50719 I1104 21:43:47.788817 31814 slave.cpp:974] Authenticating with master master@172.17.0.3:50719 I1104 21:43:47.788998 31814 slave.cpp:985] Using default CRAM-MD5 authenticatee I1104 21:43:47.789353 31821 authenticatee.cpp:121] Creating new client SASL connection I1104 21:43:47.789775 31821 master.cpp:6745] Authenticating (1)@172.17.0.3:50719 I1104 21:43:47.789908 31818 authenticator.cpp:414]
[jira] [Updated] (MESOS-6719) Unify "active" and "state"/"connected" fields in Master::Framework
[ https://issues.apache.org/jira/browse/MESOS-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-6719: --- Shepherd: Vinod Kone > Unify "active" and "state"/"connected" fields in Master::Framework > -- > > Key: MESOS-6719 > URL: https://issues.apache.org/jira/browse/MESOS-6719 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Neil Conway >Assignee: Neil Conway >Priority: Minor > Labels: mesosphere > > Rather than tracking whether a framework is "active" separately from whether > it is "connected", we should consider using a single "state" variable to > track the current state of the framework (connected-and-active, > connected-and-inactive, disconnected, etc.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
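The proposal in MESOS-6719 can be sketched as a single enumeration from which the two former booleans are derived. The names `FrameworkState`, `ACTIVE`, `INACTIVE`, and `DISCONNECTED` are assumptions for illustration; the ticket lists the combined states only informally and leaves the final set open ("etc.").

```cpp
#include <cassert>

// Hypothetical names for the combined states mentioned in the ticket.
enum class FrameworkState
{
  ACTIVE,        // connected and active
  INACTIVE,      // connected but inactive
  DISCONNECTED   // not connected
};

// The former "connected" flag, derived from the single state.
inline bool connected(FrameworkState state)
{
  return state != FrameworkState::DISCONNECTED;
}

// The former "active" flag, derived from the single state.
inline bool active(FrameworkState state)
{
  return state == FrameworkState::ACTIVE;
}

// Usage: every state maps to one consistent pair of flags.
inline void exampleFrameworkState()
{
  assert(connected(FrameworkState::ACTIVE));
  assert(active(FrameworkState::ACTIVE));
  assert(connected(FrameworkState::INACTIVE));
  assert(!active(FrameworkState::INACTIVE));
  assert(!connected(FrameworkState::DISCONNECTED));
  assert(!active(FrameworkState::DISCONNECTED));
}
```

One consequence of this design is that a combination such as "active but disconnected" cannot be represented at all, whereas two independent booleans allow it.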
[jira] [Updated] (MESOS-6723) Mesos fails to link using gold linker
[ https://issues.apache.org/jira/browse/MESOS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-6723: --- Description: Configure flags: {noformat} ../mesos/configure --disable-java --disable-python CC="ccache gcc" CXX="ccache g++" CXXFLAGS=-fuse-ld=gold CFLAGS=-fuse-ld=gold {noformat} Compile output: {noformat} /bin/sh ../libtool --tag=CXX --mode=link ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o mesos-local local/mesos_local-main.o libmesos.la -lz -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind libtool: link: ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o .libs/mesos-local local/mesos_local-main.o ./.libs/libmesos.so -lpthread -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind -pthread -Wl,-rpath -Wl,/usr/local/lib ./.libs/libmesos.so: error: undefined reference to 'dlerror' ./.libs/libmesos.so: error: undefined reference to 'dlclose' ./.libs/libmesos.so: error: undefined reference to 'dlopen' ./.libs/libmesos.so: error: undefined reference to 'dlsym' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:5139: mesos-local] Error 1 {noformat} was: Configure flags: {noformat} ../mesos/configure --disable-java --disable-python CC=ccache gcc CXX=ccache g++ CXXFLAGS=-fuse-ld=gold CFLAGS=-fuse-ld=gold {noformat} Compile output: {noformat} /bin/sh ../libtool --tag=CXX --mode=link ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o mesos-local local/mesos_local-main.o libmesos.la -lz -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind libtool: link: ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o .libs/mesos-local local/mesos_local-main.o ./.libs/libmesos.so -lpthread -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind -pthread -Wl,-rpath -Wl,/usr/local/lib 
./.libs/libmesos.so: error: undefined reference to 'dlerror' ./.libs/libmesos.so: error: undefined reference to 'dlclose' ./.libs/libmesos.so: error: undefined reference to 'dlopen' ./.libs/libmesos.so: error: undefined reference to 'dlsym' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:5139: mesos-local] Error 1 {noformat} > Mesos fails to link using gold linker > - > > Key: MESOS-6723 > URL: https://issues.apache.org/jira/browse/MESOS-6723 > Project: Mesos > Issue Type: Bug > Components: build > Environment: Arch Linux, amd64, GNU gold (GNU Binutils 2.27) 1.12 >Reporter: Neil Conway >Priority: Minor > Labels: mesosphere > > Configure flags: > {noformat} > ../mesos/configure --disable-java --disable-python CC="ccache gcc" > CXX="ccache g++" CXXFLAGS=-fuse-ld=gold CFLAGS=-fuse-ld=gold > {noformat} > Compile output: > {noformat} > /bin/sh ../libtool --tag=CXX --mode=link ccache g++ -pthread -fuse-ld=gold > -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o mesos-local > local/mesos_local-main.o libmesos.la -lz -lsvn_delta-1 -lsvn_subr-1 > -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind > libtool: link: ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs > -std=c++11 -Wl,--as-needed -o .libs/mesos-local local/mesos_local-main.o > ./.libs/libmesos.so -lpthread -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl > -lapr-1 -lz -lrt -lunwind -pthread -Wl,-rpath -Wl,/usr/local/lib > ./.libs/libmesos.so: error: undefined reference to 'dlerror' > ./.libs/libmesos.so: error: undefined reference to 'dlclose' > ./.libs/libmesos.so: error: undefined reference to 'dlopen' > ./.libs/libmesos.so: error: undefined reference to 'dlsym' > collect2: error: ld returned 1 exit status > make[2]: *** [Makefile:5139: mesos-local] Error 1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6723) Mesos fails to link using gold linker
[ https://issues.apache.org/jira/browse/MESOS-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723841#comment-15723841 ] Neil Conway commented on MESOS-6723: This can be fixed by adding {{-ldl}} to {{LDADD}} in {{src/Makefile.am}}. There might be a more minimal/elegant fix, though... > Mesos fails to link using gold linker > - > > Key: MESOS-6723 > URL: https://issues.apache.org/jira/browse/MESOS-6723 > Project: Mesos > Issue Type: Bug > Components: build > Environment: Arch Linux, amd64, GNU gold (GNU Binutils 2.27) 1.12 >Reporter: Neil Conway >Priority: Minor > Labels: mesosphere > > Configure flags: > {noformat} > ../mesos/configure --disable-java --disable-python CC=ccache gcc CXX=ccache > g++ CXXFLAGS=-fuse-ld=gold CFLAGS=-fuse-ld=gold > {noformat} > Compile output: > {noformat} > /bin/sh ../libtool --tag=CXX --mode=link ccache g++ -pthread -fuse-ld=gold > -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o mesos-local > local/mesos_local-main.o libmesos.la -lz -lsvn_delta-1 -lsvn_subr-1 > -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind > libtool: link: ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs > -std=c++11 -Wl,--as-needed -o .libs/mesos-local local/mesos_local-main.o > ./.libs/libmesos.so -lpthread -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl > -lapr-1 -lz -lrt -lunwind -pthread -Wl,-rpath -Wl,/usr/local/lib > ./.libs/libmesos.so: error: undefined reference to 'dlerror' > ./.libs/libmesos.so: error: undefined reference to 'dlclose' > ./.libs/libmesos.so: error: undefined reference to 'dlopen' > ./.libs/libmesos.so: error: undefined reference to 'dlsym' > collect2: error: ld returned 1 exit status > make[2]: *** [Makefile:5139: mesos-local] Error 1 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6723) Mesos fails to link using gold linker
Neil Conway created MESOS-6723: -- Summary: Mesos fails to link using gold linker Key: MESOS-6723 URL: https://issues.apache.org/jira/browse/MESOS-6723 Project: Mesos Issue Type: Bug Components: build Environment: Arch Linux, amd64, GNU gold (GNU Binutils 2.27) 1.12 Reporter: Neil Conway Priority: Minor Configure flags: {noformat} ../mesos/configure --disable-java --disable-python CC=ccache gcc CXX=ccache g++ CXXFLAGS=-fuse-ld=gold CFLAGS=-fuse-ld=gold {noformat} Compile output: {noformat} /bin/sh ../libtool --tag=CXX --mode=link ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o mesos-local local/mesos_local-main.o libmesos.la -lz -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind libtool: link: ccache g++ -pthread -fuse-ld=gold -Wno-unused-local-typedefs -std=c++11 -Wl,--as-needed -o .libs/mesos-local local/mesos_local-main.o ./.libs/libmesos.so -lpthread -lsvn_delta-1 -lsvn_subr-1 -lsasl2 -lcurl -lapr-1 -lz -lrt -lunwind -pthread -Wl,-rpath -Wl,/usr/local/lib ./.libs/libmesos.so: error: undefined reference to 'dlerror' ./.libs/libmesos.so: error: undefined reference to 'dlclose' ./.libs/libmesos.so: error: undefined reference to 'dlopen' ./.libs/libmesos.so: error: undefined reference to 'dlsym' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:5139: mesos-local] Error 1 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6427) Add documentation for rlimit support of Mesos containerizer
[ https://issues.apache.org/jira/browse/MESOS-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-6427: -- Story Points: 3 (was: 1) > Add documentation for rlimit support of Mesos containerizer > --- > > Key: MESOS-6427 > URL: https://issues.apache.org/jira/browse/MESOS-6427 > Project: Mesos > Issue Type: Improvement > Components: containerization, documentation >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Fix For: 1.2.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6677) Error in Windows agent's Flags::runtime_dir CLI
[ https://issues.apache.org/jira/browse/MESOS-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-6677: Priority: Blocker (was: Minor) > Error in Windows agent's Flags::runtime_dir CLI > --- > > Key: MESOS-6677 > URL: https://issues.apache.org/jira/browse/MESOS-6677 > Project: Mesos > Issue Type: Bug > Components: cli >Affects Versions: 1.2.0 > Environment: Windows 10 > Commit b7937a68367088f3c1f7c334307422c71737b1d7 >Reporter: Andrew Schwartzmeyer >Assignee: Andrew Schwartzmeyer >Priority: Blocker > Labels: microsoft, newbie, windows > Original Estimate: 24h > Remaining Estimate: 24h > > Error occurs at runtime due to runtime_dir initialization code in the CLI. > Code attempts to get the current user's name (which throws an error on > Windows): > {{F1202 14:43:16.214303 9816 flags.cpp:215] CHECK_SOME(user): The request is > not supported.}} > {{*** Check failure stack trace: ***}} > While https://reviews.apache.org/r/53706/ implements {{os::user}}, the > default is still incorrect for Windows as it switches on the Linux user > "root", and would cause the hard-coded Linux default to be used in the edge > case of a user named "root" on Windows. > The fix is to use a proper Windows location for persistent data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3098) Implement WindowsContainerizer and WindowsDockerContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-3098: Shepherd: Joseph Wu (was: Benjamin Hindman) > Implement WindowsContainerizer and WindowsDockerContainerizer > - > > Key: MESOS-3098 > URL: https://issues.apache.org/jira/browse/MESOS-3098 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Joseph Wu >Assignee: Daniel Pravat > Labels: mesosphere, microsoft > > The MVP for Windows support is a containerizer that (1) runs on Windows, and > (2) runs and passes all the tests that are relevant to the Windows platform > (_e.g._, not the tests that involve cgroups). To do this we require at least > a `WindowsContainerizer` (to be implemented alongside the > `MesosContainerizer`), which provides no meaningful (_e.g._) process > namespacing (much like the default Unix containerizer). In the long term > (hopefully before MesosCon) we also want to support the Windows container > API. This will require implementing a separate containerizer, maybe called > `WindowsDockerContainerizer`. > Since the Windows container API is actually officially supported through the > Docker interface (_i.e._, MSFT actually ported the Docker engine to Windows, > and that is the official API), the interfaces (like the fetcher) shouldn't > change much. The tests probably will have to change, as we don't have access > to any isolation primitives like cgroups for those tests. > Outstanding TODO([~hausdorff]): Flesh out this description when more details > are available, regarding: > * The container API for Windows (when we know it) > * The nuances of Windows vs Linux (when we know them) > * etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3097) OS-specific code touched by the containerizer tests is not Windows compatible
[ https://issues.apache.org/jira/browse/MESOS-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-3097: Shepherd: Joseph Wu (was: Benjamin Hindman) > OS-specific code touched by the containerizer tests is not Windows compatible > - > > Key: MESOS-3097 > URL: https://issues.apache.org/jira/browse/MESOS-3097 > Project: Mesos > Issue Type: Story > Components: libprocess, stout >Reporter: Joseph Wu >Assignee: Daniel Pravat >Priority: Minor > Labels: mesosphere, microsoft > > In the process of adding the Cmake build system, [~hausdorff] noted and > stubbed out all OS-specific code. > That sweep (mostly of libprocess and stout) is here: > https://github.com/hausdorff/mesos/commit/b862f66c6ff58c115a009513621e5128cb734d52 > Instead of having inline {{#if defined(...)}}, the OS-specific code will be > separated into directories. > The Windows code will be stubbed out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6717) Add Windows support to agent test harness
[ https://issues.apache.org/jira/browse/MESOS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-6717: Shepherd: Joseph Wu > Add Windows support to agent test harness > - > > Key: MESOS-6717 > URL: https://issues.apache.org/jira/browse/MESOS-6717 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Alex Clemmer >Assignee: Alex Clemmer > Labels: microsoft, windows-mvp > > Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of > the following that we can successfully run agent tests: > TEST_HELPER_SRC > MESOS_TESTS_UTILS_SRC -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6677) Error in Windows agent's Flags::runtime_dir CLI
[ https://issues.apache.org/jira/browse/MESOS-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Schwartzmeyer updated MESOS-6677: Labels: microsoft newbie windows (was: newbie windows) > Error in Windows agent's Flags::runtime_dir CLI > --- > > Key: MESOS-6677 > URL: https://issues.apache.org/jira/browse/MESOS-6677 > Project: Mesos > Issue Type: Bug > Components: cli >Affects Versions: 1.2.0 > Environment: Windows 10 > Commit b7937a68367088f3c1f7c334307422c71737b1d7 >Reporter: Andrew Schwartzmeyer >Assignee: Andrew Schwartzmeyer >Priority: Minor > Labels: microsoft, newbie, windows > Original Estimate: 24h > Remaining Estimate: 24h > > Error occurs at runtime due to runtime_dir initialization code in the CLI. > Code attempts to get the current user's name (which throws an error on > Windows): > {{F1202 14:43:16.214303 9816 flags.cpp:215] CHECK_SOME(user): The request is > not supported.}} > {{*** Check failure stack trace: ***}} > While https://reviews.apache.org/r/53706/ implements {{os::user}}, the > default is still incorrect for Windows as it switches on the Linux user > "root", and would cause the hard-coded Linux default to be used in the edge > case of a user named "root" on Windows. > The fix is to use a proper Windows location for persistent data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6677) Error in Windows agent's Flags::runtime_dir CLI
[ https://issues.apache.org/jira/browse/MESOS-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Schwartzmeyer updated MESOS-6677: Labels: newbie windows (was: newbie) > Error in Windows agent's Flags::runtime_dir CLI > --- > > Key: MESOS-6677 > URL: https://issues.apache.org/jira/browse/MESOS-6677 > Project: Mesos > Issue Type: Bug > Components: cli >Affects Versions: 1.2.0 > Environment: Windows 10 > Commit b7937a68367088f3c1f7c334307422c71737b1d7 >Reporter: Andrew Schwartzmeyer >Assignee: Andrew Schwartzmeyer >Priority: Minor > Labels: newbie, windows > Original Estimate: 24h > Remaining Estimate: 24h > > Error occurs at runtime due to runtime_dir initialization code in the CLI. > Code attempts to get the current user's name (which throws an error on > Windows): > {{F1202 14:43:16.214303 9816 flags.cpp:215] CHECK_SOME(user): The request is > not supported.}} > {{*** Check failure stack trace: ***}} > While https://reviews.apache.org/r/53706/ implements {{os::user}}, the > default is still incorrect for Windows as it switches on the Linux user > "root", and would cause the hard-coded Linux default to be used in the edge > case of a user named "root" on Windows. > The fix is to use a proper Windows location for persistent data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6722) Agent tries to use POSIX paths for the variable data runtime directory.
Alex Clemmer created MESOS-6722: --- Summary: Agent tries to use POSIX paths for the variable data runtime directory. Key: MESOS-6722 URL: https://issues.apache.org/jira/browse/MESOS-6722 Project: Mesos Issue Type: Bug Components: agent Reporter: Alex Clemmer Assignee: Andrew Schwartzmeyer When the agent boots up, it checks `os::user` in an attempt to set the `runtime_dir`. Essentially, if the user is `root`, it will put this directory somewhere in `/var`, and if not, somewhere in `/tmp`. This is mainly to avoid permissions issues. Since `os::user` is not implemented on Windows, this crashes the agent and prevents it from connecting to the master. We should change this to set the directory intelligently on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
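A minimal sketch of the platform-aware default described in MESOS-6722, for illustration only. The helper name `defaultRuntimeDir`, the `Platform` enum, and the concrete paths (`/var/run/mesos`, `<temp>/mesos/runtime`) are assumptions, not the agent's actual code; the ticket does not say which Windows location should be used, so the caller-supplied temp directory stands in for it here.

```cpp
#include <string>

// Hypothetical platform tag; the real agent would decide this at compile time.
enum class Platform { Posix, Windows };

// Hypothetical helper, not the Mesos implementation: picks a default
// runtime directory from the platform, the user name, and a temp directory.
std::string defaultRuntimeDir(
    Platform platform,
    const std::string& user,
    const std::string& tempDir)
{
  if (platform == Platform::Windows) {
    // Never switch on the user name on Windows: a user who happens to be
    // called "root" must not end up with a POSIX path.
    return tempDir + "\\mesos\\runtime";
  }

  // POSIX: root gets a system location, everyone else a temp location,
  // which avoids permission problems for unprivileged users.
  if (user == "root") {
    return std::string("/var/run/mesos");
  }

  return tempDir + "/mesos/runtime";
}
```

The point of the sketch is that the Windows branch is taken before the user name is inspected, so the lookup that fails on Windows is never needed there.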
[jira] [Commented] (MESOS-6689) Remove of unix domain socket path in IOSwitchboard::cleanup
[ https://issues.apache.org/jira/browse/MESOS-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723451#comment-15723451 ] Jie Yu commented on MESOS-6689: --- commit 3ad06661847a8c38ee9bfe9c7f593873f4769cbe Author: Kevin Klues Date: Mon Dec 5 13:41:25 2016 -0800 Added removal of unix domain socket path in IOSwitchboard::cleanup. Review: https://reviews.apache.org/r/54352/ > Remove of unix domain socket path in IOSwitchboard::cleanup > --- > > Key: MESOS-6689 > URL: https://issues.apache.org/jira/browse/MESOS-6689 > Project: Mesos > Issue Type: Bug >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: debugging, mesosphere > Fix For: 1.2.0 > > > We currently leak all of the unix domain socket files created by the > switchboard in the `/tmp` directory. We need to clean them up properly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3097) OS-specific code touched by the containerizer tests is not Windows compatible
[ https://issues.apache.org/jira/browse/MESOS-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-3097: Assignee: Daniel Pravat (was: Alex Clemmer) > OS-specific code touched by the containerizer tests is not Windows compatible > - > > Key: MESOS-3097 > URL: https://issues.apache.org/jira/browse/MESOS-3097 > Project: Mesos > Issue Type: Story > Components: libprocess, stout >Reporter: Joseph Wu >Assignee: Daniel Pravat >Priority: Minor > Labels: mesosphere, microsoft > > In the process of adding the Cmake build system, [~hausdorff] noted and > stubbed out all OS-specific code. > That sweep (mostly of libprocess and stout) is here: > https://github.com/hausdorff/mesos/commit/b862f66c6ff58c115a009513621e5128cb734d52 > Instead of having inline {{#if defined(...)}}, the OS-specific code will be > separated into directories. > The Windows code will be stubbed out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3098) Implement WindowsContainerizer and WindowsDockerContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-3098: Assignee: Daniel Pravat (was: Alex Clemmer) > Implement WindowsContainerizer and WindowsDockerContainerizer > - > > Key: MESOS-3098 > URL: https://issues.apache.org/jira/browse/MESOS-3098 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Joseph Wu >Assignee: Daniel Pravat > Labels: mesosphere, microsoft > > The MVP for Windows support is a containerizer that (1) runs on Windows, and > (2) runs and passes all the tests that are relevant to the Windows platform > (_e.g._, not the tests that involve cgroups). To do this we require at least > a `WindowsContainerizer` (to be implemented alongside the > `MesosContainerizer`), which provides no meaningful (_e.g._) process > namespacing (much like the default unix containerizer). In the long term > (hopefully before MesosCon) we want to support also the Windows container > API. This will require implementing a separate containerizer, maybe called > `WindowsDockerContainerizer`. > Since the Windows container API is actually officially supported through the > Docker interface (_i.e._, MSFT actually ported the Docker engine to Windows, > and that is the official API), the interfaces (like the fetcher) shouldn't > change much. The tests probably will have to change, as we don't have access > to any isolation primitives like cgroups for those tests. > Outstanding TODO([~hausdorff]): Flesh out this description when more details > are available, regarding: > * The container API for Windows (when we know them) > * The nuances of Windows vs Linux (when we know them) > * etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6674) Add Python sources to the CMake build
[ https://issues.apache.org/jira/browse/MESOS-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-6674: Assignee: Srinivas (was: Alex Clemmer) > Add Python sources to the CMake build > - > > Key: MESOS-6674 > URL: https://issues.apache.org/jira/browse/MESOS-6674 > Project: Mesos > Issue Type: Task > Components: build, cmake >Reporter: Joseph Wu >Assignee: Srinivas > Labels: microsoft > > The Python portion of the build includes a scheduler and executor driver as > well as Mesos protobufs. Eventually, there will also be a CLI component. > See the automake sources for more details, i.e., > https://github.com/apache/mesos/blob/2a73d956af1cb0615d4e66de126ab554fdabb0b5/src/Makefile.am#L1726-L1752 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5902) CMake should generate protobuf definitions for Java
[ https://issues.apache.org/jira/browse/MESOS-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-5902: Assignee: Srinivas (was: Alex Clemmer) > CMake should generate protobuf definitions for Java > --- > > Key: MESOS-5902 > URL: https://issues.apache.org/jira/browse/MESOS-5902 > Project: Mesos > Issue Type: Improvement > Components: build > Environment: CMake >Reporter: Srinivas >Assignee: Srinivas > Labels: microsoft > > Currently, the Java protobuf bindings require the Java protobuf library to generate > and compile the sources. We should build protobuf-java-2.6.1.jar from the > protobuf sources, just as we build the Mesos protobuf for C++. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6721) Cause source files to be correctly grouped into folders
Alex Clemmer created MESOS-6721: --- Summary: Cause source files to be correctly grouped into folders Key: MESOS-6721 URL: https://issues.apache.org/jira/browse/MESOS-6721 Project: Mesos Issue Type: Bug Components: cmake Reporter: Alex Clemmer Assignee: Alex Clemmer CMake has good facilities for organizing source files in a project into folders, but we don't really make use of them. This is especially bad for IDEs like Xcode and Visual Studio, where the source files will just end up in a single folder with literally everything that's included. For every executable and library we make, we should do something like this (and it might be wrong, because my memory is hazy here): ``` set_property(TARGET ${AGENT_TARGET} PROPERTY FOLDER "src/slave") ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6597) Include missing Mesos Java classes for Protobuf files to support Operator HTTP V1 API
[ https://issues.apache.org/jira/browse/MESOS-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723128#comment-15723128 ] Anand Mazumdar commented on MESOS-6597: --- {noformat} commit 5abda76d697dcc21e64f9037b03c3a15fc434286 Author: Vijay Srinivasaraghavan Date: Mon Dec 5 08:58:05 2016 -0800 Enabled python proto generation for v1 Master/Agent API. The corresponding master/agent protos are now included in the generated Mesos pypi package. Review: https://reviews.apache.org/r/54015/ commit e1ae5cf8030821e1527466e84a0dfe1864406926 Author: Vijay Srinivasaraghavan Date: Mon Dec 5 08:58:00 2016 -0800 Enabled java protos generation for v1 Master/Agent API. The corresponding master/agent protos are now included in the generated Mesos JAR. Review: https://reviews.apache.org/r/53825/ commit 2786ef6e1b7c91ca68ef4c584d8b4316fe2d6a58 Author: Vijay Srinivasaraghavan Date: Mon Dec 5 08:57:55 2016 -0800 Fixed missing protobuf java package/classname definition. Review: https://reviews.apache.org/r/54014/ {noformat} > Include missing Mesos Java classes for Protobuf files to support Operator > HTTP V1 API > - > > Key: MESOS-6597 > URL: https://issues.apache.org/jira/browse/MESOS-6597 > Project: Mesos > Issue Type: Bug > Components: build >Reporter: Vijay Srinivasaraghavan >Assignee: Vijay Srinivasaraghavan >Priority: Blocker > > For V1 API support, the build file that generates the Java protobuf wrappers > currently includes only executor and scheduler. > (https://github.com/apache/mesos/blob/master/src/Makefile.am#L334) > To support the operator HTTP API, we also need to generate Java protos for > additional proto definitions such as quota, maintenance, etc. These Java > definition files will be used by a standard REST client when using the > straight HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6597) Include missing Mesos Java classes for Protobuf files to support Operator HTTP V1 API
[ https://issues.apache.org/jira/browse/MESOS-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723131#comment-15723131 ] Anand Mazumdar commented on MESOS-6597: --- Keeping the issue open while I do the back-port to 1.1.x. > Include missing Mesos Java classes for Protobuf files to support Operator > HTTP V1 API > - > > Key: MESOS-6597 > URL: https://issues.apache.org/jira/browse/MESOS-6597 > Project: Mesos > Issue Type: Bug > Components: build >Reporter: Vijay Srinivasaraghavan >Assignee: Vijay Srinivasaraghavan >Priority: Blocker > > For V1 API support, the build file that generates the Java protobuf wrappers > currently includes only executor and scheduler. > (https://github.com/apache/mesos/blob/master/src/Makefile.am#L334) > To support the operator HTTP API, we also need to generate Java protos for > additional proto definitions such as quota, maintenance, etc. These Java > definition files will be used by a standard REST client when using the > straight HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6720) Check that `PreferredToolArchitecture` is set to `x64` on Windows before building
Alex Clemmer created MESOS-6720: --- Summary: Check that `PreferredToolArchitecture` is set to `x64` on Windows before building Key: MESOS-6720 URL: https://issues.apache.org/jira/browse/MESOS-6720 Project: Mesos Issue Type: Bug Components: cmake Reporter: Alex Clemmer Assignee: Alex Clemmer If this variable is not set before we build, the linker will occasionally hang forever because of a bug in the MSVC toolchain. We should make this easy on developers and check for them. If the variable is not set, we should display an error message explaining why it is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4445) Labels equality behavior is wrong
[ https://issues.apache.org/jira/browse/MESOS-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4445: --- Assignee: (was: Neil Conway) > Labels equality behavior is wrong > - > > Key: MESOS-4445 > URL: https://issues.apache.org/jira/browse/MESOS-4445 > Project: Mesos > Issue Type: Bug > Components: general >Reporter: Neil Conway >Priority: Minor > Labels: labels, mesosphere > > {noformat} > TEST(RevocableResourceTest, LabelSemantics) > { > Labels labels1; > Labels labels2; > labels1.add_labels()->CopyFrom(createLabel("foo", "bar")); > labels1.add_labels()->CopyFrom(createLabel("foo", "bar")); > labels2.add_labels()->CopyFrom(createLabel("foo", "bar")); > labels2.add_labels()->CopyFrom(createLabel("baz", "qux")); > bool eq = (labels1 == labels2); > LOG(INFO) << "Equal? " << (eq ? "true" : "false"); > } > {noformat} > Output: > {noformat} > [ RUN ] RevocableResourceTest.LabelSemantics > I0120 13:15:25.207223 2078158848 resources_tests.cpp:1990] Equal? true > [ OK ] RevocableResourceTest.LabelSemantics (0 ms) > {noformat} > This behavior seems pretty problematic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
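For illustration, here is one way the comparison in MESOS-4445 could behave so that the two label sets in the test are reported as different. This is a sketch under stated assumptions, not the Mesos implementation: it models a label as a plain key/value pair instead of the protobuf `Label` message, the name `labelsEqual` is hypothetical, and treating labels as an unordered multiset (duplicates count, order does not) is an assumption about the intended semantics.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for the protobuf types; illustrative only.
using Label = std::pair<std::string, std::string>;
using LabelList = std::vector<Label>;

// Compares two label lists as multisets: the same labels must appear
// the same number of times on both sides, in any order.
bool labelsEqual(LabelList left, LabelList right)
{
  if (left.size() != right.size()) {
    return false;
  }

  // Sorting the by-value copies makes the comparison order-insensitive
  // while still counting duplicates.
  std::sort(left.begin(), left.end());
  std::sort(right.begin(), right.end());

  return left == right;
}
```

With this definition, `{foo:bar, foo:bar}` and `{foo:bar, baz:qux}` compare unequal, while two lists holding the same labels in a different order compare equal.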
[jira] [Updated] (MESOS-4732) Migrate rest of the endpoints to use `jsonify`
[ https://issues.apache.org/jira/browse/MESOS-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4732: --- Assignee: (was: Neil Conway) > Migrate rest of the endpoints to use `jsonify` > -- > > Key: MESOS-4732 > URL: https://issues.apache.org/jira/browse/MESOS-4732 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Michael Park > > As an MVP, we shipped `/state` and `/state-summary` using `jsonify`. We need to > follow through and migrate the rest of the endpoints to > `jsonify` as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6719) Unify "active" and "state"/"connected" fields in Master::Framework
Neil Conway created MESOS-6719: -- Summary: Unify "active" and "state"/"connected" fields in Master::Framework Key: MESOS-6719 URL: https://issues.apache.org/jira/browse/MESOS-6719 Project: Mesos Issue Type: Improvement Components: master Reporter: Neil Conway Assignee: Neil Conway Priority: Minor Rather than tracking whether a framework is "active" separately from whether it is "connected", we should consider using a single "state" variable to track the current state of the framework (connected-and-active, connected-and-inactive, disconnected, etc.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
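The single "state" variable proposed in MESOS-6719 could look roughly like the sketch below. The enum and helper names are hypothetical and are not taken from `Master::Framework`; only the three states named in the ticket are modeled, and the "etc." is left open.

```cpp
// Hypothetical replacement for two separate booleans ("active" and
// "connected"); names are illustrative only.
enum class FrameworkState
{
  ConnectedActive,    // connected-and-active
  ConnectedInactive,  // connected-and-inactive
  Disconnected        // disconnected
};

// The old boolean views can be derived from the single state.
bool isConnected(FrameworkState state)
{
  return state != FrameworkState::Disconnected;
}

bool isActive(FrameworkState state)
{
  return state == FrameworkState::ConnectedActive;
}
```

One consequence of this shape is that the combination "active but not connected" cannot be represented at all, whereas two independent booleans allow it.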
[jira] [Updated] (MESOS-6496) Support construction of Shared and Owned from managed Derived*
[ https://issues.apache.org/jira/browse/MESOS-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-6496: --- Assignee: (was: Neil Conway) > Support construction of Shared and Owned from managed Derived* > -- > > Key: MESOS-6496 > URL: https://issues.apache.org/jira/browse/MESOS-6496 > Project: Mesos > Issue Type: Bug > Components: libprocess >Reporter: Neil Conway > Labels: mesosphere, tech-debt > > It should be possible to pass a {{Shared<Derived>}} value to an object that > takes a parameter of type {{Shared<Base>}}. Similarly for {{Owned}}. In > general, {{Shared<T2>}} should be implicitly convertible to {{Shared<T1>}} > iff {{T2*}} is implicitly convertible to {{T1*}}. In C++11, this works > for the standard smart pointers because they define the appropriate conversion constructor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
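The conversion rule in MESOS-6496 can be sketched with a converting constructor that is enabled only when the pointer conversion is valid. `SharedSketch` below is a hypothetical, simplified stand-in built on `std::shared_ptr`; it is not the libprocess `Shared` class, and `Base`, `Derived`, and `idOf` exist only to exercise it.

```cpp
#include <memory>
#include <type_traits>

// Simplified stand-in for a shared, read-only smart pointer.
template <typename T>
class SharedSketch
{
public:
  explicit SharedSketch(T* t) : data(t) {}

  // Converting constructor: takes part in overload resolution only when
  // U* is implicitly convertible to T* (for example, Derived* to Base*).
  template <
      typename U,
      typename = typename std::enable_if<
          std::is_convertible<U*, T*>::value>::type>
  SharedSketch(const SharedSketch<U>& that) : data(that.data) {}

  const T* get() const { return data.get(); }

private:
  // Other instantiations need access to `data` for the conversion.
  template <typename U> friend class SharedSketch;

  std::shared_ptr<const T> data;
};

// Illustrative hierarchy.
struct Base
{
  virtual ~Base() {}
  virtual int id() const { return 1; }
};

struct Derived : Base
{
  int id() const override { return 2; }
};

// Takes the base type by value; callers may pass SharedSketch<Derived>.
int idOf(SharedSketch<Base> shared)
{
  return shared.get()->id();
}
```

Calling `idOf(SharedSketch<Derived>(new Derived()))` compiles because of the converting constructor, while the reverse conversion from `SharedSketch<Base>` to `SharedSketch<Derived>` is rejected at compile time.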
[jira] [Commented] (MESOS-6668) can't fetch uris
[ https://issues.apache.org/jira/browse/MESOS-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722389#comment-15722389 ] haosdent commented on MESOS-6668: - I could reproduce this, and it looks like a problem with the nginx download site. If I switch to these two URLs, the problem goes away. {code} https://openresty.org/download/openresty-1.9.7.5.tar.gz https://openresty.org/download/openresty-1.11.2.1.tar.gz {code} > can't fetch uris > > > Key: MESOS-6668 > URL: https://issues.apache.org/jira/browse/MESOS-6668 > Project: Mesos > Issue Type: Bug > Components: fetcher >Affects Versions: 1.1.0 >Reporter: pwzgorilla >Assignee: haosdent >Priority: Minor > > When fetching two URIs, "https://nginx.org/download/nginx-1.6.3.tar.gz" and > "https://nginx.org/download/nginx-1.8.1.tar.gz", the fetch sometimes > succeeds, but most of the time it fails! > {code} > I1202 11:38:37.758714 1959038976 fetcher.cpp:498] Fetcher Info: > {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/dfad11fe-c83a-40aa-abc5-6390a7615545-S3","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"https:\/\/nginx.org\/download\/nginx-1.6.3.tar.gz"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"https:\/\/nginx.org\/download\/nginx-1.8.1.tar.gz"}}],"sandbox_directory":"\/mesos\/slaves\/dfad11fe-c83a-40aa-abc5-6390a7615545-S3\/frameworks\/dfad11fe-c83a-40aa-abc5-6390a7615545-0039\/executors\/1480649917664199938-0.0001.defaultGroup.Unnamed\/runs\/037e56bc-8416-49c9-bc72-bb43c87a97d7"} > I1202 11:38:37.764912 1959038976 fetcher.cpp:409] Fetching URI > 'https://nginx.org/download/nginx-1.6.3.tar.gz' > I1202 11:38:37.764940 1959038976 fetcher.cpp:250] Fetching directly into the > sandbox directory > I1202 11:38:37.764974 1959038976 fetcher.cpp:187] Fetching URI > 'https://nginx.org/download/nginx-1.6.3.tar.gz' > I1202 11:38:37.764999 1959038976 fetcher.cpp:134] Downloading resource from > 'https://nginx.org/download/nginx-1.6.3.tar.gz' to > 
'/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7/nginx-1.6.3.tar.gz' > I1202 11:39:07.293943 1959038976 fetcher.cpp:84] Extracting with command: tar > -C > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7' > -xf > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7/nginx-1.6.3.tar.gz' > I1202 11:39:07.385437 1959038976 fetcher.cpp:92] Extracted > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7/nginx-1.6.3.tar.gz' > into > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7' > I1202 11:39:07.385507 1959038976 fetcher.cpp:547] Fetched > 'https://nginx.org/download/nginx-1.6.3.tar.gz' to > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7/nginx-1.6.3.tar.gz' > I1202 11:39:07.385517 1959038976 fetcher.cpp:409] Fetching URI > 'https://nginx.org/download/nginx-1.8.1.tar.gz' > I1202 11:39:07.385524 1959038976 fetcher.cpp:250] Fetching directly into the > sandbox directory > I1202 11:39:07.385546 1959038976 fetcher.cpp:187] Fetching URI > 'https://nginx.org/download/nginx-1.8.1.tar.gz' > I1202 11:39:07.385560 1959038976 fetcher.cpp:134] 
Downloading resource from > 'https://nginx.org/download/nginx-1.8.1.tar.gz' to > '/mesos/slaves/dfad11fe-c83a-40aa-abc5-6390a7615545-S3/frameworks/dfad11fe-c83a-40aa-abc5-6390a7615545-0039/executors/1480649917664199938-0.0001.defaultGroup.Unnamed/runs/037e56bc-8416-49c9-bc72-bb43c87a97d7/nginx-1.8.1.tar.gz' > End fetcher log for container 037e56bc-8416-49c9-bc72-bb43c87a97d7 > E1202 11:39:37.843308 3211264 fetcher.cpp:568] Failed to run mesos-fetcher: > Failed to fetch all URIs for container '037e56bc-8416-49c9-bc72-bb43c87a97d7' > with exit status: 9 > E1202 11:39:37.843600 1601536 slave.cpp:4423] Container > '037e56bc-8416-49c9-bc72-bb43c87a97d7' for executor > '1480649917664199938-0.0001.defaultGroup.Unnamed' of framework > dfad11fe-c83a-40aa-abc5-6390a7615545-0039 failed to start: Failed to fetch > all URIs for container
[jira] [Comment Edited] (MESOS-1509) Use Content-Disposition filename (if available) when downloading HTTP URIs
[ https://issues.apache.org/jira/browse/MESOS-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721914#comment-15721914 ] Yennick Trevels edited comment on MESOS-1509 at 12/5/16 10:43 AM: -- I can confirm that Mesos currently disregards the "Content-Disposition" header. *My scenario* I tried submitting a spark application to DC/OS where the Jar was hosted on S3. This required a S3 presigned (Signature V4) url, of which the url parameters can get pretty long. sample presigned S3 url: {code} https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X {code} When trying to run the Spark app on DC/OS it immediately failed with the following error: {code} Failed to fetch 'https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X': Error downloading resource: File name too long {code} Then I tried uploading the file to S3 with a Content-Disposition header: {code} attachment; filename='spark-app-alt.jar'; filename*=UTF-8''spark%2Dapp%2Dalt%2Ejar {code} The content disposition header is properly displayed in the AWS Console (so nothing wrong there) and when downloading the file via a browser with a presigned S3 url it gets properly saved with the filename specified in the content-disposition header. However, when the Mesos Fetcher tries to download the file, it still fails with the "file name too long" error. was (Author: slevinbe): I can confirm that Mesos currently disregards the "Content-Disposition" header. *My scenario* I tried submitting a spark application to DC/OS where the Jar was hosted on S3. This required a S3 presigned (Signature V4) url, of which the url parameters can get pretty long. 
sample presigned S3 url: ``` https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X ``` When trying to run the Spark app on DC/OS it immediately failed with the following error: ``` Failed to fetch 'https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X': Error downloading resource: File name too long ``` Then I tried uploading the file to S3 with a Content-Disposition header: ``` attachment; filename='spark-app-alt.jar'; filename*=UTF-8''spark%2Dapp%2Dalt%2Ejar ``` The content disposition header is properly displayed in the AWS Console (so nothing wrong there) and when downloading the file via a browser with a presigned S3 url it gets properly saved with the filename specified in the content-disposition header. However, when the Mesos Fetcher tries to download the file, it still fails with the "file name too long" error. > Use Content-Disposition filename (if available) when downloading HTTP URIs > -- > > Key: MESOS-1509 > URL: https://issues.apache.org/jira/browse/MESOS-1509 > Project: Mesos > Issue Type: Improvement > Components: agent >Affects Versions: 0.18.0, 0.19.0, 0.20.0, 0.21.0 > Environment: Linux (but should be irrelevant) >Reporter: Bjoern Metzdorf >Priority: Minor > > Currently the slave stores downloaded HTTP URIs in filenames that are made up > from the part after the last "/" in the URI (in src/launcher/fetcher.cpp:122): > {code} > path = path::join(directory, path.substr(path.find_last_of("/") + 1)); > {code} > The problem is that the query string is included in the filename and a URI > like {{http://my.web.server/dynamic/resource.tar.gz?a=b}} results in a > downloaded file named {{resource.tar.gz?a=b}}. 
> The curl maintainers faced the same problem and added this: > {code} > -J, --remote-header-name > (HTTP) This option tells the -O, --remote-name option to use > the server-specified Content-Disposition filename instead of extracting a > filename from the URL. > {code} > Maybe Mesos could do the same if a Content-Disposition header exists. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
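The file-name problem described in MESOS-1509 can be illustrated with a small helper that drops the query string before taking the last path segment. This is a different and simpler technique than the Content-Disposition handling the ticket asks for; it only addresses the "query string ends up in the file name" symptom. The function name `basenameFromUri` is hypothetical and is not part of the fetcher.

```cpp
#include <string>

// Hypothetical helper: derives a local file name from a URI by taking the
// last path segment and dropping any query string or fragment.
std::string basenameFromUri(const std::string& uri)
{
  // Keep everything before the first '?' or '#'. If neither is present,
  // find_first_of returns npos and substr keeps the whole string.
  const std::string path = uri.substr(0, uri.find_first_of("?#"));

  // Take the part after the last '/'. If there is no '/', npos + 1 wraps
  // to 0 and the whole string is returned.
  return path.substr(path.find_last_of('/') + 1);
}
```

For the URI from the ticket, `http://my.web.server/dynamic/resource.tar.gz?a=b`, this yields `resource.tar.gz` rather than `resource.tar.gz?a=b`. Cutting the query first also keeps a '/' inside a query parameter from being mistaken for a path separator.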
[jira] [Commented] (MESOS-1509) Use Content-Disposition filename (if available) when downloading HTTP URIs
[ https://issues.apache.org/jira/browse/MESOS-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721914#comment-15721914 ] Yennick Trevels commented on MESOS-1509: I can confirm that Mesos currently disregards the "Content-Disposition" header. *My scenario* I tried submitting a spark application to DC/OS where the Jar was hosted on S3. This required a S3 presigned (Signature V4) url, of which the url parameters can get pretty long. sample presigned S3 url: ``` https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X ``` When trying to run the Spark app on DC/OS it immediately failed with the following error: ``` Failed to fetch 'https://s3.eu-central-1.amazonaws.com/bucketA/spark-app.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256=3600=XX=host=20161205T095356Z=X': Error downloading resource: File name too long ``` Then I tried uploading the file to S3 with a Content-Disposition header: ``` attachment; filename='spark-app-alt.jar'; filename*=UTF-8''spark%2Dapp%2Dalt%2Ejar ``` The content disposition header is properly displayed in the AWS Console (so nothing wrong there) and when downloading the file via a browser with a presigned S3 url it gets properly saved with the filename specified in the content-disposition header. However, when the Mesos Fetcher tries to download the file, it still fails with the "file name too long" error. 
> Use Content-Disposition filename (if available) when downloading HTTP URIs > -- > > Key: MESOS-1509 > URL: https://issues.apache.org/jira/browse/MESOS-1509 > Project: Mesos > Issue Type: Improvement > Components: agent >Affects Versions: 0.18.0, 0.19.0, 0.20.0, 0.21.0 > Environment: Linux (but should be irrelevant) >Reporter: Bjoern Metzdorf >Priority: Minor > > Currently the slave stores downloaded HTTP URIs in filenames that are made up > from the part after the last "/" in the URI (in src/launcher/fetcher.cpp:122): > {code} > path = path::join(directory, path.substr(path.find_last_of("/") + 1)); > {code} > The problem is that the query string is included in the filename and a URI > like {{http://my.web.server/dynamic/resource.tar.gz?a=b}} results in a > downloaded file named {{resource.tar.gz?a=b}}. > The curl maintainers faced the same problem and added this: > {code} > -J, --remote-header-name > (HTTP) This option tells the -O, --remote-name option to use > the server-specified Content-Disposition filename instead of extracting a > filename from the URL. > {code} > Maybe Mesos could do the same if a Content-Disposition header exists. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1582) Improve build time.
[ https://issues.apache.org/jira/browse/MESOS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Clemmer updated MESOS-1582: Labels: microsoft (was: ) > Improve build time. > --- > > Key: MESOS-1582 > URL: https://issues.apache.org/jira/browse/MESOS-1582 > Project: Mesos > Issue Type: Epic > Components: build >Reporter: Benjamin Hindman > Labels: microsoft > > The build takes a ridiculously long time unless you have a large, parallel > machine. This is a combination of many factors, all of which we'd like to > discuss and track here. > I'd also love to actually track build times so we can get an appreciation of > the improvements. Please leave a comment below with your build times! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5548) FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover is flaky
[ https://issues.apache.org/jira/browse/MESOS-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-5548: Labels: flaky-tests (was: flaky) > FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover is flaky > -- > > Key: MESOS-5548 > URL: https://issues.apache.org/jira/browse/MESOS-5548 > Project: Mesos > Issue Type: Bug > Components: flaky, test > Environment: > https://builds.apache.org/job/Mesos/2223/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos:7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/consoleFull >Reporter: Vinod Kone > Labels: flaky-tests > > Observed this on ASF CI, where this test runs forever > {code} > [ RUN ] FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover > I0606 17:09:02.953631 29338 cluster.cpp:155] Creating default 'local' > authorizer > I0606 17:09:02.957620 29338 leveldb.cpp:174] Opened db in 3.247876ms > I0606 17:09:02.958684 29338 leveldb.cpp:181] Compacted db in 1.023058ms > I0606 17:09:02.958762 29338 leveldb.cpp:196] Created db iterator in 17962ns > I0606 17:09:02.958794 29338 leveldb.cpp:202] Seeked to beginning of db in > 2453ns > I0606 17:09:02.958820 29338 leveldb.cpp:271] Iterated through 0 keys in the > db in 465ns > I0606 17:09:02.958880 29338 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0606 17:09:02.959601 29362 recover.cpp:448] Starting replica recovery > I0606 17:09:02.959996 29362 recover.cpp:474] Replica is in EMPTY status > I0606 17:09:02.961241 29357 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (1793)@172.17.0.2:39784 > I0606 17:09:02.961608 29369 recover.cpp:194] Received a recover response from > a replica in EMPTY status > I0606 17:09:02.962347 29357 recover.cpp:565] Updating replica status to > STARTING > I0606 17:09:02.963209 29371 leveldb.cpp:304] Persisting metadata (8 
bytes) to > leveldb took 706012ns > I0606 17:09:02.963240 29371 replica.cpp:320] Persisted replica status to > STARTING > I0606 17:09:02.963435 29370 recover.cpp:474] Replica is in STARTING status > I0606 17:09:02.963881 29360 master.cpp:382] Master > 7d25f96c-ab51-4074-b613-df1b15675b81 (1ead8e6f9ec5) started on > 172.17.0.2:39784 > I0606 17:09:02.963907 29360 master.cpp:384] Flags at startup: --acls="" > --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate_agents="true" --authenticate_frameworks="true" > --authenticate_http="true" --authenticate_http_frameworks="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/TzbJnN/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --http_framework_authenticators="basic" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_agent_ping_timeouts="5" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --quiet="false" > --recovery_agent_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" > --registry_strict="true" --root_submissions="true" --user_sorter="drf" > --version="false" --webui_dir="/mesos/mesos-1.0.0/_inst/share/mesos/webui" > --work_dir="/tmp/TzbJnN/master" --zk_session_timeout="10secs" > I0606 17:09:02.964334 29360 master.cpp:433] Master only allowing > authenticated frameworks to register > I0606 17:09:02.964355 29360 master.cpp:439] Master only allowing > authenticated agents to register > I0606 17:09:02.964366 29360 master.cpp:445] Master only allowing > authenticated HTTP frameworks to register > I0606 17:09:02.964376 29360 credentials.hpp:37] Loading credentials for > authentication from '/tmp/TzbJnN/credentials' > I0606 17:09:02.964445 29372 replica.cpp:673] Replica in STARTING status 
> received a broadcasted recover request from (1794)@172.17.0.2:39784 > I0606 17:09:02.964728 29360 master.cpp:489] Using default 'crammd5' > authenticator > I0606 17:09:02.964891 29360 master.cpp:560] Using default 'basic' HTTP > authenticator > I0606 17:09:02.964926 29361 recover.cpp:194] Received a recover response from > a replica in STARTING status > I0606 17:09:02.965065 29360 master.cpp:640] Using default 'basic' HTTP > framework authenticator > I0606 17:09:02.965330 29360 master.cpp:687] Authorization enabled > I0606 17:09:02.965418 29362 recover.cpp:565] Updating replica status to VOTING > I0606 17:09:02.965591 29359 hierarchical.cpp:142] Initialized hierarchical > allocator process > I0606 17:09:02.965672 29367
[jira] [Updated] (MESOS-5548) FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover is flaky
[ https://issues.apache.org/jira/browse/MESOS-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-5548: Labels: flaky (was: ) > FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover is flaky > -- > > Key: MESOS-5548 > URL: https://issues.apache.org/jira/browse/MESOS-5548 > Project: Mesos > Issue Type: Bug > Components: flaky, test > Environment: > https://builds.apache.org/job/Mesos/2223/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos:7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/consoleFull >Reporter: Vinod Kone > Labels: flaky > > Observed this on ASF CI, where this test runs forever > {code} > [ RUN ] FaultToleranceTest.UpdateFrameworkInfoOnSchedulerFailover > I0606 17:09:02.953631 29338 cluster.cpp:155] Creating default 'local' > authorizer > I0606 17:09:02.957620 29338 leveldb.cpp:174] Opened db in 3.247876ms > I0606 17:09:02.958684 29338 leveldb.cpp:181] Compacted db in 1.023058ms > I0606 17:09:02.958762 29338 leveldb.cpp:196] Created db iterator in 17962ns > I0606 17:09:02.958794 29338 leveldb.cpp:202] Seeked to beginning of db in > 2453ns > I0606 17:09:02.958820 29338 leveldb.cpp:271] Iterated through 0 keys in the > db in 465ns > I0606 17:09:02.958880 29338 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0606 17:09:02.959601 29362 recover.cpp:448] Starting replica recovery > I0606 17:09:02.959996 29362 recover.cpp:474] Replica is in EMPTY status > I0606 17:09:02.961241 29357 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (1793)@172.17.0.2:39784 > I0606 17:09:02.961608 29369 recover.cpp:194] Received a recover response from > a replica in EMPTY status > I0606 17:09:02.962347 29357 recover.cpp:565] Updating replica status to > STARTING > I0606 17:09:02.963209 29371 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb 
took 706012ns > I0606 17:09:02.963240 29371 replica.cpp:320] Persisted replica status to > STARTING > I0606 17:09:02.963435 29370 recover.cpp:474] Replica is in STARTING status > I0606 17:09:02.963881 29360 master.cpp:382] Master > 7d25f96c-ab51-4074-b613-df1b15675b81 (1ead8e6f9ec5) started on > 172.17.0.2:39784 > I0606 17:09:02.963907 29360 master.cpp:384] Flags at startup: --acls="" > --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate_agents="true" --authenticate_frameworks="true" > --authenticate_http="true" --authenticate_http_frameworks="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/TzbJnN/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --http_framework_authenticators="basic" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_agent_ping_timeouts="5" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --quiet="false" > --recovery_agent_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" > --registry_strict="true" --root_submissions="true" --user_sorter="drf" > --version="false" --webui_dir="/mesos/mesos-1.0.0/_inst/share/mesos/webui" > --work_dir="/tmp/TzbJnN/master" --zk_session_timeout="10secs" > I0606 17:09:02.964334 29360 master.cpp:433] Master only allowing > authenticated frameworks to register > I0606 17:09:02.964355 29360 master.cpp:439] Master only allowing > authenticated agents to register > I0606 17:09:02.964366 29360 master.cpp:445] Master only allowing > authenticated HTTP frameworks to register > I0606 17:09:02.964376 29360 credentials.hpp:37] Loading credentials for > authentication from '/tmp/TzbJnN/credentials' > I0606 17:09:02.964445 29372 replica.cpp:673] Replica in STARTING status > received a 
broadcasted recover request from (1794)@172.17.0.2:39784 > I0606 17:09:02.964728 29360 master.cpp:489] Using default 'crammd5' > authenticator > I0606 17:09:02.964891 29360 master.cpp:560] Using default 'basic' HTTP > authenticator > I0606 17:09:02.964926 29361 recover.cpp:194] Received a recover response from > a replica in STARTING status > I0606 17:09:02.965065 29360 master.cpp:640] Using default 'basic' HTTP > framework authenticator > I0606 17:09:02.965330 29360 master.cpp:687] Authorization enabled > I0606 17:09:02.965418 29362 recover.cpp:565] Updating replica status to VOTING > I0606 17:09:02.965591 29359 hierarchical.cpp:142] Initialized hierarchical > allocator process > I0606 17:09:02.965672 29367 whitelist_watcher.cpp:77] No
[jira] [Created] (MESOS-6718) Should destroy DEBUG containers on agent recovery.
Kevin Klues created MESOS-6718:
--
Summary: Should destroy DEBUG containers on agent recovery.
Key: MESOS-6718
URL: https://issues.apache.org/jira/browse/MESOS-6718
Project: Mesos
Issue Type: Bug
Reporter: Kevin Klues
Assignee: Kevin Klues

We need to add support to destroy DEBUG containers on agent recovery. Right now
these containers will stick around forever (or until they run to completion).
[jira] [Updated] (MESOS-3386) Port remaining Stout and libprocess tests to Windows
[ https://issues.apache.org/jira/browse/MESOS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-3386:
Description:
We will need to go through all the test files and investigate any test that's marked `TEST_TEMP_DISABLED_ON_WINDOWS`.
Additionally, here is a concise list of the Stout test files that don't compile as of 12/5/2016:
{quote}
Stout:
path_tests.cpp
protobuf_tests.cpp
protobuf_tests.pb.cc
svn_tests.cpp
os/sendfile_tests.cpp
os/signals_tests.cpp
libprocess:
io_tests.cpp
reap_tests.cpp
{quote}

was:
Here is a concise list of the Stout tests that don't work yet as of 12/5/2016:
{quote}
Stout:
path_tests.cpp
protobuf_tests.cpp
protobuf_tests.pb.cc
svn_tests.cpp
os/sendfile_tests.cpp
os/signals_tests.cpp
libprocess:
io_tests.cpp
reap_tests.cpp
{quote}

> Port remaining Stout and libprocess tests to Windows
>
>
> Key: MESOS-3386
> URL: https://issues.apache.org/jira/browse/MESOS-3386
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: build, mesosphere, microsoft, tests
>
> We will need to go through all the test files and investigate any test that's
> marked `TEST_TEMP_DISABLED_ON_WINDOWS`.
> Additionally, here is a concise list of the Stout test files that don't
> compile as of 12/5/2016:
> {quote}
> Stout:
> path_tests.cpp
> protobuf_tests.cpp
> protobuf_tests.pb.cc
> svn_tests.cpp
> os/sendfile_tests.cpp
> os/signals_tests.cpp
> libprocess:
> io_tests.cpp
> reap_tests.cpp
> {quote}
[jira] [Updated] (MESOS-6494) Clean up the flags parsing in the executors.
[ https://issues.apache.org/jira/browse/MESOS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov updated MESOS-6494:
---
Sprint: Mesosphere Sprint 46 (was: Mesosphere Sprint 46, Mesosphere Sprint 47)

> Clean up the flags parsing in the executors.
>
>
> Key: MESOS-6494
> URL: https://issues.apache.org/jira/browse/MESOS-6494
> Project: Mesos
> Issue Type: Improvement
> Reporter: Gastón Kleiman
> Assignee: Gastón Kleiman
> Labels: mesosphere
>
> The current executors and the executor libraries use a mix of `stout::flags`
> and `os::getenv` to parse flags, leading to a lot of unnecessary and
> sometimes duplicated code.
> This should be cleaned up, using only {{stout::flags}} to parse flags.
> Environment variables should be used for the flags that are common to ALL the
> executors (listed in the Executor HTTP API doc).
> Command line parameters should be used for flags that apply only to
> individual executors.
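A rough illustration of the split proposed in MESOS-6494, written against the standard library only rather than `stout::flags`. The helper names `envOr` and `parseCommandLine`, and the variable and flag names in the usage note, are made up for this sketch and are not taken from the Mesos code base.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical helper: read a setting shared by all executors from the
// environment, returning `fallback` when the variable is not set.
std::string envOr(const std::string& name, const std::string& fallback)
{
  const char* value = std::getenv(name.c_str());
  return value != nullptr ? std::string(value) : fallback;
}

// Hypothetical helper: collect executor-specific `--name=value` arguments
// into a map. Arguments that do not match that shape are ignored.
std::map<std::string, std::string> parseCommandLine(
    int argc, const char* const* argv)
{
  std::map<std::string, std::string> flags;

  // Start at 1 to skip the program name.
  for (int i = 1; i < argc; ++i) {
    const std::string arg = argv[i];
    if (arg.compare(0, 2, "--") != 0) {
      continue; // Not a flag.
    }

    const std::size_t eq = arg.find('=');
    if (eq == std::string::npos) {
      continue; // No value given; this sketch skips bare flags.
    }

    flags[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
  }

  return flags;
}
```

An executor could then read a shared setting with something like `envOr("MESOS_EXAMPLE_SETTING", "default")` and look up its own `--example_flag=value` in the map returned by `parseCommandLine(argc, argv)`.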
[jira] [Updated] (MESOS-3386) Port remaining Stout and libprocess tests to Windows
[ https://issues.apache.org/jira/browse/MESOS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-3386:
Description:
Here is a concise list of the Stout tests that don't work yet as of 12/5/2016:
{quote}
Stout:
path_tests.cpp
protobuf_tests.cpp
protobuf_tests.pb.cc
svn_tests.cpp
os/sendfile_tests.cpp
os/signals_tests.cpp
libprocess:
io_tests.cpp
reap_tests.cpp
{quote}

was:
Here is a concise list of the Stout tests that don't work yet, and their dependencies, and comments about how hard they are to port. Asterisks are next to tests that seem to block Windows MVP.
{quote}
*dynamiclibrary_tests.cpp -- depends on dynamic load libraries [probably easy, just map to windows dll load API]
*flags_tests.cpp -- depends on os.hpp [probably will "just work" if we port os.hpp]
*gzip_tests.cpp -- depends on gzip.hpp [need to make API-compatible impl of gzip.hpp, which is a medium amount of work]
*ip_tests.cpp -- depends on net.hpp and abort.hpp [will probably "just work" after we port net.hpp]
*mac_tests.cpp -- depends on abort.hpp and mac.hpp [may or may not be nontrivial, will probably work if we can get mac.hpp]
*os_tests.cpp -- depends on a bunch of stuff [probably hardest and most important]
*path_tests.cpp -- depends on os.hpp [will probably "just work" if we port os.hpp]
protobuf_tests.cpp -- depends on stout/protobuf.hpp (and it can't seem to find the protobuf include dir)
*sendfile_test.cpp -- depends on os.hpp and sendfile.hpp [simple port of sendfile is possible; os.hpp is harder]
signals_tests.cpp -- depends on os.hpp and signal.hpp [signals will probably be easy; os.hpp is the hard part]
*subcommand_tests.cpp -- depends on flags.hpp (which depends on os.hpp) [probably will "just work" if we get os.hpp]
svn_tests.cpp -- depends on libapr and libsvn [simple if we get windows to pull these deps]
{quote}

> Port remaining Stout and libprocess tests to Windows
>
>
> Key: MESOS-3386
> URL: https://issues.apache.org/jira/browse/MESOS-3386
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: build, mesosphere, microsoft, tests
>
> Here is a concise list of the Stout tests that don't work yet as of 12/5/2016:
> {quote}
> Stout:
> path_tests.cpp
> protobuf_tests.cpp
> protobuf_tests.pb.cc
> svn_tests.cpp
> os/sendfile_tests.cpp
> os/signals_tests.cpp
> libprocess:
> io_tests.cpp
> reap_tests.cpp
> {quote}
[jira] [Updated] (MESOS-3386) Port remaining Stout and libprocess tests to Windows
[ https://issues.apache.org/jira/browse/MESOS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-3386:
Summary: Port remaining Stout and libprocess tests to Windows (was: Port remaining Stout tests to Windows)

> Port remaining Stout and libprocess tests to Windows
>
>
> Key: MESOS-3386
> URL: https://issues.apache.org/jira/browse/MESOS-3386
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: build, mesosphere, microsoft, tests
>
> Here is a concise list of the Stout tests that don't work yet, and their
> dependencies, and comments about how hard they are to port. Asterisks are
> next to tests that seem to block Windows MVP.
> {quote}
> *dynamiclibrary_tests.cpp -- depends on dynamic load libraries [probably
> easy, just map to windows dll load API]
> *flags_tests.cpp -- depends on os.hpp [probably will "just work" if we port
> os.hpp]
> *gzip_tests.cpp -- depends on gzip.hpp [need to make API-compatible impl of
> gzip.hpp, which is a medium amount of work]
> *ip_tests.cpp -- depends on net.hpp and abort.hpp [will probably "just work"
> after we port net.hpp]
> *mac_tests.cpp -- depends on abort.hpp and mac.hpp [may or may not be
> nontrivial, will probably work if we can get mac.hpp]
> *os_tests.cpp -- depends on a bunch of stuff [probably hardest and most
> important]
> *path_tests.cpp -- depends on os.hpp [will probably "just work" if we port
> os.hpp]
> protobuf_tests.cpp -- depends on stout/protobuf.hpp (and it can't seem to
> find the protobuf include dir)
> *sendfile_test.cpp -- depends on os.hpp and sendfile.hpp [simple port of
> sendfile is possible; os.hpp is harder]
> signals_tests.cpp -- depends on os.hpp and signal.hpp [signals will probably
> be easy; os.hpp is the hard part]
> *subcommand_tests.cpp -- depends on flags.hpp (which depends on os.hpp)
> [probably will "just work" if we get os.hpp]
> svn_tests.cpp -- depends on libapr and libsvn [simple if we get windows to
> pull these deps]
> {quote}
[jira] [Updated] (MESOS-6717) Add Windows support to agent test harness
[ https://issues.apache.org/jira/browse/MESOS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-6717:
Description:
Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of the following that we can successfully run agent tests.
TEST_HELPER_SRC
MESOS_TESTS_UTILS_SRC

> Add Windows support to agent test harness
> -
>
> Key: MESOS-6717
> URL: https://issues.apache.org/jira/browse/MESOS-6717
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: microsoft, windows-mvp
>
> Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of
> the following that we can successfully run agent tests.
> TEST_HELPER_SRC
> MESOS_TESTS_UTILS_SRC
[jira] [Updated] (MESOS-6717) Add Windows support to agent test harness
[ https://issues.apache.org/jira/browse/MESOS-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-6717:
Description:
Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of the following that we can successfully run agent tests:
TEST_HELPER_SRC
MESOS_TESTS_UTILS_SRC

was:
Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of the following that we can successfully run agent tests.
TEST_HELPER_SRC
MESOS_TESTS_UTILS_SRC

> Add Windows support to agent test harness
> -
>
> Key: MESOS-6717
> URL: https://issues.apache.org/jira/browse/MESOS-6717
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: microsoft, windows-mvp
>
> Of particular interest in `src/tests/CMakeLists.txt` is supporting enough of
> the following that we can successfully run agent tests:
> TEST_HELPER_SRC
> MESOS_TESTS_UTILS_SRC
[jira] [Created] (MESOS-6717) Add Windows support to agent test harness
Alex Clemmer created MESOS-6717:
---
Summary: Add Windows support to agent test harness
Key: MESOS-6717
URL: https://issues.apache.org/jira/browse/MESOS-6717
Project: Mesos
Issue Type: Bug
Components: agent
Reporter: Alex Clemmer
Assignee: Alex Clemmer