date:20160630

[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-06-30 Thread Lee Calcote (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358258#comment-15358258
 ] 

Lee Calcote commented on MESOS-4823:


If port-mapping is implemented in the `network/cni` isolator, what would be the 
expected behavior when paired with a container runtime isolator (i.e. a rkt 
container runtime isolator) that already handles port-mapping?

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.
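
The rule installation described above can be sketched as follows. This is a minimal illustration, not the isolator's actual implementation: the `PortMapping` type, the `dnat_rules` helper, the choice of the `PREROUTING` chain, and the example addresses are all assumptions made for this sketch. It only builds `iptables` command strings from the two inputs named in the description (the exposed ports and the container IP) and does not execute anything.

```python
# Sketch only: builds iptables DNAT rule strings; it does not run them.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortMapping:
    """Hypothetical description of one exposed port."""
    host_port: int        # port opened on the agent
    container_port: int   # port the service listens on in the container
    protocol: str = "tcp"


def dnat_rules(container_ip: str, mappings: List[PortMapping]) -> List[str]:
    """Return one DNAT command per exposed port (hypothetical helper)."""
    rules = []
    for m in mappings:
        rules.append(
            f"iptables -t nat -A PREROUTING -p {m.protocol} "
            f"--dport {m.host_port} -j DNAT "
            f"--to-destination {container_ip}:{m.container_port}"
        )
    return rules


if __name__ == "__main__":
    # Example values are made up for illustration.
    for rule in dnat_rules("192.168.1.10", [PortMapping(8080, 80)]):
        print(rule)
```

A real implementation would also need to remove the rules when the container exits and handle traffic originating on the agent itself, which this sketch leaves out.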



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5401) Add ability to inject a Volume of Nvidia GPU-related libraries into a docker container.

2016-06-30 Thread Sunzhe (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358242#comment-15358242
 ] 

Sunzhe commented on MESOS-5401:
---

Oh, I see. How is this progressing, and would you like some help? I have 
some ideas. Since you have so many things to do, working together 
might speed things up.

> Add ability to inject a Volume of Nvidia GPU-related libraries into a docker 
> container.
> ---
>
> Key: MESOS-5401
> URL: https://issues.apache.org/jira/browse/MESOS-5401
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>
> In order to support Nvidia GPUs with docker containers in Mesos, we need to 
> be able to consolidate all Nvidia libraries into a common volume and inject 
> that volume into the container.
> More info on why this is necessary here: 
> https://github.com/NVIDIA/nvidia-docker/




[jira] [Created] (MESOS-5760) MAC OS Build failed

2016-06-30 Thread Guangya Liu (JIRA)

Guangya Liu created MESOS-5760:
--

 Summary: MAC OS Build failed
 Key: MESOS-5760
 URL: https://issues.apache.org/jira/browse/MESOS-5760
 Project: Mesos
  Issue Type: Bug
Reporter: Guangya Liu
Assignee: Guangya Liu


{code}
arwin -DZOOKEEPER_VERSION=\"3.4.8\" 
-I/usr/local/opt/subversion/include/subversion-1 
-I/usr/local/opt/openssl/include -I/usr/include/apr-1 -I/usr/include/apr-1.0  
-D_THREAD_SAFE -pthread -g -O0 -Wno-unused-local-typedef -std=c++11 
-stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -DGTEST_LANG_CXX11 -MT 
tests/mesos_tests-hdfs_tests.o -MD -MP -MF 
tests/.deps/mesos_tests-hdfs_tests.Tpo -c -o tests/mesos_tests-hdfs_tests.o 
`test -f 'tests/hdfs_tests.cpp' || echo '../../src/'`tests/hdfs_tests.cpp
In file included from ../../src/tests/gc_tests.cpp:42:
// distributed with this work for additional information
../../src/linux/fs.hpp:20:10: fatal error: 'mntent.h' file not found
#include <mntent.h>
         ^
mv -f tests/.deps/mesos_tests-executor_http_api_tests.Tpo 
tests/.deps/mesos_tests-executor_http_api_tests.Po
g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
-DPACKAGE_VERSION=\"1.0.0\" -DPACKAGE_STRING=\"mesos\ 1.0.0\" 
-DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
-DVERSION=\"1.0.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
-DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
-DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
-DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 
-DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" 
-DMESOS_HAS_PYTHON=1 -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 
-DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
-DHAVE_LIBZ=1 -I. -I../../src   -Wall -Werror -DLIBDIR=\"/usr/local/lib\" 
-DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
-DPKGDATADIR=\"/usr/local/share/mesos\" 
-DPKGMODULEDIR=\"/usr/local/lib/mesos/modules\" -I../../include -I../include 
-I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS -isystem 
../3rdparty/boost-1.53.0 -I../3rdparty/glog-0.3.3/src 
-I../3rdparty/leveldb-1.4/include -I../../3rdparty/libprocess/include 
-I../3rdparty/nvml-352.79 -I../3rdparty/picojson-1.3.0 
-I../3rdparty/protobuf-2.6.1/src -I../../3rdpa
{code}




[jira] [Deleted] (MESOS-5620) Add a new flag in master to define the scarce resources.

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler deleted MESOS-5620:
---


> Add a new flag in master to define the scarce resources.
> 
>
> Key: MESOS-5620
> URL: https://issues.apache.org/jira/browse/MESOS-5620
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Add a new flag to define the scarce resources; the scarce resources will be 
> excluded from DRF.




[jira] [Deleted] (MESOS-5621) Enabled calculateShare() to ignore the fairnessExcludeResourceNames

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler deleted MESOS-5621:
---


> Enabled calculateShare() to ignore the fairnessExcludeResourceNames
> ---
>
> Key: MESOS-5621
> URL: https://issues.apache.org/jira/browse/MESOS-5621
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Enable calculateShare() to ignore the fairnessExcludeResourceNames; the 
> fairnessExcludeResourceNames will be a member field of the sorter.




[jira] [Deleted] (MESOS-5623) Add test cases for scarce resources

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler deleted MESOS-5623:
---


> Add test cases for scarce resources
> ---
>
> Key: MESOS-5623
> URL: https://issues.apache.org/jira/browse/MESOS-5623
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Add some test cases for the scarce resources change.




[jira] [Deleted] (MESOS-5622) Update allocator to handle scarce resources

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler deleted MESOS-5622:
---


> Update allocator to handle scarce resources
> ---
>
> Key: MESOS-5622
> URL: https://issues.apache.org/jira/browse/MESOS-5622
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> The allocator should be updated to handle scarce resources; the idea is to 
> exclude scarce resources from all sorters in the allocator.




[jira] [Commented] (MESOS-5377) Improve DRF behavior with scarce resources.

2016-06-30 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358190#comment-15358190
 ] 

Benjamin Mahler commented on MESOS-5377:


Mitigations have been provided via a GPU framework capability (MESOS-5634) 
(which is, of course, GPU-specific) and by allowing operators to exclude resources 
from fair sharing (see MESOS-5758).

The GPU framework capability helps to reduce the likelihood that non-GPU 
workloads starve out GPU workloads that want to run on the GPU machines. There 
are caveats to this, for example:

(1) If a framework is non-cooperative, it may fill GPU machines with non-GPU 
workloads, and there is currently no revocation mechanism to help evict these 
to make room for the GPU workloads.

(2) A mixed-workload framework (one that runs both GPU and non-GPU workloads) 
cannot tell in general if an offer is from an agent with GPUs present, so it 
must use attributes to *guarantee* that it does not place non-GPU workloads on 
the GPU machine.

The fairness exclusion list allows the operator to ensure that the GPU 
allocation does not quickly dominate the share of the role.

> Improve DRF behavior with scarce resources.
> ---
>
> Key: MESOS-5377
> URL: https://issues.apache.org/jira/browse/MESOS-5377
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>Assignee: Guangya Liu
>
> The allocator currently uses the notion of Weighted [Dominant Resource 
> Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to 
> establish a linear notion of fairness across allocation roles.
> DRF behaves well for resources that are present within each machine in a 
> cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are 
> only present on a subset of machines in the cluster.
> Consider the behavior when there are the following agents in a cluster:
> 1000 agents with (cpus:4,mem:1024,disk:1024)
> 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)
> If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1 
> GPU will lead DRF to consider the role to have a 100% share of the cluster, 
> since it consumes 100% of the GPUs in the cluster. This framework will then 
> not receive any other offers.
> Among possible improvements, fairness could have an understanding of resource 
> packages. In a sense there is 1 GPU package that is competed on and 1000 
> non-GPU packages competed on, and ideally a role's consumption of the single 
> GPU package does not have a large effect on the role's access to the other 
> 1000 non-GPU packages.
> In the interim, we should consider having a recommended way to deal with 
> scarce resources in the current model.
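
The arithmetic behind the example above can be checked with a small sketch. It computes an unweighted dominant share (the largest fraction of any single resource type held by a role) for the cluster described in the issue. The function names are hypothetical, role weights are ignored, and this is not the allocator's actual code.

```python
from typing import Dict

Resources = Dict[str, float]


def cluster_totals() -> Resources:
    """Totals for the example cluster: 1000 plain agents plus 1 GPU agent."""
    agents = 1001
    return {
        "cpus": 4.0 * agents,
        "mem": 1024.0 * agents,
        "disk": 1024.0 * agents,
        "gpus": 1.0,
    }


def dominant_share(allocated: Resources, total: Resources) -> float:
    """Largest fraction of any one resource type held by the role."""
    fractions = [
        amount / total[name]
        for name, amount in allocated.items()
        if total.get(name, 0) > 0
    ]
    return max(fractions, default=0.0)


if __name__ == "__main__":
    total = cluster_totals()
    # A role holding the single GPU plus a little CPU and memory.
    gpu_role = {"gpus": 1.0, "cpus": 1.0, "mem": 128.0}
    # A role holding one whole non-GPU agent's CPU and memory.
    cpu_role = {"cpus": 4.0, "mem": 1024.0}
    print(dominant_share(gpu_role, total))
    print(dominant_share(cpu_role, total))
```

Under these assumptions the GPU role's share is driven entirely by the `gpus` term (1 of 1), while a role holding one whole non-GPU agent has a share of 1/1001, which is why the GPU role sorts last for further offers.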




[jira] [Updated] (MESOS-5623) Add test cases for scarce resources

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5623:
---

Removing from epic in favor of MESOS-5758.

> Add test cases for scarce resources
> ---
>
> Key: MESOS-5623
> URL: https://issues.apache.org/jira/browse/MESOS-5623
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Add some test cases for the scarce resources change.




[jira] [Updated] (MESOS-5621) Enabled calculateShare() to ignore the fairnessExcludeResourceNames

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5621:
---

Removing from epic in favor of MESOS-5758.

> Enabled calculateShare() to ignore the fairnessExcludeResourceNames
> ---
>
> Key: MESOS-5621
> URL: https://issues.apache.org/jira/browse/MESOS-5621
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Enable calculateShare() to ignore the fairnessExcludeResourceNames; the 
> fairnessExcludeResourceNames will be a member field of the sorter.




[jira] [Updated] (MESOS-5620) Add a new flag in master to define the scarce resources.

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5620:
---

Removing from epic in favor of MESOS-5758.

> Add a new flag in master to define the scarce resources.
> 
>
> Key: MESOS-5620
> URL: https://issues.apache.org/jira/browse/MESOS-5620
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> Add a new flag to define the scarce resources; the scarce resources will be 
> excluded from DRF.




[jira] [Updated] (MESOS-5622) Update allocator to handle scarce resources

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5622:
---

Removing from epic in favor of MESOS-5758.

> Update allocator to handle scarce resources
> ---
>
> Key: MESOS-5622
> URL: https://issues.apache.org/jira/browse/MESOS-5622
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> The allocator should be updated to handle scarce resources; the idea is to 
> exclude scarce resources from all sorters in the allocator.




[jira] [Created] (MESOS-5759) ProcessRemoteLinkTest.RemoteUseStaleLink and RemoteStaleLinkRelink are flaky

2016-06-30 Thread Joseph Wu (JIRA)

Joseph Wu created MESOS-5759:


 Summary: ProcessRemoteLinkTest.RemoteUseStaleLink and 
RemoteStaleLinkRelink are flaky
 Key: MESOS-5759
 URL: https://issues.apache.org/jira/browse/MESOS-5759
 Project: Mesos
  Issue Type: Bug
  Components: libprocess, test
Affects Versions: 1.0.0
Reporter: Joseph Wu
Assignee: Joseph Wu


{{ProcessRemoteLinkTest.RemoteUseStaleLink}} and 
{{ProcessRemoteLinkTest.RemoteStaleLinkRelink}} are failing occasionally with 
the error:
{code}
[ RUN  ] ProcessRemoteLinkTest.RemoteStaleLinkRelink
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0630 07:42:34.661110 1 process.cpp:1066] libprocess is initialized on 
172.17.0.2:56294 with 16 worker threads
E0630 07:42:34.666393 18765 process.cpp:2104] Failed to shutdown socket with fd 
7: Transport endpoint is not connected
/mesos/3rdparty/libprocess/src/tests/process_tests.cpp:1059: Failure
Value of: exitedPid.isPending()
  Actual: false
Expected: true
[  FAILED  ] ProcessRemoteLinkTest.RemoteStaleLinkRelink (56 ms)
{code}

There appears to be a race between establishing a socket connection and the 
test calling {{::shutdown}} on the socket.  Under some circumstances, the 
{{::shutdown}} may cause the future in {{SocketManager::link_connect}} to 
fail with an error and thereby trigger {{SocketManager::close}}.




[jira] [Created] (MESOS-5758) Add ability to exclude resources from fair sharing.

2016-06-30 Thread Benjamin Mahler (JIRA)

Benjamin Mahler created MESOS-5758:
--

 Summary: Add ability to exclude resources from fair sharing.
 Key: MESOS-5758
 URL: https://issues.apache.org/jira/browse/MESOS-5758
 Project: Mesos
  Issue Type: Improvement
  Components: allocation, master
Reporter: Benjamin Mahler
Assignee: Guangya Liu


Currently, the fair sharing implementation has some caveats when dealing with 
"scarce" resources:

http://www.mail-archive.com/dev@mesos.apache.org/msg35631.html
https://issues.apache.org/jira/browse/MESOS-5377

GPUs have been the first type of first-class "scarce" resource in the system, 
and consequently we'd like to introduce mechanisms to mitigate the "scarce" 
resource issues.

The first is a GPU framework capability in MESOS-5634. The second, in this 
ticket, is to provide the operator with the ability to more generally specify 
that some resource types (e.g. gpus) should not be fairly shared.




[jira] [Created] (MESOS-5757) Authorize orphaned tasks

2016-06-30 Thread Vinod Kone (JIRA)

Vinod Kone created MESOS-5757:
-

 Summary: Authorize orphaned tasks
 Key: MESOS-5757
 URL: https://issues.apache.org/jira/browse/MESOS-5757
 Project: Mesos
  Issue Type: Bug
Reporter: Vinod Kone


Currently, orphaned tasks are not filtered (i.e., using authorization) when a 
request is made to the /state endpoint. This is unexpected and inconsistent with 
how we filter un-orphaned tasks. 

This is tricky because the master, and hence the authorizer, does not have 
FrameworkInfos for these orphaned tasks until after the corresponding 
frameworks re-register.

One option is for the agent to include FrameworkInfos of all its tasks and 
executors in its re-registration message.




[jira] [Updated] (MESOS-3541) Add CMakeLists that builds the Mesos master

2016-06-30 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3541:
-
Shepherd: Joseph Wu  (was: Joris Van Remoortere)

> Add CMakeLists that builds the Mesos master
> ---
>
> Key: MESOS-3541
> URL: https://issues.apache.org/jira/browse/MESOS-3541
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: Srinivas
>  Labels: build, cmake, mesosphere
>
> Right now CMake builds only the agent. We want it to also build the master as 
> part of the libmesos binary.




[jira] [Updated] (MESOS-3011) Publish release documentation for major releases on website

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3011:
--
Shepherd: Vinod Kone

> Publish release documentation for major releases on website
> ---
>
> Key: MESOS-3011
> URL: https://issues.apache.org/jira/browse/MESOS-3011
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation, project website
>Reporter: Paul Brett
>Assignee: Tim Anderegg
>  Labels: documentation, mesosphere
>
> Currently, the website only provides a single version of the documentation.  
> We should publish documentation for each release on the website independently 
> (for example as https://mesos.apache.org/documentation/0.22/index.html, 
> https://mesos.apache.org/documentation/0.23/index.html) and make latest 
> redirect to the current version.




[jira] [Commented] (MESOS-5730) Sandbox access authorization should fail for non existing sandboxes.

2016-06-30 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358104#comment-15358104
 ] 

Vinod Kone commented on MESOS-5730:
---

Is anyone working on this blocker? What's the status?

> Sandbox access authorization should fail for non existing sandboxes.
> 
>
> Key: MESOS-5730
> URL: https://issues.apache.org/jira/browse/MESOS-5730
> Project: Mesos
>  Issue Type: Bug
>  Components: security
>Affects Versions: 1.0.0
>Reporter: Till Toenshoff
>Priority: Blocker
>  Labels: authorization, mesosphere, security
> Fix For: 1.0.0
>
>
> The local authorizer currently tries to authorize {{ACCESS_SANDBOX}} even if 
> no further object specification (e.g. {{framework_info}} or 
> {{executor_info}}) was specified or available at that time.
> Given that there is likely no sandbox available if there was no 
> {{executor_info}} provided, I think we should actually fail instead of allow 
> or deny (403).
> A failure would result in an IMHO more appropriate ServiceUnavailable 
> (503).  
> See 
> https://github.com/apache/mesos/commit/c8d67590064e35566274116cede9c6a733187b48#diff-dd692b1640b2628014feca01a94ba1e1R241




[jira] [Commented] (MESOS-5751) Inconsistent display in webui

2016-06-30 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358063#comment-15358063
 ] 

Vinod Kone commented on MESOS-5751:
---

What is the sha this cluster was built from? We had an error in the webui that 
was fixed a couple of days ago.

https://reviews.apache.org/r/49228/

> Inconsistent display in webui
> -
>
> Key: MESOS-5751
> URL: https://issues.apache.org/jira/browse/MESOS-5751
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Jay Guo
> Attachments: homepage.png
>
>
> To reproduce:
> 1. Launch master
> 2. Launch agent
> 3. Launch test-framework
> 4. Go to the webui
> We observe correct statistics on the left panel but no completed tasks on 
> the right side.




[jira] [Updated] (MESOS-5749) Have maven run in batch mode

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5749:
--
Assignee: Charles Allen

> Have maven run in batch mode
> 
>
> Key: MESOS-5749
> URL: https://issues.apache.org/jira/browse/MESOS-5749
> Project: Mesos
>  Issue Type: Improvement
>  Components: java api
>Reporter: Charles Allen
>Assignee: Charles Allen
>Priority: Minor
> Fix For: 1.0.0
>
>
> Currently when the Makefile invokes maven, it does not use the -B flag. This 
> ask is to have maven use the -B flag to make it friendly for automated build 
> scripts.




[jira] [Commented] (MESOS-5493) Implement GET_TASKS Call in v1 master API.

2016-06-30 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358049#comment-15358049
 ] 

Vinod Kone commented on MESOS-5493:
---

commit bfead176e4f78314f2368d99f670adcf1d3d3d47
Author: Vinod Kone 
Date:   Wed Jun 29 14:26:42 2016 -0700

Updated GetTasks v1 call in master.

The response now distinguishes between active tasks, completed tasks,
pending tasks and orphan tasks to make it easy for clients.
Consequently got rid of offset, limit and offset in the Call because
they don't make sense when we have multiple fields in the response.

Review: https://reviews.apache.org/r/49419


> Implement GET_TASKS Call in v1 master API.
> --
>
> Key: MESOS-5493
> URL: https://issues.apache.org/jira/browse/MESOS-5493
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Jay Guo
> Fix For: 1.0.0
>
>





[jira] [Commented] (MESOS-5492) Implement GET_FRAMEWORKS Call in v1 master API.

2016-06-30 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358048#comment-15358048
 ] 

Vinod Kone commented on MESOS-5492:
---

commit c0843b6585ecaca6848ce8d8dfa9c9b6a581b73c
Author: Vinod Kone 
Date:   Wed Jun 29 19:03:03 2016 -0700

Updated GetFrameworks v1 call in master.

This change removes tasks and executors information from
GetFrameworks call because we can get that from GetTasks
and GetExecutors (not yet implemented) calls.

Review: https://reviews.apache.org/r/49420


> Implement GET_FRAMEWORKS Call in v1 master API.
> ---
>
> Key: MESOS-5492
> URL: https://issues.apache.org/jira/browse/MESOS-5492
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: zhou xing
> Fix For: 1.0.0
>
>
> Review Request:
> https://reviews.apache.org/r/49136/
> &
> https://reviews.apache.org/r/49137/




[jira] [Created] (MESOS-5756) Cmake build system needs to regenerate protobufs when they are updated.

2016-06-30 Thread Joseph Wu (JIRA)

Joseph Wu created MESOS-5756:


 Summary: Cmake build system needs to regenerate protobufs when 
they are updated.
 Key: MESOS-5756
 URL: https://issues.apache.org/jira/browse/MESOS-5756
 Project: Mesos
  Issue Type: Improvement
  Components: build, cmake
Reporter: Joseph Wu
Priority: Minor


Generated header files, such as protobufs, are currently generated all at once 
in the CMake build system:
https://github.com/apache/mesos/blob/db8b0f16c1c8c6e683a4b788262f307a8bc218e0/cmake/MesosConfigure.cmake#L77-L80

This means, if a protobuf is changed, the CMake build system will not 
regenerate new protobufs unless you delete the generated {{/include}} directory.



Should be a trivial fix, as the CMake protobuf functions merely need to depend 
on the input file:
* 
https://github.com/apache/mesos/blob/db8b0f16c1c8c6e683a4b788262f307a8bc218e0/src/cmake/MesosProtobuf.cmake#L67
* 
https://github.com/apache/mesos/blob/db8b0f16c1c8c6e683a4b788262f307a8bc218e0/src/cmake/MesosProtobuf.cmake#L100



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (MESOS-5755) NVML headers are not installed as part of 3rdparty install with --enable-install-module-dependencies

2016-06-30 Thread Kevin Klues (JIRA)

Kevin Klues created MESOS-5755:
--

 Summary: NVML headers are not installed as part of 3rdparty 
install with --enable-install-module-dependencies
 Key: MESOS-5755
 URL: https://issues.apache.org/jira/browse/MESOS-5755
 Project: Mesos
  Issue Type: Bug
Reporter: Kevin Klues
Assignee: Kevin Klues


See description




[jira] [Commented] (MESOS-5754) CommandInfo.user not honored in docker containerizer

2016-06-30 Thread Michael Gummelt (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358002#comment-15358002
 ] 

Michael Gummelt commented on MESOS-5754:


> The workaround is to specify a CLI parameter: 

Assuming you're launching through marathon, yes

> CommandInfo.user not honored in docker containerizer
> 
>
> Key: MESOS-5754
> URL: https://issues.apache.org/jira/browse/MESOS-5754
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Michael Gummelt
>
> Repro by creating a framework that starts a task with CommandInfo.user set, 
> and observe that the dockerized executor is still running as the default 
> (e.g. root).
> cc [~kaysoky]




[jira] [Commented] (MESOS-5754) CommandInfo.user not honored in docker containerizer

2016-06-30 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357994#comment-15357994
 ] 

Joseph Wu commented on MESOS-5754:
--

I'd be curious if this has affected any users negatively.  If users have not 
noticed this, then they may be inadvertently relying on the incorrect behavior 
(of always running docker tasks as root).

The workaround is to specify a CLI parameter: 
https://github.com/apache/mesos/blob/db8b0f16c1c8c6e683a4b788262f307a8bc218e0/include/mesos/v1/mesos.proto#L1826-L1830
i.e.
{code}
"container" : {
  ...,
  "docker" : {
    ...,
    "parameters" : [{
      "key": "user",
      "value": "not-root"
    }]
  }
}
{code}

> CommandInfo.user not honored in docker containerizer
> 
>
> Key: MESOS-5754
> URL: https://issues.apache.org/jira/browse/MESOS-5754
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Michael Gummelt
>
> Repro by creating a framework that starts a task with CommandInfo.user set, 
> and observe that the dockerized executor is still running as the default 
> (e.g. root).
> cc [~kaysoky]




[jira] [Updated] (MESOS-3541) Add CMakeLists that builds the Mesos master

2016-06-30 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3541:
-
Assignee: Srinivas  (was: Alex Clemmer)

> Add CMakeLists that builds the Mesos master
> ---
>
> Key: MESOS-3541
> URL: https://issues.apache.org/jira/browse/MESOS-3541
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: Srinivas
>  Labels: build, cmake, mesosphere
>
> Right now CMake builds only the agent. We want it to also build the master as 
> part of the libmesos binary.




[jira] [Commented] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Yan Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357956#comment-15357956
 ] 

Yan Xu commented on MESOS-5752:
---

Yup, should be due to a race between the task creating the file and the test 
looking for it. Fixing it.

> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky
> -
>
> Key: MESOS-5752
> URL: https://issues.apache.org/jira/browse/MESOS-5752
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Centos 7
>Reporter: Jie Yu
>Assignee: Megha
>
> {noformat}
> [19:17:15] :   [Step 10/10] [ RUN  ] 
> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.084791 31223 cluster.cpp:155] 
> Creating default 'local' authorizer
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.089606 31223 leveldb.cpp:174] 
> Opened db in 4.713001ms
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.090878 31223 leveldb.cpp:181] 
> Compacted db in 1.253446ms
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.090898 31223 leveldb.cpp:196] 
> Created db iterator in 3553ns
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.090903 31223 leveldb.cpp:202] 
> Seeked to beginning of db in 599ns
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.090909 31223 leveldb.cpp:271] 
> Iterated through 0 keys in the db in 364ns
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.090920 31223 replica.cpp:779] 
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091115 31243 recover.cpp:451] 
> Starting replica recovery
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091217 31242 recover.cpp:477] 
> Replica is in EMPTY status
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091442 31243 replica.cpp:673] 
> Replica in EMPTY status received a broadcasted recover request from 
> (3210)@172.30.2.172:43264
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091569 31240 recover.cpp:197] 
> Received a recover response from a replica in EMPTY status
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091693 31241 recover.cpp:568] 
> Updating replica status to STARTING
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.091995 31237 master.cpp:382] 
> Master 9c6bf850-2a66-41f8-a0ad-13c674886778 (ip-172-30-2-172.mesosphere.io) 
> started on 172.30.2.172:43264
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092010 31237 master.cpp:384] Flags 
> at startup: --acls="" --agent_ping_timeout="15secs" 
> --agent_reregister_timeout="10mins" --allocation_interval="1secs" 
> --allocator="HierarchicalDRF" --authenticate_agents="true" 
> --authenticate_frameworks="true" --authenticate_http="true" 
> --authenticate_http_frameworks="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/BD92iQ/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --http_authenticators="basic" --http_framework_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
> --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
> --quiet="false" --recovery_agent_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/BD92iQ/master" 
> --zk_session_timeout="10secs"
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092118 31237 master.cpp:434] 
> Master only allowing authenticated frameworks to register
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092123 31237 master.cpp:448] 
> Master only allowing authenticated agents to register
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092126 31237 master.cpp:461] 
> Master only allowing authenticated HTTP frameworks to register
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092130 31237 credentials.hpp:37] 
> Loading credentials for authentication from '/tmp/BD92iQ/credentials'
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092205 31237 master.cpp:506] Using 
> default 'crammd5' authenticator
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092239 31237 master.cpp:578] Using 
> default 'basic' HTTP authenticator
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092298 31237 master.cpp:658] Using 
> default 'basic' HTTP framework authenticator
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092339 31237 master.cpp:705] 
> Authorization enabled
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092438 31239 
> whitelist_watcher.cpp:77] No whitelist given
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.092481 31244 hierarchical.cpp:142] 
> Initialized hierarchical allocator process
> [19:17:15]W:   [Step 10/10] I0630 19:17:15.093005 31243

[jira] [Updated] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Yan Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-5752:
--
Assignee: Megha

> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky
> -
>
> Key: MESOS-5752
> URL: https://issues.apache.org/jira/browse/MESOS-5752
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Centos 7
>Reporter: Jie Yu
>Assignee: Megha
>

[jira] [Created] (MESOS-5754) CommandInfo.user not honored in docker containerizer

2016-06-30 Thread Michael Gummelt (JIRA)

Michael Gummelt created MESOS-5754:
--

 Summary: CommandInfo.user not honored in docker containerizer
 Key: MESOS-5754
 URL: https://issues.apache.org/jira/browse/MESOS-5754
 Project: Mesos
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Michael Gummelt


To reproduce, create a framework that starts a task with CommandInfo.user set, 
and observe that the dockerized executor still runs as the default user (e.g. 
root).

cc [~kaysoky]




[jira] [Commented] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Yan Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357858#comment-15357858
 ] 

Yan Xu commented on MESOS-5752:
---

Sure. /cc [~megha.sharma]

> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky
> -
>
> Key: MESOS-5752
> URL: https://issues.apache.org/jira/browse/MESOS-5752
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Centos 7
>Reporter: Jie Yu
>

[jira] [Updated] (MESOS-5433) Add 'distcheck' target to CMake build

2016-06-30 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-5433:
-
Component/s: cmake

> Add 'distcheck' target to CMake build
> -
>
> Key: MESOS-5433
> URL: https://issues.apache.org/jira/browse/MESOS-5433
> Project: Mesos
>  Issue Type: Improvement
>  Components: cmake
>Reporter: Juan Larriba
>Assignee: Juan Larriba
>
> We should add a "distcheck" target to the makefiles created by the CMake 
> configuration. 
> This way, the entire test battery can be executed during CMake generation. 




[jira] [Updated] (MESOS-5709) Authorization for /roles

2016-06-30 Thread Joerg Schad (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad updated MESOS-5709:
---
Shepherd: Vinod Kone  (was: Adam B)

> Authorization for /roles
> 
>
> Key: MESOS-5709
> URL: https://issues.apache.org/jira/browse/MESOS-5709
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Joerg Schad
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> The /roles endpoint exposes the list of all roles and their weights, as well 
> as the list of all frameworkIds registered with each role. This is a superset 
> of the information exposed on GET /weights, which we already protect. We 
> should protect the data in /roles the same way.
> - Should we reuse VIEW_FRAMEWORK with role (from /state)?
> - Should we add a new VIEW_ROLE and adapt GET_WEIGHTS to use it?




[jira] [Commented] (MESOS-5749) Have maven run in batch mode

2016-06-30 Thread Charles Allen (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357809#comment-15357809
 ] 

Charles Allen commented on MESOS-5749:
--

discussion in https://reviews.apache.org/r/49422/

> Have maven run in batch mode
> 
>
> Key: MESOS-5749
> URL: https://issues.apache.org/jira/browse/MESOS-5749
> Project: Mesos
>  Issue Type: Improvement
>  Components: java api
>Reporter: Charles Allen
>Priority: Minor
>
> Currently, when the Makefile invokes maven, it does not use the -B flag. The 
> request is to have maven use the -B flag so that it is friendlier to 
> automated build scripts.




[jira] [Commented] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Jie Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357779#comment-15357779
 ] 

Jie Yu commented on MESOS-5752:
---

[~xujyan] Can you take a look? It was introduced in this patch:
https://github.com/apache/mesos/commit/f23ef4c158cf5ed69e6d90c2c678654cfa0b48a6

> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky
> -
>
> Key: MESOS-5752
> URL: https://issues.apache.org/jira/browse/MESOS-5752
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Centos 7
>Reporter: Jie Yu
>

[jira] [Updated] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5752:
--
Affects Version/s: 1.0.0

> ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky
> -
>
> Key: MESOS-5752
> URL: https://issues.apache.org/jira/browse/MESOS-5752
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Centos 7
>Reporter: Jie Yu
>

[jira] [Created] (MESOS-5752) ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky

2016-06-30 Thread Jie Yu (JIRA)

Jie Yu created MESOS-5752:
-

 Summary: ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint 
is flaky
 Key: MESOS-5752
 URL: https://issues.apache.org/jira/browse/MESOS-5752
 Project: Mesos
  Issue Type: Bug
 Environment: Centos 7
Reporter: Jie Yu


{noformat}
[19:17:15] : [Step 10/10] [ RUN  ] 
ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint
[19:17:15]W: [Step 10/10] I0630 19:17:15.084791 31223 cluster.cpp:155] 
Creating default 'local' authorizer
[19:17:15]W: [Step 10/10] I0630 19:17:15.089606 31223 leveldb.cpp:174] 
Opened db in 4.713001ms
[19:17:15]W: [Step 10/10] I0630 19:17:15.090878 31223 leveldb.cpp:181] 
Compacted db in 1.253446ms
[19:17:15]W: [Step 10/10] I0630 19:17:15.090898 31223 leveldb.cpp:196] 
Created db iterator in 3553ns
[19:17:15]W: [Step 10/10] I0630 19:17:15.090903 31223 leveldb.cpp:202] 
Seeked to beginning of db in 599ns
[19:17:15]W: [Step 10/10] I0630 19:17:15.090909 31223 leveldb.cpp:271] 
Iterated through 0 keys in the db in 364ns
[19:17:15]W: [Step 10/10] I0630 19:17:15.090920 31223 replica.cpp:779] 
Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
[19:17:15]W: [Step 10/10] I0630 19:17:15.091115 31243 recover.cpp:451] 
Starting replica recovery
[19:17:15]W: [Step 10/10] I0630 19:17:15.091217 31242 recover.cpp:477] 
Replica is in EMPTY status
[19:17:15]W: [Step 10/10] I0630 19:17:15.091442 31243 replica.cpp:673] 
Replica in EMPTY status received a broadcasted recover request from 
(3210)@172.30.2.172:43264
[19:17:15]W: [Step 10/10] I0630 19:17:15.091569 31240 recover.cpp:197] 
Received a recover response from a replica in EMPTY status
[19:17:15]W: [Step 10/10] I0630 19:17:15.091693 31241 recover.cpp:568] 
Updating replica status to STARTING
[19:17:15]W: [Step 10/10] I0630 19:17:15.091995 31237 master.cpp:382] 
Master 9c6bf850-2a66-41f8-a0ad-13c674886778 (ip-172-30-2-172.mesosphere.io) 
started on 172.30.2.172:43264
[19:17:15]W: [Step 10/10] I0630 19:17:15.092010 31237 master.cpp:384] Flags 
at startup: --acls="" --agent_ping_timeout="15secs" 
--agent_reregister_timeout="10mins" --allocation_interval="1secs" 
--allocator="HierarchicalDRF" --authenticate_agents="true" 
--authenticate_frameworks="true" --authenticate_http="true" 
--authenticate_http_frameworks="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/BD92iQ/credentials" 
--framework_sorter="drf" --help="false" --hostname_lookup="true" 
--http_authenticators="basic" --http_framework_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
--max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
--quiet="false" --recovery_agent_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/BD92iQ/master" 
--zk_session_timeout="10secs"
[19:17:15]W: [Step 10/10] I0630 19:17:15.092118 31237 master.cpp:434] 
Master only allowing authenticated frameworks to register
[19:17:15]W: [Step 10/10] I0630 19:17:15.092123 31237 master.cpp:448] 
Master only allowing authenticated agents to register
[19:17:15]W: [Step 10/10] I0630 19:17:15.092126 31237 master.cpp:461] 
Master only allowing authenticated HTTP frameworks to register
[19:17:15]W: [Step 10/10] I0630 19:17:15.092130 31237 credentials.hpp:37] 
Loading credentials for authentication from '/tmp/BD92iQ/credentials'
[19:17:15]W: [Step 10/10] I0630 19:17:15.092205 31237 master.cpp:506] Using 
default 'crammd5' authenticator
[19:17:15]W: [Step 10/10] I0630 19:17:15.092239 31237 master.cpp:578] Using 
default 'basic' HTTP authenticator
[19:17:15]W: [Step 10/10] I0630 19:17:15.092298 31237 master.cpp:658] Using 
default 'basic' HTTP framework authenticator
[19:17:15]W: [Step 10/10] I0630 19:17:15.092339 31237 master.cpp:705] 
Authorization enabled
[19:17:15]W: [Step 10/10] I0630 19:17:15.092438 31239 
whitelist_watcher.cpp:77] No whitelist given
[19:17:15]W: [Step 10/10] I0630 19:17:15.092481 31244 hierarchical.cpp:142] 
Initialized hierarchical allocator process
[19:17:15]W: [Step 10/10] I0630 19:17:15.093005 31243 leveldb.cpp:304] 
Persisting metadata (8 bytes) to leveldb took 1.280756ms
[19:17:15]W: [Step 10/10] I0630 19:17:15.093022 31243 replica.cpp:320] 
Persisted replica status to STARTING
[19:17:15]W: [Step 10/10] I0630 19:17:15.093035 31240 master.cpp:1971] The 
newly elected leader is master@172.30.2.172:43264 with id 
9c6bf850-2a66-41f8-a0ad-13c674886778
[19:17:15]W: [Step 10/10] I0630 19:17:15.093063 31240 master.cpp:1984] 
Elected as the leading master!

[jira] [Commented] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-06-30 Thread Alexander Rukletsov (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357761#comment-15357761
 ] 

Alexander Rukletsov commented on MESOS-5379:


Since authz is a major part of the 1.0 release, I think we should not add extra 
difficulty for operators by supplying misleading docs. However, I won't insist 
and bump it to blocker again if [~adam-mesos] (authz lead) and [~vinodkone] 
(release manager) decide otherwise.

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 1.0.0
>Reporter: Benjamin Bannier
>Priority: Critical
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with a 
> realm, these endpoints require authentication; otherwise they do not.
> To generate endpoint help we currently use the function 
> {{AUTHENTICATION}}, which injects the following into the help string:
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> Here {{iff}} documents a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through mesos 
> masters/agents, but possibly not when they are exposed through other 
> executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.
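
For illustration only, the two wordings discussed above could be selected by 
a single helper. This is a minimal sketch: the function name 
authenticationHelp and its bool parameter are hypothetical and are not the 
libprocess API; only the two help strings come from the issue text.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the help-string helper discussed in the issue.
// The name and signature are assumptions, not the libprocess API.
// strict == true  -> the current "iff" wording.
// strict == false -> the weaker wording proposed in the issue.
std::string authenticationHelp(bool strict)
{
  if (strict) {
    return "This endpoint requires authentication iff HTTP authentication "
           "is enabled.";
  }

  return "This endpoint supports authentication. If HTTP authentication is "
         "enabled, this endpoint may require authentication.";
}

// Usage check: each variant carries its distinguishing phrase.
void checkHelpStrings()
{
  assert(authenticationHelp(true).find("iff") != std::string::npos);
  assert(authenticationHelp(false).find("may require") != std::string::npos);
}
```

Under this sketch, an executable that embeds libprocess without Mesos-style 
authentication would pass false, while masters and agents could keep the 
stricter wording.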




[jira] [Commented] (MESOS-5709) Authorization for /roles

2016-06-30 Thread Joerg Schad (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357722#comment-15357722
 ] 

Joerg Schad commented on MESOS-5709:


(note that the regeneration of the endpoint documentation is missing, but 
seemingly there is supposed to be one big regeneration run before cutting the rc)

> Authorization for /roles
> 
>
> Key: MESOS-5709
> URL: https://issues.apache.org/jira/browse/MESOS-5709
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Joerg Schad
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> The /roles endpoint exposes the list of all roles and their weights, as well 
> as the list of all frameworkIds registered with each role. This is a superset 
> of the information exposed on GET /weights, which we already protect. We 
> should protect the data in /roles the same way.
> - Should we reuse VIEW_FRAMEWORK with role (from /state)?
> - Should we add a new VIEW_ROLE and adapt GET_WEIGHTS to use it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5743) Added a flag parser for hashset.

2016-06-30 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357707#comment-15357707
 ] 

Benjamin Mahler commented on MESOS-5743:


I changed this from a bug to an improvement.

> Added a flag parser for hashset.
> -
>
> Key: MESOS-5743
> URL: https://issues.apache.org/jira/browse/MESOS-5743
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Guangya Liu
>Assignee: Guangya Liu
> Fix For: 1.0.0
>
>
> We are introducing a new flag in master to exclude multiple resource 
> names from the sorter; it would be better to add a flag parser for hashset to 
> parse the flag for multiple excluded resource names.
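
As a rough illustration of what such a parser does (this is not the stout
implementation), the sketch below splits a comma-separated flag value into a
set. The function name `parse_hashset_flag`, the comma separator, and the
sample resource names are assumptions made for this example.

```python
def parse_hashset_flag(value, separator=","):
    """Split a flag value such as "gpus,ports" into a set of unique names."""
    names = set()
    for token in value.split(separator):
        token = token.strip()
        if token:  # skip empty tokens produced by "a,,b" or a trailing comma
            names.add(token)
    return names
```

Duplicates collapse automatically because the result is a set, which is the
main reason to parse into a hashset rather than a list.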



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5743) Added a flag parser for hashset.

2016-06-30 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5743:
---
Issue Type: Improvement  (was: Bug)

> Added a flag parser for hashset.
> -
>
> Key: MESOS-5743
> URL: https://issues.apache.org/jira/browse/MESOS-5743
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Guangya Liu
>Assignee: Guangya Liu
> Fix For: 1.0.0
>
>
> We are introducing a new flag in master to exclude multiple resource 
> names from the sorter; it would be better to add a flag parser for hashset to 
> parse the flag for multiple excluded resource names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5707) LocalAuthorizer should error if passed a GET_ENDPOINT ACL with an unhandled path

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5707:
--
Shepherd: Vinod Kone

> LocalAuthorizer should error if passed a GET_ENDPOINT ACL with an unhandled 
> path
> 
>
> Key: MESOS-5707
> URL: https://issues.apache.org/jira/browse/MESOS-5707
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Alexander Rojas
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Since GET_ENDPOINT_WITH_PATH doesn't (yet) work with any arbitrary path, we 
> should
> a) validate --acls and error if GET_ENDPOINT_WITH_PATH has a path object that 
> doesn't match an endpoint that uses this authz strategy.
> b) document exactly which endpoints support GET_ENDPOINT_WITH_PATH
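
A minimal sketch of the validation in (a), assuming the ACLs arrive as JSON
with a `get_endpoints` list whose entries carry `paths.values`. The set of
supported paths below is a placeholder (the real list is what (b) asks to
document), and the function name is hypothetical; this is not the
LocalAuthorizer code.

```python
import json

# Placeholder set of endpoints handled via GET_ENDPOINT_WITH_PATH (assumption).
SUPPORTED_PATHS = {"/logging/toggle", "/metrics/snapshot"}

def validate_get_endpoint_acls(acls_json, supported=SUPPORTED_PATHS):
    """Raise ValueError if any get_endpoints ACL names an unhandled path."""
    acls = json.loads(acls_json)
    for acl in acls.get("get_endpoints", []):
        for path in acl.get("paths", {}).get("values", []):
            if path not in supported:
                raise ValueError("unsupported GET_ENDPOINT path: " + path)
```

Failing at startup, rather than silently ignoring the ACL, keeps an operator
from believing an endpoint is protected when it is not.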



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5709) Authorization for /roles

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5709:
--
Priority: Blocker  (was: Minor)

> Authorization for /roles
> 
>
> Key: MESOS-5709
> URL: https://issues.apache.org/jira/browse/MESOS-5709
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Joerg Schad
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> The /roles endpoint exposes the list of all roles and their weights, as well 
> as the list of all frameworkIds registered with each role. This is a superset 
> of the information exposed on GET /weights, which we already protect. We 
> should protect the data in /roles the same way.
> - Should we reuse VIEW_FRAMEWORK with role (from /state)?
> - Should we add a new VIEW_ROLE and adapt GET_WEIGHTS to use it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5703) Authorize operator endpoints for Mesos 1.0

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5703:
--
Priority: Critical  (was: Blocker)

Removing the blocker for the epic in favor of adding blockers to the relevant 
tickets included in the epic.

> Authorize operator endpoints for Mesos 1.0
> --
>
> Key: MESOS-5703
> URL: https://issues.apache.org/jira/browse/MESOS-5703
> Project: Mesos
>  Issue Type: Epic
>  Components: security
>Reporter: Adam B
>Assignee: Adam B
>Priority: Critical
>  Labels: authorization, mesosphere, security
> Fix For: 1.0.0
>
>
> We've authorized many endpoints in our work on MESOS-4843 and MESOS-5150, but 
> we need to tie it all together into a cohesive story and document the 
> authorization model/strategy. This epic will collect issues to round out the 
> Mesos 1.0 authorization story.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5707) LocalAuthorizer should error if passed a GET_ENDPOINT ACL with an unhandled path

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5707:
--
Priority: Blocker  (was: Critical)

> LocalAuthorizer should error if passed a GET_ENDPOINT ACL with an unhandled 
> path
> 
>
> Key: MESOS-5707
> URL: https://issues.apache.org/jira/browse/MESOS-5707
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Alexander Rojas
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Since GET_ENDPOINT_WITH_PATH doesn't (yet) work with any arbitrary path, we 
> should
> a) validate --acls and error if GET_ENDPOINT_WITH_PATH has a path object that 
> doesn't match an endpoint that uses this authz strategy.
> b) document exactly which endpoints support GET_ENDPOINT_WITH_PATH



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5705) ZK credential is exposed in /flags and /state

2016-06-30 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5705:
--
Shepherd: Vinod Kone

> ZK credential is exposed in /flags and /state
> -
>
> Key: MESOS-5705
> URL: https://issues.apache.org/jira/browse/MESOS-5705
> Project: Mesos
>  Issue Type: Task
>  Components: master, security
>Reporter: Adam B
>Assignee: Alexander Rojas
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Mesos allows zk credentials to be embedded in the zk url, but exposes these 
> credentials in the /flags and /state endpoint. Even though /state is 
> authorized, it only filters out frameworks/tasks, so the top-level flags are 
> shown to any authenticated user.
> "zk": "zk://dcos_mesos_master:my_secret_password@127.0.0.1:2181/mesos",
> We need to find some way to hide this data, or even add a first-class 
> VIEW_FLAGS acl that applies to any endpoint that exposes flags.
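
One possible way to hide the data, sketched here only as an illustration and
not as the fix chosen for Mesos, is to drop the credential part of the URL
before it is rendered. The helper name `strip_zk_credentials` is made up for
this example.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_zk_credentials(url):
    """Return the ZooKeeper URL with any "user:password@" prefix removed."""
    parts = urlsplit(url)
    # Everything before the last "@" in the authority is the credential.
    _, _, hosts = parts.netloc.rpartition("@")
    return urlunsplit((parts.scheme, hosts, parts.path, parts.query, parts.fragment))
```

Applied to the value quoted above, this yields `zk://127.0.0.1:2181/mesos`;
a URL without credentials passes through unchanged.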



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5723) SSL-enabled libprocess will leak incoming links to forks

2016-06-30 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-5723:
-
Fix Version/s: 0.27.4
   0.28.3

> SSL-enabled libprocess will leak incoming links to forks
> 
>
> Key: MESOS-5723
> URL: https://issues.apache.org/jira/browse/MESOS-5723
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 0.24.0, 0.25.0, 0.26.0, 0.27.0, 0.28.0
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Blocker
>  Labels: libprocess, mesosphere, ssl
> Fix For: 0.28.3, 1.0.0, 0.27.4
>
>
> Encountered two different buggy behaviors that can be tracked down to the 
> same underlying problem.
> Repro #1 (non-crashy):
> (1) Start a master.  Doesn't matter if SSL is enabled or not.
> (2) Start an agent, with SSL enabled.  Downgrade support has the same 
> problem.  The master/agent {{link}} to one another.
> (3) Run a sleep task.  Keep this alive.  If you inspect FDs at this point, 
> you'll notice the task has inherited the {{link}} FD (master -> agent).
> (4) Restart the agent.  Due to (3), the master's {{link}} stays open.
> (5) Check master's logs for the agent's re-registration message.
> (6) Check the agent's logs for re-registration.  The message will not appear. 
>  The master is actually using the old {{link}} which is not connected to the 
> agent.
> 
> Repro #2 (crashy):
> (1) Start a master.  Doesn't matter if SSL is enabled or not.
> (2) Start an agent, with SSL enabled.  Downgrade support has the same problem.
> (3) Run ~100 sleep tasks one after the other, keeping them all alive.  Each task 
> links back to the agent.  Due to an FD leak, each task will inherit the 
> incoming links from all other actors...
> (4) At some point, the agent will run out of FDs and kernel panic.
> 
> It appears that the SSL socket {{accept}} call is missing {{os::nonblock}} 
> and {{os::cloexec}} calls:
> https://github.com/apache/mesos/blob/4b91d936f50885b6a66277e26ea3c32fe942cf1a/3rdparty/libprocess/src/libevent_ssl_socket.cpp#L794-L806
> For reference, here's {{poll}} socket's {{accept}}:
> https://github.com/apache/mesos/blob/4b91d936f50885b6a66277e26ea3c32fe942cf1a/3rdparty/libprocess/src/poll_socket.cpp#L53-L75
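
For readers unfamiliar with the two calls: on POSIX they amount to ordinary
`fcntl` flag updates on the accepted descriptor. The sketch below shows the
equivalent operations in Python; the helper names are invented for this
example and it is not the libprocess code.

```python
import fcntl
import os

def set_cloexec_and_nonblock(fd):
    """Mark fd close-on-exec (not inherited across exec) and non-blocking."""
    fd_flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, fd_flags | fcntl.FD_CLOEXEC)
    status_flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, status_flags | os.O_NONBLOCK)

def flags_of(fd):
    """Return (is_cloexec, is_nonblocking) for fd."""
    return (bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC),
            bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK))
```

Without the close-on-exec flag, every forked and exec'ed task keeps a copy of
the accepted socket, which matches the inherited {{link}} FDs seen in both
repros.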



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5748) Potential segfault in `link` and `send` when linking to a remote process

2016-06-30 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-5748:
-
Fix Version/s: 0.27.4
   0.28.3

> Potential segfault in `link` and `send` when linking to a remote process
> 
>
> Key: MESOS-5748
> URL: https://issues.apache.org/jira/browse/MESOS-5748
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 0.22.0, 0.23.0, 0.24.0, 0.25.0, 0.26.0, 0.27.0, 0.28.0
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: libprocess, mesosphere
> Fix For: 0.28.3, 1.0.0, 0.27.4
>
>
> There is a race in the SocketManager, between a remote {{link}} and 
> disconnection of the underlying socket.
> We potentially segfault here: 
> https://github.com/apache/mesos/blob/215e79f571a989e998488077d713c28c7528926e/3rdparty/libprocess/src/process.cpp#L1512
> {{\*socket}} dereferences the shared pointer underpinning the {{Socket*}} 
> object.  However, the code above this line actually has ownership of the 
> pointer:
> https://github.com/apache/mesos/blob/215e79f571a989e998488077d713c28c7528926e/3rdparty/libprocess/src/process.cpp#L1494-L1499
> If the socket dies during the link, the {{ignore_recv_data}} may delete the 
> Socket underneath {{link}}:
> https://github.com/apache/mesos/blob/215e79f571a989e998488077d713c28c7528926e/3rdparty/libprocess/src/process.cpp#L1399-L1411
> 
> The same race exists for {{send}}.
> This race was discovered while running a new test in repetition:
> https://reviews.apache.org/r/49175/
> On OSX, I hit the race consistently every 500-800 repetitions:
> {code}
> 3rdparty/libprocess/libprocess-tests 
> --gtest_filter="ProcessRemoteLinkTest.RemoteLink"  --gtest_break_on_failure 
> --gtest_repeat=1000
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5576) Masters may drop the first message they send between masters after a network partition

2016-06-30 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5576:
---
Shepherd: Benjamin Mahler

> Masters may drop the first message they send between masters after a network 
> partition
> --
>
> Key: MESOS-5576
> URL: https://issues.apache.org/jira/browse/MESOS-5576
> Project: Mesos
>  Issue Type: Improvement
>  Components: leader election, master, replicated log
>Affects Versions: 0.28.2
> Environment: Observed in an OpenStack environment where each master 
> lives on a separate VM.
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: mesosphere
> Fix For: 1.0.0
>
>
> We observed the following situation in a cluster of five masters:
> || Time || Master 1 || Master 2 || Master 3 || Master 4 || Master 5 ||
> | 0 | Follower | Follower | Follower | Follower | Leader |
> | 1 | Follower | Follower | Follower | Follower || Partitioned from cluster by downing this VM's network ||
> | 2 || Elected Leader by ZK | Voting | Voting | Voting | Suicides due to lost leadership |
> | 3 | Performs consensus | Replies to leader | Replies to leader | Replies to leader | Still down |
> | 4 | Performs writing | Acks to leader | Acks to leader | Acks to leader | Still down |
> | 5 | Leader | Follower | Follower | Follower | Still down |
> | 6 | Leader | Follower | Follower | Follower | Comes back up |
> | 7 | Leader | Follower | Follower | Follower | Follower |
> | 8 || Partitioned in the same way as Master 5 | Follower | Follower | Follower | Follower |
> | 9 | Suicides due to lost leadership || Elected Leader by ZK | Follower | Follower | Follower |
> | 10 | Still down | Performs consensus | Replies to leader | Replies to leader || Doesn't get the message! ||
> | 11 | Still down | Performs writing | Acks to leader | Acks to leader || Acks to leader ||
> | 12 | Still down | Leader | Follower | Follower | Follower |
> Master 2 sends a series of messages to the recently-restarted Master 5.  The 
> first message is dropped, but subsequent messages are not dropped.
> This appears to be due to a stale link between the masters.  Before leader 
> election, the replicated log actors create a network watcher, which adds 
> links to masters that join the ZK group:
> https://github.com/apache/mesos/blob/7a23d0da817be4e8f68d96f524cecf802431033c/src/log/network.hpp#L157-L159
> This link does not appear to break (Master 2 -> 5) when Master 5 goes down, 
> perhaps due to how the network partition was induced (in the hypervisor 
> layer, rather than in the VM itself).
> When Master 2 tries to send a {{PromiseRequest}} to Master 5, we do not 
> observe the [expected log 
> message|https://github.com/apache/mesos/blob/7a23d0da817be4e8f68d96f524cecf802431033c/src/log/replica.cpp#L493-L494].
> Instead, we see a log line in Master 2:
> {code}
> process.cpp:2040] Failed to shutdown socket with fd 27: Transport endpoint is 
> not connected
> {code}
> The broken link is removed by the libprocess {{socket_manager}} and the 
> following {{WriteRequest}} from Master 2 to Master 5 succeeds via a new 
> socket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5704) Fine-grained authorization on /frameworks

2016-06-30 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5704:
---
Shepherd: Vinod Kone

> Fine-grained authorization on /frameworks
> -
>
> Key: MESOS-5704
> URL: https://issues.apache.org/jira/browse/MESOS-5704
> Project: Mesos
>  Issue Type: Task
>  Components: master, security
>Reporter: Adam B
>Assignee: Alexander Rojas
>Priority: Critical
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Even if ACLs were defined for the actions VIEW_FRAMEWORKS,
> VIEW_EXECUTORS and VIEW_TASKS, the data these actions were
> supposed to protect could still leak through the master's
> /frameworks endpoint, since it didn't enable any authorization
> mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5706) GET_ENDPOINT_WITH_PATH authz doesn't make sense for /flags

2016-06-30 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5706:
---
Shepherd: Adam B

> GET_ENDPOINT_WITH_PATH authz doesn't make sense for /flags
> --
>
> Key: MESOS-5706
> URL: https://issues.apache.org/jira/browse/MESOS-5706
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Adam B
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> The master or agent flags are exposed in /state as well as /flags, so any 
> user who wants to disable/control access to the flags likely intends to 
> control access to flags no matter what endpoint exposes them. As such, /flags 
> is a poor candidate for GET_ENDPOINT_WITH_PATH authz, since we care more 
> about protecting the flag data than the specific endpoint path.
> We should remove the GET_ENDPOINT authz from master and agent /flags until we 
> can come up with a better solution, perhaps a first-class VIEW_FLAGS acl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5749) Have maven run in batch mode

2016-06-30 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357542#comment-15357542
 ] 

haosdent commented on MESOS-5749:
-

Hi [~drcrallen], could you elaborate on why non-interactive mode is necessary 
here? I don't think we require any user input during the maven build.

> Have maven run in batch mode
> 
>
> Key: MESOS-5749
> URL: https://issues.apache.org/jira/browse/MESOS-5749
> Project: Mesos
>  Issue Type: Improvement
>  Components: java api
>Reporter: Charles Allen
>Priority: Minor
>
> Currently when the Makefile invokes maven, it does not use the -B flag. This 
> ask is to have maven use the -B flag to make it friendly for automated build 
> scripts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-4833) Poor allocator performance with labeled resources and/or persistent volumes

2016-06-30 Thread Zhitao Li (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357491#comment-15357491
 ] 

Zhitao Li commented on MESOS-4833:
--

I'm sorry I accidentally clicked on something w/o realizing it. Please ignore 
it.

> Poor allocator performance with labeled resources and/or persistent volumes
> ---
>
> Key: MESOS-4833
> URL: https://issues.apache.org/jira/browse/MESOS-4833
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Neil Conway
>Priority: Blocker
>  Labels: mesosphere, resources
> Fix For: 0.28.0
>
>
> Modifying the {{HierarchicalAllocator_BENCHMARK_Test.ResourceLabels}} 
> benchmark from https://reviews.apache.org/r/43686/ to use distinct labels 
> between different slaves, performance regresses from ~2 seconds to ~3 
> minutes. The culprit seems to be the way in which the allocator merges 
> together resources; reserved resource labels (or persistent volume IDs) 
> inhibit merging, which causes performance to be much worse.
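
To see why distinct labels hurt, consider a toy model (not the Mesos
`Resources` class) in which adding a resource scans the existing entries for
one with identical metadata and merges into it. With shared metadata the
collection stays at one entry; with a distinct label per agent it grows by one
entry per agent, so each later addition scans a longer list. The class and
field names are made up for the illustration.

```python
class ToyResources:
    """Toy resource collection: entries with equal (name, label) are merged."""

    def __init__(self):
        self.entries = []  # each entry is [name, label, amount]

    def add(self, name, label, amount):
        # Linear scan for an entry that the new resource can be merged into.
        for entry in self.entries:
            if entry[0] == name and entry[1] == label:
                entry[2] += amount
                return
        self.entries.append([name, label, amount])

def entries_after(num_agents, distinct_labels):
    """Count entries after summing one cpu from each of num_agents agents."""
    total = ToyResources()
    for agent in range(num_agents):
        label = "agent-%d" % agent if distinct_labels else None
        total.add("cpus", label, 1.0)
    return len(total.entries)
```

In this model, summing N agents costs O(N) scans when everything merges and
O(N^2) when nothing does, which is at least consistent with the jump from
seconds to minutes reported above.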



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-4833) Poor allocator performance with labeled resources and/or persistent volumes

2016-06-30 Thread Neil Conway (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357481#comment-15357481
 ] 

Neil Conway commented on MESOS-4833:


Hi [~zhitao] -- this issue was fixed a few months ago. If you believe there is 
more work to do here, can you clarify? Thanks.

> Poor allocator performance with labeled resources and/or persistent volumes
> ---
>
> Key: MESOS-4833
> URL: https://issues.apache.org/jira/browse/MESOS-4833
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Zhitao Li
>Priority: Blocker
>  Labels: mesosphere, resources
> Fix For: 0.28.0
>
>
> Modifying the {{HierarchicalAllocator_BENCHMARK_Test.ResourceLabels}} 
> benchmark from https://reviews.apache.org/r/43686/ to use distinct labels 
> between different slaves, performance regresses from ~2 seconds to ~3 
> minutes. The culprit seems to be the way in which the allocator merges 
> together resources; reserved resource labels (or persistent volume IDs) 
> inhibit merging, which causes performance to be much worse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (MESOS-4833) Poor allocator performance with labeled resources and/or persistent volumes

2016-06-30 Thread Zhitao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li reassigned MESOS-4833:


Assignee: Zhitao Li  (was: Neil Conway)

> Poor allocator performance with labeled resources and/or persistent volumes
> ---
>
> Key: MESOS-4833
> URL: https://issues.apache.org/jira/browse/MESOS-4833
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Zhitao Li
>Priority: Blocker
>  Labels: mesosphere, resources
> Fix For: 0.28.0
>
>
> Modifying the {{HierarchicalAllocator_BENCHMARK_Test.ResourceLabels}} 
> benchmark from https://reviews.apache.org/r/43686/ to use distinct labels 
> between different slaves, performance regresses from ~2 seconds to ~3 
> minutes. The culprit seems to be the way in which the allocator merges 
> together resources; reserved resource labels (or persistent volume IDs) 
> inhibit merging, which causes performance to be much worse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5702) CNI documentation example is not explicit enough about external plugins

2016-06-30 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5702:
--
Assignee: Philip Winder

> CNI documentation example is not explicit enough about external plugins
> ---
>
> Key: MESOS-5702
> URL: https://issues.apache.org/jira/browse/MESOS-5702
> Project: Mesos
>  Issue Type: Documentation
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>Assignee: Philip Winder
>
> I'm testing Mesos 1.0.0-rc1 with Weave CNI. When I switched back to the CNI 
> example stated in the docs and restarted mesos-slave, I received a strange 
> error about not being able to find hadoop.
> I think that it's related to this issue: 
> https://issues.apache.org/jira/browse/MESOS-5669
> I thought I'd log the issue, but if it has been fixed by the issue above, 
> feel free to close.
> The setup, state and logs can be found here: 
> https://gist.github.com/philwinder/8f4c652723fa5c374b86a5e440bf4330



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5401) Add ability to inject a Volume of Nvidia GPU-related libraries into a docker container.

2016-06-30 Thread Kevin Klues (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357288#comment-15357288
 ] 

Kevin Klues commented on MESOS-5401:


The crux of the issue is that we need to be able to consolidate all of the 
Nvidia binaries/libraries into a single volume that we inject into a docker 
container. The goal is to let users leverage the nvidia Docker images 
(https://hub.docker.com/r/nvidia/) without any added effort on their part. 
Using docker, they are able to launch containers from these images by simply 
running `nvidia-docker run ...` (i.e. they are unaware that a magic volume is 
being injected on their behalf). On Mesos we want the experience to be similar.

The information at the link below should make things a bit more clear as to why 
this is necessary:

https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver.

We originally planned on building this functionality as an isolator module, but 
there are some limitations with the current isolator interface that prohibit 
us from doing this properly. Building it as an isolator module would also mean 
that it couldn't be shared by the docker containerizer (which we plan to add 
support for in the future).

> Add ability to inject a Volume of Nvidia GPU-related libraries into a docker 
> container.
> ---
>
> Key: MESOS-5401
> URL: https://issues.apache.org/jira/browse/MESOS-5401
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>
> In order to support Nvidia GPUs with docker containers in Mesos, we need to 
> be able to consolidate all Nvidia libraries into a common volume and inject 
> that volume into the container.
> More info on why this is necessary here: 
> https://github.com/NVIDIA/nvidia-docker/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-4659) Avoid leaving orphan task after framework failure + master failover

2016-06-30 Thread Neil Conway (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4659:
---
Summary: Avoid leaving orphan task after framework failure + master 
failover  (was: Consider how to handle orphaned tasks after master failover)

> Avoid leaving orphan task after framework failure + master failover
> ---
>
> Key: MESOS-4659
> URL: https://issues.apache.org/jira/browse/MESOS-4659
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Neil Conway
>  Labels: failover, mesosphere
>
> If a framework becomes disconnected from the master, its tasks are killed 
> after waiting for {{failover_timeout}}.
> However, if a master failover occurs but a framework never reconnects to the 
> new master, we never kill any of the tasks associated with that framework. 
> These tasks remain orphaned and presumably would need to be manually removed 
> by the operator.
> We should consider whether to kill such orphaned tasks automatically, likely 
> after waiting for some (framework-configurable?) timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5487) Implement LIST_FILES Call in v1 master API.

2016-06-30 Thread Abhishek Dasgupta (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357118#comment-15357118
 ] 

Abhishek Dasgupta commented on MESOS-5487:
--

Discarding the old reviews due to a complete reorganization and design changes.

The new reviews are:
https://reviews.apache.org/r/49443/
https://reviews.apache.org/r/49444/
https://reviews.apache.org/r/49445/
https://reviews.apache.org/r/49446/
https://reviews.apache.org/r/49447/
https://reviews.apache.org/r/49448/

> Implement LIST_FILES Call in v1 master API.
> ---
>
> Key: MESOS-5487
> URL: https://issues.apache.org/jira/browse/MESOS-5487
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Abhishek Dasgupta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5514) Implement LIST_FILES Call in v1 agent API.

2016-06-30 Thread Abhishek Dasgupta (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357119#comment-15357119
 ] 

Abhishek Dasgupta commented on MESOS-5514:
--

Discarding my old reviews due to a complete reorganization and design changes.

The new reviews are:
https://reviews.apache.org/r/49443/
https://reviews.apache.org/r/49444/
https://reviews.apache.org/r/49445/
https://reviews.apache.org/r/49446/
https://reviews.apache.org/r/49447/
https://reviews.apache.org/r/49448/

> Implement LIST_FILES Call in v1 agent API.
> --
>
> Key: MESOS-5514
> URL: https://issues.apache.org/jira/browse/MESOS-5514
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Abhishek Dasgupta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5702) CNI documentation example is not explicit enough about external plugins

2016-06-30 Thread Philip Winder (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356937#comment-15356937
 ] 

Philip Winder commented on MESOS-5702:
--

Thanks. I didn't receive an email response. Submitted.

> CNI documentation example is not explicit enough about external plugins
> ---
>
> Key: MESOS-5702
> URL: https://issues.apache.org/jira/browse/MESOS-5702
> Project: Mesos
>  Issue Type: Documentation
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> I'm testing Mesos 1.0.0-rc1 with Weave CNI. When I switched back to the CNI 
> example stated in the docs and restarted mesos-slave, I received a strange 
> error about not being able to find hadoop.
> I think that it's related to this issue: 
> https://issues.apache.org/jira/browse/MESOS-5669
> I thought I'd log the issue, but if it has been fixed by the issue above, 
> feel free to close.
> The setup, state and logs can be found here: 
> https://gist.github.com/philwinder/8f4c652723fa5c374b86a5e440bf4330



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5702) CNI documentation example is not explicit enough about external plugins

2016-06-30 Thread Philip Winder (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356765#comment-15356765
 ] 

Philip Winder commented on MESOS-5702:
--

Jie,
Following the (incredibly complicated - why should I have to ask permission to 
submit a PR!!) instructions at 
http://mesos.apache.org/documentation/latest/submitting-a-patch/, I've sent a 
mail to d...@mesos.apache.org to try to assign myself to this task. But I 
haven't had a reply.

Is this necessary, or can I just go ahead and submit straight to the 
reviewboard?

Thanks.

> CNI documentation example is not explicit enough about external plugins
> ---
>
> Key: MESOS-5702
> URL: https://issues.apache.org/jira/browse/MESOS-5702
> Project: Mesos
>  Issue Type: Documentation
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> I'm testing Mesos 1.0.0-rc1 with Weave CNI. When I switched back to the CNI 
> example stated in the docs and restarted mesos-slave, I received a strange 
> error about not being able to find hadoop.
> I think that it's related to this issue: 
> https://issues.apache.org/jira/browse/MESOS-5669
> I thought I'd log the issue, but if it has been fixed by the issue above, 
> feel free to close.
> The setup, state and logs can be found here: 
> https://gist.github.com/philwinder/8f4c652723fa5c374b86a5e440bf4330



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-5751) Inconsistent display in webui

2016-06-30 Thread Jay Guo (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5751:
---
Attachment: homepage.png

> Inconsistent display in webui
> -
>
> Key: MESOS-5751
> URL: https://issues.apache.org/jira/browse/MESOS-5751
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Jay Guo
> Attachments: homepage.png
>
>
> To reproduce:
> 1. Launch master
> 2. Launch agent
> 3. Launch test-framework
> 4. Go to the webui
> We observe correct statistics on the left panel but no completed tasks on 
> right side.




[jira] [Created] (MESOS-5751) Inconsistent display in webui

2016-06-30 Thread Jay Guo (JIRA)

Jay Guo created MESOS-5751:
--

 Summary: Inconsistent display in webui
 Key: MESOS-5751
 URL: https://issues.apache.org/jira/browse/MESOS-5751
 Project: Mesos
  Issue Type: Bug
  Components: webui
Reporter: Jay Guo


To reproduce:
1. Launch master
2. Launch agent
3. Launch test-framework
4. Go to the webui

We observe correct statistics on the left panel but no completed tasks on the 
right side.




[jira] [Commented] (MESOS-5401) Add ability to inject a Volume of Nvidia GPU-related libraries into a docker container.

2016-06-30 Thread Sunzhe (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356662#comment-15356662
 ] 

Sunzhe commented on MESOS-5401:
---

Does this issue mean taking the Nvidia GPU-related libraries as a volume and 
mounting that volume into a docker container?

> Add ability to inject a Volume of Nvidia GPU-related libraries into a docker 
> container.
> ---
>
> Key: MESOS-5401
> URL: https://issues.apache.org/jira/browse/MESOS-5401
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>
> In order to support Nvidia GPUs with docker containers in Mesos, we need to 
> be able to consolidate all Nvidia libraries into a common volume and inject 
> that volume into the container.
> More info on why this is necessary here: 
> https://github.com/NVIDIA/nvidia-docker/




[jira] [Commented] (MESOS-5700) Benchmark for Resource class (protobuf vs. C++)

2016-06-30 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356615#comment-15356615
 ] 

Klaus Ma commented on MESOS-5700:
-

Here's the benchmark for {{operator+}} and {{operator+=}}:

{code}
[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/5
Added 100 resources (cpus:1) in 437384us
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/5 (438 ms)
[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/11
Added 100 resources (cpus:1;mem:2) in 826587us
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/11 (826 ms)
[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/17
Added 100 resources (cpus:1;ports:[1-100]) in 1.944934secs
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_AddAndAssign/17 (1945 ms)

[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/5
Added 100 resources (cpus:1) in 1.368948secs
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/5 (1369 ms)
[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/11
Added 100 resources (cpus:1;mem:2) in 2.734078secs
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/11 (2734 ms)
[ RUN  ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/17
Added 100 resources (cpus:1;ports:[1-100]) in 4.410165secs
[   OK ] ResourcesOperatorCount/Resources_BENCHMARK_Test.Operator_Add/17 (4410 ms)
{code}
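
To make the two sets of timings easier to compare, they can be reduced to ratios. The sketch below is illustrative only: it is plain Python, not part of the Mesos test suite, the helper name to_seconds and the dictionary names are hypothetical, and the durations are copied by hand from the log above.

```python
# Durations copied verbatim from the benchmark log; the keys are the
# resource strings printed in each "Added ..." line.
ADD_AND_ASSIGN = {
    "cpus:1": "437384us",
    "cpus:1;mem:2": "826587us",
    "cpus:1;ports:[1-100]": "1.944934secs",
}
ADD = {
    "cpus:1": "1.368948secs",
    "cpus:1;mem:2": "2.734078secs",
    "cpus:1;ports:[1-100]": "4.410165secs",
}

# Unit suffixes seen in the log, mapped to their size in seconds.
UNITS = {"us": 1e-6, "ms": 1e-3, "secs": 1.0}


def to_seconds(text):
    """Convert a duration such as '437384us' or '1.944934secs' to seconds."""
    for suffix, scale in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * scale
    raise ValueError("unrecognised duration: " + text)


# Ratio of operator+ time to operator+= time for each resource string.
ratios = {
    name: to_seconds(ADD[name]) / to_seconds(ADD_AND_ASSIGN[name])
    for name in ADD
}

for name, ratio in ratios.items():
    print("{}: operator+ / operator+= = {:.2f}".format(name, ratio))
```

On these figures, {{operator+}} takes somewhere between about two and a bit over three times as long as {{operator+=}} for the same input.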

> Benchmark for Resource class (protobuf vs. C++)
> ---
>
> Key: MESOS-5700
> URL: https://issues.apache.org/jira/browse/MESOS-5700
> Project: Mesos
>  Issue Type: Bug
>Reporter: Klaus Ma
>Assignee: Klaus Ma
>
> Add benchmark of Resource class for Allocation Performance.



