[jira] [Commented] (MESOS-6918) Prometheus exporter endpoints for metrics

2017-10-06 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195412#comment-16195412
 ] 

James Peach commented on MESOS-6918:


Summary from our discussion:

- retain the existing {{Timer}} value that holds the duration of the last sample
- capture total duration (monotonic sum) for {{Timer}}s in their time series
- capture total sample count for {{Timer}}s in their time series
- replace the {{Semantics}} enum with a {{monotonic}} marker (enum or bool or 
something)
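
The four points above can be sketched as follows. This is only an illustration of the summary, not the Mesos implementation: the class name {{TimerSeries}}, the series name suffixes and the millisecond unit are assumptions made for the example. The only external convention it relies on is the {{# TYPE <name> <counter|gauge>}} line of the Prometheus text exposition format.

```python
from dataclasses import dataclass


@dataclass
class TimerSeries:
    """Hypothetical model of the three values discussed for a Timer."""

    name: str
    last_ms: float = 0.0   # duration of the last sample (not monotonic)
    total_ms: float = 0.0  # monotonic sum of all sample durations
    count: int = 0         # monotonic number of samples

    def record(self, duration_ms: float) -> None:
        """Record one sample and update all three values."""
        if duration_ms < 0:
            raise ValueError("duration must be non-negative")
        self.last_ms = duration_ms
        self.total_ms += duration_ms
        self.count += 1

    def series(self):
        """Return (series name, value, monotonic) triples.

        The boolean stands in for the proposed 'monotonic' marker.
        """
        return [
            (f"{self.name}_ms", self.last_ms, False),
            (f"{self.name}_ms_total", self.total_ms, True),
            (f"{self.name}_count", float(self.count), True),
        ]


def render(timer: TimerSeries) -> str:
    """Render the timer in Prometheus text exposition format."""
    lines = []
    for name, value, monotonic in timer.series():
        kind = "counter" if monotonic else "gauge"
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

A single boolean is enough here because the exporter only needs to decide between the two exposition types; an enum would work equally well.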

> Prometheus exporter endpoints for metrics
> -
>
> Key: MESOS-6918
> URL: https://issues.apache.org/jira/browse/MESOS-6918
> Project: Mesos
>  Issue Type: Bug
>  Components: statistics
>Reporter: James Peach
>Assignee: James Peach
>
> There are a couple of [Prometheus|https://prometheus.io] metrics exporters 
> for Mesos, of varying quality. Since the Mesos stats system actually knows 
> about statistics data types and semantics, and Mesos has reasonable HTTP 
> support we could add Prometheus metrics endpoints to directly expose 
> statistics in [Prometheus wire 
> format|https://prometheus.io/docs/instrumenting/exposition_formats/], 
> removing the need for operators to run separate exporter processes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7990) Support systemd named hierarchy (name=systemd) for Mesos Containerizer.

2017-10-06 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195300#comment-16195300
 ] 

Jie Yu commented on MESOS-7990:
---

Some sample output:
{noformat}
[jie@core-dev ~]$ systemd-cgls
|-1 /usr/lib/systemd/systemd --system --deserialize 20
|-mesos
|  |-8282b91a-5724-4964-a623-7c6bd68ff4ad
|  |-31737 /usr/libexec/mesos/mesos-containerizer launch
|  |-31739 mesos-default-executor --launcher_dir=/usr/libexec/mesos
|  |-mesos
| |-8555f4af-fa4f-4c9c-aeb3-0c9f72e6a2de
|   |-31791 /usr/libexec/mesos/mesos-containerizer launch
|   |-31793 sleep 1000
{noformat}

> Support systemd named hierarchy (name=systemd) for Mesos Containerizer.
> ---
>
> Key: MESOS-7990
> URL: https://issues.apache.org/jira/browse/MESOS-7990
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Jie Yu
>Assignee: Jie Yu
>
> Similar to docker's cgroupfs cgroup driver, we should create cgroups under 
> /sys/fs/cgroup/systemd (if it exists), and move container pid into the 
> corresponding cgroup ( /sys/fs/cgroup/systemd/mesos/).
> This can give us a bunch of benefits:
> 1) systemd-cgls can list mesos containers
> 2) systemd-cgtop can show stats for mesos containers
> ...
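
A minimal sketch of the idea in the description, assuming a cgroup v1 named hierarchy where writing a pid to {{cgroup.procs}} moves the process into that cgroup. The helper names and the per-container subdirectory layout (taken from the sample output above) are assumptions, not the containerizer's actual code. The {{root}} parameter would be {{/sys/fs/cgroup/systemd}} on a real host and needs root privileges there; a temporary directory can stand in for it when trying the code out.

```python
import os


def container_cgroup(root: str, container_id: str) -> str:
    """Path of the (hypothetical) per-container cgroup under <root>/mesos."""
    return os.path.join(root, "mesos", container_id)


def assign_pid(root: str, container_id: str, pid: int) -> str:
    """Create the container cgroup if needed and write the pid into it.

    On a real cgroup filesystem the write to cgroup.procs is what moves
    the process; on a plain directory it just creates a regular file.
    """
    path = container_cgroup(root, container_id)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")
    return path
```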





[jira] [Created] (MESOS-8060) Introduce first class 'profile' for disk resources.

2017-10-06 Thread Jie Yu (JIRA)
Jie Yu created MESOS-8060:
-

 Summary: Introduce first class 'profile' for disk resources.
 Key: MESOS-8060
 URL: https://issues.apache.org/jira/browse/MESOS-8060
 Project: Mesos
  Issue Type: Task
Reporter: Jie Yu
Assignee: Jie Yu


This is similar to storage classes. Instead of adding a bunch of storage 
backend specific parameters (e.g., rotational, type, speed) into the disk 
resources and asking the frameworks to make scheduling decisions based on 
those vendor specific parameters, we propose to use a level of indirection here.

The operator will set up mappings from a profile name to a set of vendor 
specific disk parameters. The framework will do disk selection based on profile 
names.

The storage resource provider will provide a hook allowing operators to 
customize the profile name assignment for disk resources.
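
The indirection could look roughly like the sketch below. Everything in it is hypothetical: the profile names, the parameter keys and both function names are invented for illustration, and the real hook interface of the storage resource provider is not defined in this ticket.

```python
# Operator-defined mapping from profile name to vendor-specific parameters.
# Names and keys are made up for this example.
PROFILES = {
    "fast": {"rotational": False, "type": "ssd"},
    "bulk": {"rotational": True, "type": "hdd"},
}


def assign_profile(disk_params, profiles):
    """Operator-side hook: name of the first profile whose parameters all
    match the disk, or None if no profile applies."""
    for name, wanted in profiles.items():
        if all(disk_params.get(key) == value for key, value in wanted.items()):
            return name
    return None


def select_disks(disks, profile):
    """Framework side: choose disks by profile name only, never by the
    vendor-specific parameters behind it."""
    return [disk for disk in disks if disk["profile"] == profile]
```

The framework code only ever sees the profile string, so the operator can change what "fast" means without touching any scheduler.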





[jira] [Updated] (MESOS-7944) Implement jemalloc support for Mesos

2017-10-06 Thread Benno Evers (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benno Evers updated MESOS-7944:
---
Sprint: Mesosphere Sprint 63, Mesosphere Sprint 65  (was: Mesosphere Sprint 
63)

> Implement jemalloc support for Mesos
> 
>
> Key: MESOS-7944
> URL: https://issues.apache.org/jira/browse/MESOS-7944
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benno Evers
>Assignee: Benno Evers
>
> After investigation in MESOS-7876 and discussion on the mailing list, this 
> task is for tracking progress on adding out-of-the-box memory profiling 
> support using jemalloc to Mesos.





[jira] [Updated] (MESOS-7941) Send TASK_STARTING status from built-in executors

2017-10-06 Thread Benno Evers (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benno Evers updated MESOS-7941:
---
Sprint: Mesosphere Sprint 65

> Send TASK_STARTING status from built-in executors
> -
>
> Key: MESOS-7941
> URL: https://issues.apache.org/jira/browse/MESOS-7941
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benno Evers
>Assignee: Benno Evers
>
> All executors have the option to send out a TASK_STARTING status update to 
> signal to the scheduler that they received the command to launch the task.
> It would be good if our built-in executors did this, for reasons laid 
> out in 
> https://mail-archives.apache.org/mod_mbox/mesos-dev/201708.mbox/%3CCA%2B9TLTzkEVM0CKvY%2B%3D0%3DwjrN6hYFAt0401Y7b8tysDWx1WZzdw%40mail.gmail.com%3E
> This will also fix MESOS-6790.





[jira] [Commented] (MESOS-5309) PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP failed.

2017-10-06 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194995#comment-16194995
 ] 

James Peach commented on MESOS-5309:


On my Fedora 26 system, this reliably fails with:
{noformat}
+ mount --make-rslave /run/netns
+ test -f /proc/sys/net/ipv6/conf/all/disable_ipv6
+ echo 1
+ ip link set lo address 00:3e:e1:c8:84:d1 mtu 1500 up
+ ethtool -K enp12s0 rx off
+ ip link set enp12s0 address 00:3e:e1:c8:84:d1 up
+ ip addr add 17.228.224.108/24 dev enp12s0
+ ip route add default via 17.228.224.1
+ echo 30016 30031
+ echo 1
+ echo 1
+ echo 1
+ '[' -f /proc/sys/net/ipv4/tcp_keepalive_time ']'
+ echo 7200
+ '[' -f /proc/sys/net/core/rmem_max ']'
+ '[' -f /proc/sys/net/ipv4/tcp_keepalive_intvl ']'
+ echo 75
+ '[' -f /proc/sys/net/core/somaxconn ']'
+ echo 128
+ '[' -f /proc/sys/net/core/wmem_max ']'
+ '[' -f /proc/sys/net/core/netdev_max_backlog ']'
+ '[' -f /proc/sys/net/ipv4/tcp_keepalive_probes ']'
+ echo 9
+ '[' -f /proc/sys/net/ipv4/tcp_max_syn_backlog ']'
+ echo 2048
+ '[' -f /proc/sys/net/ipv4/neigh/default/gc_thresh2 ']'
+ '[' -f /proc/sys/net/ipv4/neigh/default/gc_thresh3 ']'
+ '[' -f /proc/sys/net/ipv4/tcp_wmem ']'
+ '[' -f /proc/sys/net/ipv4/neigh/default/gc_thresh1 ']'
+ '[' -f /proc/sys/net/ipv4/tcp_synack_retries ']'
+ echo 5
+ '[' -f /proc/sys/net/ipv4/tcp_rmem ']'
+ '[' -f /proc/sys/net/ipv4/tcp_retries2 ']'
+ echo 15
+ tc qdisc add dev lo ingress
+ tc qdisc add dev enp12s0 ingress
+ tc filter add dev lo parent :0 protocol ip prio 770 u32 flowid :0 
match ip dst 17.228.224.108 action mirred egress redirect dev enp12s0
+ tc filter add dev lo parent :0 protocol ip prio 770 u32 flowid :0 
match ip dst 127.0.0.1 action mirred egress redirect dev enp12s0
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 30016 fff0
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 30016 fff0 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31000 fff8
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31000 fff8 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31008 ffe0
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31008 ffe0 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31040 ffc0
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31040 ffc0 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31104 ff80
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31104 ff80 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31232 ff00
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31232 ff00 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31488 fff8
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31488 fff8 action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 769 u32 flowid :0 
match ip dport 31496 fffc
+ tc filter add dev enp12s0 parent :0 protocol ip prio 770 u32 flowid 
:0 match ip dst 127.0.0.1 match ip dport 31496 fffc action mirred egress 
redirect dev lo
+ tc filter add dev lo parent :0 protocol ip prio 514 u32 flowid :0 
match ip protocol 1 0xff match ip dst 17.228.224.108
+ tc filter add dev lo parent :0 protocol ip prio 514 u32 flowid :0 
match ip protocol 1 0xff match ip dst 127.0.0.1
+ tc filter show dev enp12s0 parent :0
filter protocol ip pref 770 u32
filter protocol ip pref 770 u32 fh 800: ht divisor 1
filter protocol ip pref 770 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 
:
  match 7f01/ at 16
  match 7540/fff0 at 20
action order 1: mirred (Egress Redirect to device lo) stolen
index 3 ref 1 bind 1

filter protocol ip pref 770 u32 fh 800::801 order 2049 key ht 800 bkt 0 flowid 
:
  match 7f01/ at 16
  match 7918/fff8 at 20
action order 1: mirred (Egress Redirect to device lo) stolen
index 4 ref 1 bind 1

filter protocol ip pref 770 u32 fh 800::802 order 2050 key ht 800 bkt 0 flowid 
:
  match 7f01/ at 16
  match 

[jira] [Updated] (MESOS-8058) Agent and master can race when updating agent state

2017-10-06 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-8058:
---
Description: 
In {{2af9a5b07dc80151154264e974d03f56a1c25838}} we introduce the use of 
{{UpdateSlaveMessage}} for the agent to inform the master about its current 
total resources. Currently we trigger this message only on agent registration 
and reregistration.

This can race with operations applied in the master and communicated via 
{{CheckpointResourcesMessage}}.

Example:

1. Agent ({{cpus:4(\*)}}) registers.
2. Master is triggered to apply an operation to the agent's resources, e.g., a 
reservation: {{cpus:4(\*) -> cpus:4(A)}}. The master applies the operation to 
its current view of the agent's resources and sends the agent a 
{{CheckpointResourcesMessage}} so the agent can persist the result.
3. The agent sends the master an {{UpdateSlaveMessage}}, e.g., {{cpus:4(\*)}} 
since it hasn't received the {{CheckpointResourcesMessage}} yet.
4. The master processes the {{UpdateSlaveMessage}} and updates its view of the 
agent's resources to be {{cpus:4(\*)}}.
5. The agent processes the {{CheckpointResourcesMessage}} and updates its view 
of its resources to be {{cpus:4(A)}}.
6. The agent and the master have an inconsistent view of the agent's resources.

  was:
In {{2af9a5b07dc80151154264e974d03f56a1c25838}} we introduce the use of 
{{UpdateSlaveMessage}} for the agent to inform the master about its current 
total resources. Currently we trigger this message only on agent registration 
and reregistration.

This can race with operations applied in the master and communicated via 
{{CheckpointResourcesMessage}}.

Example:

1. Agent ({{cpus:4(\*)}} registers.
2. Master is triggered to apply an operation to the agent's resources, e.g., a 
reservation: {{cpus:4(\*) -> cpus:4(A)}}. The master applies the operation to 
its current view of the agent's resources and sends the agent a 
{{CheckpointResourcesMessage}} so the agent can persist the result.
3. The agent send the master an {{UpdateSlaveMessage}}, e.g., {{cpus:4(\*)}} 
since it hasn't received the {{CheckpointResourcesMessage}} yet.
4. The master processes the {{UpdateSlaveMessage}} and updates its view of the 
agent's resources to be {{cpus:4(\*)}}.
5. The agent processes the {{CheckpointResourcesMessage}} and updates its view 
of its resources to be {{cpus:4(A)}}.
6. The agent and the master have an inconsistent view of the agent's resources.


> Agent and master can race when updating agent state
> ---
>
> Key: MESOS-8058
> URL: https://issues.apache.org/jira/browse/MESOS-8058
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 1.5.0
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Critical
>  Labels: mesosphere
>
> In {{2af9a5b07dc80151154264e974d03f56a1c25838}} we introduce the use of 
> {{UpdateSlaveMessage}} for the agent to inform the master about its current 
> total resources. Currently we trigger this message only on agent registration 
> and reregistration.
> This can race with operations applied in the master and communicated via 
> {{CheckpointResourcesMessage}}.
> Example:
> 1. Agent ({{cpus:4(\*)}}) registers.
> 2. Master is triggered to apply an operation to the agent's resources, e.g., 
> a reservation: {{cpus:4(\*) -> cpus:4(A)}}. The master applies the operation 
> to its current view of the agent's resources and sends the agent a 
> {{CheckpointResourcesMessage}} so the agent can persist the result.
> 3. The agent sends the master an {{UpdateSlaveMessage}}, e.g., {{cpus:4(\*)}} 
> since it hasn't received the {{CheckpointResourcesMessage}} yet.
> 4. The master processes the {{UpdateSlaveMessage}} and updates its view of 
> the agent's resources to be {{cpus:4(\*)}}.
> 5. The agent processes the {{CheckpointResourcesMessage}} and updates its 
> view of its resources to be {{cpus:4(A)}}.
> 6. The agent and the master have an inconsistent view of the agent's 
> resources.





[jira] [Created] (MESOS-8059) Support for multiple authentication schemes via HTTP.

2017-10-06 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8059:
-

 Summary: Support for multiple authentication schemes via HTTP. 
 Key: MESOS-8059
 URL: https://issues.apache.org/jira/browse/MESOS-8059
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Reporter: Till Toenshoff


As per [RFC7230|https://tools.ietf.org/html/rfc7230#section-3.2.2], HTTP 
authentication supports using multiple schemes in a single 
{{Authorization}} header. Our current implementations do not seem to support 
this; in particular, the libprocess basic authenticator assumes a single scheme.
The above RFC also says explicitly that we must never have multiple 
{{Authorization}} headers in the same request but must combine them.
[RFC2617|http://www.ietf.org/rfc/rfc2617.txt] then has additional information 
on how to react properly to multiple authentication schemes (also via proxy 
auth).
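
For reference, the generic combination rule from RFC 7230 section 3.2.2 (append each subsequent field value to the first, in order, separated by a comma) can be sketched as below. This is not libprocess code; the function name and the list-of-pairs representation of headers are assumptions. Whether combined {{Authorization}} values are acceptable to a given authenticator is exactly the open question of this ticket, so the sketch only shows the mechanical step.

```python
def combine_header_fields(fields, name):
    """Combine all header fields called `name` into one value.

    `fields` is a list of (field-name, field-value) pairs in the order
    received. Field names are compared case-insensitively. Returns None
    if the header is absent.
    """
    values = [value.strip() for field, value in fields
              if field.lower() == name.lower()]
    if not values:
        return None
    return ", ".join(values)
```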





[jira] [Created] (MESOS-8058) Agent and master can race when updating agent state

2017-10-06 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-8058:
---

 Summary: Agent and master can race when updating agent state
 Key: MESOS-8058
 URL: https://issues.apache.org/jira/browse/MESOS-8058
 Project: Mesos
  Issue Type: Bug
  Components: agent
Affects Versions: 1.5.0
Reporter: Benjamin Bannier
Assignee: Benjamin Bannier
Priority: Critical


In {{2af9a5b07dc80151154264e974d03f56a1c25838}} we introduce the use of 
{{UpdateSlaveMessage}} for the agent to inform the master about its current 
total resources. Currently we trigger this message only on agent registration 
and reregistration.

This can race with operations applied in the master and communicated via 
{{CheckpointResourcesMessage}}.

Example:

1. Agent ({{cpus:4(\*)}}) registers.
2. Master is triggered to apply an operation to the agent's resources, e.g., a 
reservation: {{cpus:4(\*) -> cpus:4(A)}}. The master applies the operation to 
its current view of the agent's resources and sends the agent a 
{{CheckpointResourcesMessage}} so the agent can persist the result.
3. The agent sends the master an {{UpdateSlaveMessage}}, e.g., {{cpus:4(\*)}} 
since it hasn't received the {{CheckpointResourcesMessage}} yet.
4. The master processes the {{UpdateSlaveMessage}} and updates its view of the 
agent's resources to be {{cpus:4(\*)}}.
5. The agent processes the {{CheckpointResourcesMessage}} and updates its view 
of its resources to be {{cpus:4(A)}}.
6. The agent and the master have an inconsistent view of the agent's resources.
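
The interleaving in the example can be replayed with a toy model. This is a sketch under simplifying assumptions: resources are plain strings, each message is a single in-flight slot, and the step names are invented for the illustration; none of it is Mesos code.

```python
def simulate(order):
    """Replay an ordering of events; return (master_view, agent_view)."""
    master_view = "cpus:4(*)"   # state right after agent registration
    agent_view = "cpus:4(*)"
    in_flight_update = None      # payload of UpdateSlaveMessage
    in_flight_checkpoint = None  # payload of CheckpointResourcesMessage

    for step in order:
        if step == "master_applies_reservation":
            master_view = "cpus:4(A)"
            in_flight_checkpoint = master_view
        elif step == "agent_sends_update":
            in_flight_update = agent_view
        elif step == "master_handles_update":
            if in_flight_update is None:
                raise ValueError("no update has been sent yet")
            master_view = in_flight_update
        elif step == "agent_handles_checkpoint":
            if in_flight_checkpoint is None:
                raise ValueError("no checkpoint has been sent yet")
            agent_view = in_flight_checkpoint
        else:
            raise ValueError(f"unknown step: {step}")
    return master_view, agent_view


# Steps 2 to 5 of the example above, in the racy order.
RACY = [
    "master_applies_reservation",
    "agent_sends_update",
    "master_handles_update",
    "agent_handles_checkpoint",
]

# Same events, but the update is sent and handled before the operation.
BENIGN = [
    "agent_sends_update",
    "master_handles_update",
    "master_applies_reservation",
    "agent_handles_checkpoint",
]
```

With {{RACY}} the two views end up different, which is step 6 of the example; with {{BENIGN}} they agree.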





[jira] [Commented] (MESOS-6918) Prometheus exporter endpoints for metrics

2017-10-06 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194713#comment-16194713
 ] 

James Peach commented on MESOS-6918:


{quote}
...all metric types (potentially future ones as well) ... But are we sure about 
this is the right/only criterion? (The examples cited in the design doc don't 
consistently define this and none defines it as "semantics") Could there be 
other dimensions / features to classify metrics?
{quote}

Obviously there could be a large number of ways to classify how metrics behave. 
However, two are sufficient for all the metrics we have today. I am not trying to 
boil the ocean here and invent the high temple of metrics systems; I'm just 
trying to improve the one we already have. I think you are willfully reading 
too much into this.

{quote}
We already have metric types that are called Counter and Gauge
{quote}

As I explained in person (and maybe in the design doc?), the class type is not 
the preferred implementation because there are more class types than semantic 
types. It's perfectly reasonable to use {{Counter}} to publish a metric that 
has {{GAUGE}} semantics, and obviously we already use {{GAUGE}} in both the 
{{Counter}} and {{Timer}} classes. It's pretty clear that we should separate the 
implementation classes from the logical model.

{quote}
 keep the MetricsProcess logic generic.
{quote}

{{MetricsProcess}} is not generic. It emits a specific format. I'm fine with 
shuffling code around, but there's so little it isn't worth a separate 
abstraction IMHO.

{quote}
I think we can keep the meaning the existing field Timer.value() (the last 
sampled value). 
{quote}

As I explained in the doc, this value is not helpful at all. I don't think 
there's any point in keeping it.

{quote}
 provide Prometheus its required info.
{quote}

Making {{Timer}} a {{COUNTER}} is not specific to Prometheus, it is necessary 
to make {{Timer}} values useful at all.
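
One way to read this argument: with a monotonic total duration and a monotonic sample count, a consumer can derive the mean duration over any scrape interval, which a last-sample value cannot provide. The sketch below assumes each scrape yields a {{(total_duration, sample_count)}} pair; the function name and the pair layout are made up for the example.

```python
def mean_duration(prev, curr):
    """Mean sample duration between two scrapes.

    Each scrape is a (total_duration, sample_count) pair of monotonic
    values. Returns None if no samples were recorded in the interval.
    """
    d_total = curr[0] - prev[0]
    d_count = curr[1] - prev[1]
    if d_total < 0 or d_count < 0:
        raise ValueError("monotonic values went backwards (counter reset?)")
    if d_count == 0:
        return None
    return d_total / d_count
```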






[jira] [Commented] (MESOS-7997) ContentType/MasterAPITest.CreateAndDestroyVolumes is flaky.

2017-10-06 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194534#comment-16194534
 ] 

Alexander Rukletsov commented on MESOS-7997:


I was able to reproduce an instance of this issue locally with some extra logs 
added. As we can see below, sometimes an {{updateAvailable}} call can sneak 
in between and lead to an offer without the persistent volume, even though we 
know the volume has been created. Compare the offer from a "bad" run:
{noformat}
[{"allocation_info":{"role":"role1"},"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"cpus","scalar":{"value":8.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"mem","scalar":{"value":15360.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
Allocated: disk(allocated: role1)(reservations: [(STATIC,role1)])[id1:path1]:64
{noformat}
and from a "good" run:
{noformat}
[{"allocation_info":{"role":"role1"},"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":960.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"cpus","scalar":{"value":8.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"mem","scalar":{"value":15360.0},"type":"SCALAR"},{"allocation_info":{"role":"role1"},"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"},{"allocation_info":{"role":"role1"},"disk":{"persistence":{"id":"id1","principal":"test-principal"},"volume":{"container_path":"path1","mode":"RW"}},"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":64.0},"type":"SCALAR"}]
Allocated: disk(allocated: role1)(reservations: [(STATIC,role1)])[id1:path1]:64
{noformat}
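
To spot the difference between the two offers mechanically, a small helper such as the one below can list the persistence ids present in an offer's resource list. It is a sketch that assumes the JSON layout shown in the two offers above (a resource named "disk" carrying a nested "disk.persistence" object); the function name is made up.

```python
import json


def persistent_volumes(offer_json):
    """Return the persistence ids of all persistent volumes in an offer.

    `offer_json` is the JSON array of resources as logged above.
    """
    resources = json.loads(offer_json)
    return [
        resource["disk"]["persistence"]["id"]
        for resource in resources
        if resource.get("name") == "disk"
        and "persistence" in resource.get("disk", {})
    ]
```

Applied to the logged offers, an empty result corresponds to the "bad" run and a result containing {{id1}} to the "good" one.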

A closer look at a "bad" run reveals a race. An agent [sends an 
update|https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1260-L1265]
 with its resources after it has registered with the master, and that update 
arrives on the master _after_ a persistent volume has been created. We can fix 
the test by waiting for the {{UpdateSlaveMessage}} to arrive before requesting a 
volume, but I think we should fix the race in the agent-master communication. 
The change was likely introduced in 
[2af9a5b07dc80151154264e974d03f56a1c25838|https://github.com/apache/mesos/commit/2af9a5b07dc80151154264e974d03f56a1c25838]
{noformat}
I1005 17:30:05.343058 2138112 hierarchical.cpp:593] Added agent 
c13b88fc-9b34-4d5e-9641-e4faec3190f1-S0 (localhost) with disk(reservations: 
[(STATIC,role1)]):1024; cpus:8; mem:15360; ports:[31000-32000] (allocated: {})
I1005 17:30:05.343085 528384 status_update_manager.cpp:184] Resuming sending 
status updates
I1005 17:30:05.342851 4284416 master.cpp:6032] Registered agent 
c13b88fc-9b34-4d5e-9641-e4faec3190f1-S0 at slave(459)@127.0.0.1:60330 
(localhost) with 
[{"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"name":"cpus","scalar":{"value":8.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15360.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I1005 17:30:05.343230 2138112 hierarchical.cpp:1945] No allocations performed
I1005 17:30:05.343252 2138112 hierarchical.cpp:1488] Performed allocation for 1 
agents in 94us
I1005 17:30:05.344882 3747840 process.cpp:3929] Handling HTTP event for process 
'master' with path: '/master/api/v1'
I1005 17:30:05.345293 2674688 slave.cpp:1213] Checkpointing SlaveInfo to 
'/var/folders/h3/8j18s1cx2bn78ms99d3lz4jhgq/T/ContentType_MasterAPITest_CreateAndDestroyVolumes_1_RWSUpI/meta/slaves/c13b88fc-9b34-4d5e-9641-e4faec3190f1-S0/slave.info'
I1005 17:30:05.345422 2138112 http.cpp:1185] HTTP POST for /master/api/v1 from 
127.0.0.1:61576
I1005 17:30:05.345844 2138112 http.cpp:673] Processing call CREATE_VOLUMES
I1005 17:30:05.346055 2138112 master.cpp:3758] Authorizing principal 
'test-principal' to create volumes 
'[{"disk":{"persistence":{"id":"id1","principal":"test-principal"},"volume":{"container_path":"path1","mode":"RW"}},"name":"disk","reservations":[{"role":"role1","type":"STATIC"}],"scalar":{"value":64.0},"type":"SCALAR"}]'
I1005 17:30:05.347273 4284416 master.cpp:9335] Sending updated checkpointed 
resources disk(reservations: [(STATIC,role1)])[id1:path1]:64 to agent 
c13b88fc-9b34-4d5e-9641-e4faec3190f1-S0 at slave(459)@127.0.0.1:60330 
(localhost)
W1005 17:30:05.347931 2025730048 sched.cpp:1711] 
**
Scheduler driver bound to loopback interface! Cannot communicate with remote 
master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a 
routable IP address.
**
W1005 17:30:05.347995 2025730048 process.cpp:3194] Attempted to spawn already 

[jira] [Commented] (MESOS-7957) The REGISTER_FRAMEWORK_WITH_ROLE does not use in source code

2017-10-06 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194402#comment-16194402
 ] 

Alexander Rojas commented on MESOS-7957:


{{REGISTER_FRAMEWORK_WITH_ROLE}} as well as all other actions using the 
{{_WITH_*}} suffixes have been deprecated since November 2016. That means that 
our six-month deprecation cycle has finished.

I think we can move forward with this, not only for this action in 
particular but for all the deprecated ones, so we probably need to change 
this issue's title and description. If necessary, I could shepherd this.

> The REGISTER_FRAMEWORK_WITH_ROLE does not use in source code
> 
>
> Key: MESOS-7957
> URL: https://issues.apache.org/jira/browse/MESOS-7957
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: jackyoh
>Priority: Trivial
>
> Mesos test code has the REGISTER_FRAMEWORK_WITH_ROLE action in 
> src/tests/authorization_tests.cpp, but the source code does not
> use the REGISTER_FRAMEWORK_WITH_ROLE action.
> Can I remove the REGISTER_FRAMEWORK_WITH_ROLE action in 
> src/tests/authorization_tests.cpp?





[jira] [Comment Edited] (MESOS-7957) The REGISTER_FRAMEWORK_WITH_ROLE does not use in source code

2017-10-06 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194402#comment-16194402
 ] 

Alexander Rojas edited comment on MESOS-7957 at 10/6/17 10:17 AM:
--

{{REGISTER_FRAMEWORK_WITH_ROLE}} as well as all other actions using the 
{{\_WITH\_*}} suffixes have been deprecated since November 2016. That means 
that our six-month deprecation cycle has finished.

I think we can move forward with this, not only for this action in 
particular but for all the deprecated ones, so we probably need to change 
this issue's title and description. If necessary, I could shepherd this.


was (Author: arojas):
{{REGISTER_FRAMEWORK_WITH_ROLE}} as well as all other actions using the 
{{_WITH_*}} suffixes are deprecated since November 2016. That means that our 
six months deprecation cycle has finished.

I think we can move forward with this, but not only with the for the action in 
particular, but with all the deprecated ones, so probably we need to change 
this issue title and description. If necessary, I could shepherd this.






[jira] [Created] (MESOS-8057) Apply security patches to AngularJS and JQuery in the Mesos UI

2017-10-06 Thread Alexander Rojas (JIRA)
Alexander Rojas created MESOS-8057:
--

 Summary: Apply security patches to AngularJS and JQuery in the 
Mesos UI
 Key: MESOS-8057
 URL: https://issues.apache.org/jira/browse/MESOS-8057
 Project: Mesos
  Issue Type: Bug
  Components: webui
Affects Versions: 1.4.0
Reporter: Alexander Rojas
Priority: Blocker


Running a security tool returns:

{noformat}
Evidence 
Vulnerable libraries were found: 

https://admin.kpn-dsh.com/mesos/static/js/angular-1.2.3.min.js 
https://admin.kpn-dsh.com/mesos/static/js/angular-route-1.2.3.min.js  
https://admin.kpn-dsh.com/mesos/static/js/jquery-1.7.1.min.js 

More information about the issues can be found at: - 
https://github.com/angular/angular.js/blob/master/CHANGELOG.md - 
http://bugs.jquery.com/ticket/11290 
{noformat}


