[jira] [Commented] (MESOS-6568) JSON serialization should not omit empty arrays in HTTP APIs

2020-04-22 Thread Benjamin Mahler (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089867#comment-17089867
 ] 

Benjamin Mahler commented on MESOS-6568:


The code is ready to ship but has not been committed yet (as you can see in the 
linked reviews). A user reported that it broke some of their tests, so I've been 
giving them time to fix those tests before I land this.

> JSON serialization should not omit empty arrays in HTTP APIs
> 
>
> Key: MESOS-6568
> URL: https://issues.apache.org/jira/browse/MESOS-6568
> Project: Mesos
>  Issue Type: Improvement
>  Components: HTTP API
>Reporter: Neil Conway
>Assignee: Benjamin Mahler
>Priority: Major
>  Labels: mesosphere
>
> When using the JSON content type with the HTTP APIs, an empty {{repeated}} 
> protobuf field is omitted entirely from the JSON serialization of the message. 
> For example, this is a response to the {{GetTasks}} call:
> {noformat}
> {
>   "get_tasks": {
>     "tasks": [{...}]
>   },
>   "type": "GET_TASKS"
> }
> {noformat}
> I think it would be better to include empty arrays for the other fields of 
> the message ({{pending_tasks}}, {{completed_tasks}}, etc.). Advantages:
> # Consistency with the old HTTP endpoints, e.g., /state
> # Semantically, an empty array is more accurate. The master's response should 
> be interpreted as saying it doesn't know about any pending/completed tasks; 
> that is more accurately conveyed by explicitly including an empty array, not 
> by omitting the key entirely.
> *NOTE: The 
> [asV1Protobuf|https://github.com/apache/mesos/blob/d10a33acc426dda9e34db995f16450faf898bb3b/src/common/http.cpp#L172-L423]
>  copy needs to also be updated.*
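
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration with plain dictionaries, not the Mesos serializer: the function `to_json_dict` and its `emit_empty_repeated` flag are hypothetical names, and the field list simply reuses the field names mentioned in the description.

```python
import json

# Repeated fields of the example response, using the names from the issue
# description. The list is illustrative, not the full protobuf schema.
REPEATED_FIELDS = ("pending_tasks", "tasks", "completed_tasks")


def to_json_dict(message, emit_empty_repeated):
    """Build the JSON object for `message` (a dict of field -> list).

    With emit_empty_repeated=False, empty repeated fields are dropped,
    which is the behaviour the issue describes. With True, they are kept
    as empty arrays, which is the behaviour the issue proposes.
    """
    result = {}
    for field in REPEATED_FIELDS:
        values = message.get(field, [])
        if values or emit_empty_repeated:
            result[field] = list(values)
    return result


message = {"tasks": [{"name": "test"}]}

omitted = to_json_dict(message, emit_empty_repeated=False)
included = to_json_dict(message, emit_empty_repeated=True)

print(json.dumps(omitted, sort_keys=True))
print(json.dumps(included, sort_keys=True))
```

With the flag off, a client has to treat a missing key as "no items"; with it on, every repeated field is always present and can be iterated without a presence check.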



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (MESOS-10122) cmake+MSBuild is incapable of building all Mesos sources in parallel.

2020-04-22 Thread Andrei Sekretenko (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089857#comment-17089857
 ] 

Andrei Sekretenko commented on MESOS-10122:
---

A third option that doesn't have any of the outlined drawbacks: each directory of 
the tree can be built as an intermediate OBJECT library. 
Preliminary testing shows that this doesn't actually worsen parallelism 
compared to the flattened tree.

> cmake+MSBuild is incapable of building all Mesos sources in parallel.
> -
>
> Key: MESOS-10122
> URL: https://issues.apache.org/jira/browse/MESOS-10122
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Andrei Sekretenko
>Assignee: Andrei Sekretenko
>Priority: Major
>
> When a library (in cmake's sense) contains several sources with different 
> paths but the same filename (for example,  slave/validation.cpp and 
> resource_provider/validation.cpp), the build generated by CMake for MSVC does 
> not allow for building those files in parallel (presumably, because the .obj 
> files will be located in the same directory).
> This has been observed with both cmake 3.9 and 3.17, with the "Visual 
> Studio 15 2017 Win64" generator. 
> It seems to be a known behaviour - see 
> https://stackoverflow.com/questions/7033855/msvc10-mp-builds-not-multicore-across-folders-in-a-project.
> Two options for fixing this in a way that will work with these cmake/MSVC 
> configurations are:
>  - splitting the build into small static libraries (a library per directory)
>  - introducing an intermediate code-generation-like step optionally 
> flattening the directory structure (slave/validation.cpp -> 
> slave_validation.cpp)
> Both options have their drawbacks:
>  - The first will change the layout of the static build artifacts 
> (mesos.lib will be replaced with a ton of smaller libraries), which will pose 
> integration challenges and could potentially result in worse parallelism.
>  - The second will make it impossible to use #include without a path 
> (right now there are three or four such #include's in the whole 
> Mesos/libprocess code buildable on Windows) and will change the value of the 
> __FILE__ macro (as a consequence, in the example above, `validation.cpp` in 
> logs will be replaced either with `slave_validation.cpp` or with 
> `resource_provider_validation.cpp`)
> Note that the second approach will need to deal with potential collisions 
> when the source tree has filenames with underscores. If, for example, we had 
> both slave/validation.cpp and slave_validation.cpp, then either some 
> additional escaping would be needed or, alternatively, such a layout could 
> simply be forbidden (and made to fail the build).
> Preliminary testing shows that on an 8-core AWS instance, flattening the 
> source trees of libprocess, mesos-protobufs and libmesos speeds up a clean 
> build from ~1 hour to ~30 minutes.
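
The flattening step and the underscore collision described above can be sketched as follows. This is a minimal illustration of the naming rule only, assuming the simple "replace the path separator with an underscore" mapping from the description; `flatten` and `find_collisions` are hypothetical helpers, not part of the Mesos build.

```python
from collections import defaultdict


def flatten(path):
    """Map a relative source path to a flat filename.

    Example mapping from the issue: slave/validation.cpp -> slave_validation.cpp
    """
    return path.replace("/", "_")


def find_collisions(paths):
    """Return {flat_name: [original paths]} for flat names hit more than once."""
    by_flat_name = defaultdict(list)
    for path in paths:
        by_flat_name[flatten(path)].append(path)
    return {
        name: sorted(sources)
        for name, sources in by_flat_name.items()
        if len(sources) > 1
    }


sources = [
    "slave/validation.cpp",
    "resource_provider/validation.cpp",
    "slave_validation.cpp",  # hypothetical file that would collide
]

collisions = find_collisions(sources)
print(collisions)
```

A build step built on this idea could either fail when `collisions` is non-empty (the "forbid the layout" choice) or switch to an escaping scheme for the separator.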





[jira] [Commented] (MESOS-8766) Implement configuring loglevel for health checks in built-in executors.

2020-04-22 Thread Brandon Hudgeons (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089765#comment-17089765
 ] 

Brandon Hudgeons commented on MESOS-8766:
-

On the version in master as of this comment, there are at least two places 
where this should be configurable: line 985 and lines 1154-1155. These fill 
logs with successful health checks. (But consider any call to `LOG(INFO)`.)

> Implement configuring loglevel for health checks in built-in executors.
> ---
>
> Key: MESOS-8766
> URL: https://issues.apache.org/jira/browse/MESOS-8766
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Affects Versions: 1.5.0
>Reporter: Bart Jol
>Priority: Minor
>  Labels: executor, health-check
>
> The loglevel of the health check in src/checks/checker_process.cpp (line 971) 
> is hardcoded to INFO; it would be nice if this were configurable.
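
The request amounts to taking the level from a setting instead of a constant. Mesos logs through its C++ logging macros, so the following Python sketch only illustrates the idea with the standard `logging` module; `make_health_check_logger`, `report_success`, and the logger name are hypothetical.

```python
import logging


def make_health_check_logger(level_name="INFO"):
    """Return a logger whose threshold comes from a setting, not a constant."""
    logger = logging.getLogger("health_check")
    logger.setLevel(getattr(logging, level_name.upper()))
    return logger


def report_success(logger, success_level=logging.INFO):
    # The level of the "check passed" message is a parameter, so an operator
    # could demote it (for example to DEBUG) to keep the logs quiet.
    logger.log(success_level, "health check passed")


logging.basicConfig()
logger = make_health_check_logger("WARNING")
report_success(logger)  # suppressed: INFO is below the WARNING threshold
```

Either knob would address the noise: raising the logger threshold hides all routine messages, while lowering `success_level` hides only the successful-check lines.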





[jira] [Commented] (MESOS-6568) JSON serialization should not omit empty arrays in HTTP APIs

2020-04-22 Thread Dominik Dary (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089641#comment-17089641
 ] 

Dominik Dary commented on MESOS-6568:
-

[~bmahler] this is done, right?

> JSON serialization should not omit empty arrays in HTTP APIs
> 
>
> Key: MESOS-6568
> URL: https://issues.apache.org/jira/browse/MESOS-6568
> Project: Mesos
>  Issue Type: Improvement
>  Components: HTTP API
>Reporter: Neil Conway
>Assignee: Benjamin Mahler
>Priority: Major
>  Labels: mesosphere
>
> When using the JSON content type with the HTTP APIs, an empty {{repeated}} 
> protobuf field is omitted entirely from the JSON serialization of the message. 
> For example, this is a response to the {{GetTasks}} call:
> {noformat}
> {
>   "get_tasks": {
>     "tasks": [{...}]
>   },
>   "type": "GET_TASKS"
> }
> {noformat}
> I think it would be better to include empty arrays for the other fields of 
> the message ({{pending_tasks}}, {{completed_tasks}}, etc.). Advantages:
> # Consistency with the old HTTP endpoints, e.g., /state
> # Semantically, an empty array is more accurate. The master's response should 
> be interpreted as saying it doesn't know about any pending/completed tasks; 
> that is more accurately conveyed by explicitly including an empty array, not 
> by omitting the key entirely.
> *NOTE: The 
> [asV1Protobuf|https://github.com/apache/mesos/blob/d10a33acc426dda9e34db995f16450faf898bb3b/src/common/http.cpp#L172-L423]
>  copy needs to also be updated.*





[jira] [Comment Edited] (MESOS-10054) Update Docker containerizer to set Docker container’s resource limits and `oom_score_adj`

2020-04-22 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087845#comment-17087845
 ] 

Qian Zhang edited comment on MESOS-10054 at 4/22/20, 8:32 AM:
--

RR: [https://reviews.apache.org/r/72391/]


was (Author: qianzhang):
RR: 

[https://reviews.apache.org/r/72401/]

[https://reviews.apache.org/r/72391/]

> Update Docker containerizer to set Docker container’s resource limits and 
> `oom_score_adj`
> -
>
> Key: MESOS-10054
> URL: https://issues.apache.org/jira/browse/MESOS-10054
> Project: Mesos
>  Issue Type: Task
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Major
>
> This is to set resource limits for the executor, which will run as a Docker 
> container.





[jira] [Assigned] (MESOS-8877) Docker container's resources will be wrongly enlarged in cgroups after agent recovery

2020-04-22 Thread Qian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-8877:
-

Story Points: 3  (was: 5)
Assignee: Qian Zhang

RR: [https://reviews.apache.org/r/72401/]

> Docker container's resources will be wrongly enlarged in cgroups after agent 
> recovery
> -
>
> Key: MESOS-8877
> URL: https://issues.apache.org/jira/browse/MESOS-8877
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 1.6.1, 1.6.0, 1.5.1, 1.5.0, 1.4.2, 1.4.1, 1.4.0
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Critical
>  Labels: containerization
>
> Reproduce steps:
> 1. Run `mesos-execute --master=10.0.49.2:5050 
> --task=file:///home/qzhang/workspace/config/task_docker.json 
> --checkpoint=true` to launch a Docker container.
> {code:java}
> # cat task_docker.json 
> {
>   "name": "test",
>   "task_id": {"value" : "test"},
>   "agent_id": {"value" : ""},
>   "resources": [
>     {"name": "cpus", "type": "SCALAR", "scalar": {"value": 0.1}},
>     {"name": "mem", "type": "SCALAR", "scalar": {"value": 32}}
>   ],
>   "command": {
>     "value": "sleep 5"
>   },
>   "container": {
>     "type": "DOCKER",
>     "docker": {
>       "image": "alpine"
>     }
>   }
> }
> {code}
> 2. When the Docker container is running, we can see its resources in cgroups 
> are correctly set, so far so good.
> {code:java}
> # cat /sys/fs/cgroup/cpu,cpuacct/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106/cpu.cfs_quota_us
> 10000
> # cat /sys/fs/cgroup/memory/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106/memory.limit_in_bytes
> 33554432
> {code}
> 3. Restart the Mesos agent, and we will see that the resources of the Docker 
> container are wrongly enlarged.
> {code}
> I0503 02:06:17.268340 29512 docker.cpp:1855] Updated 'cpu.shares' to 204 at 
> /sys/fs/cgroup/cpu,cpuacct/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106
>  for container 1b21295b-2f49-4d08-84c7-43b9ae15ad88
> I0503 02:06:17.271390 29512 docker.cpp:1882] Updated 'cpu.cfs_period_us' to 
> 100ms and 'cpu.cfs_quota_us' to 20ms (cpus 0.2) for container 
> 1b21295b-2f49-4d08-84c7-43b9ae15ad88
> I0503 02:06:17.273082 29512 docker.cpp:1924] Updated 
> 'memory.soft_limit_in_bytes' to 64MB for container 
> 1b21295b-2f49-4d08-84c7-43b9ae15ad88
> I0503 02:06:17.275908 29512 docker.cpp:1950] Updated 'memory.limit_in_bytes' 
> to 64MB at 
> /sys/fs/cgroup/memory/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106
>  for container 1b21295b-2f49-4d08-84c7-43b9ae15ad88
> # cat /sys/fs/cgroup/cpu,cpuacct/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106/cpu.cfs_quota_us
> 20000
> # cat /sys/fs/cgroup/memory/docker/a711b3c7b0d91cd6d1c7d8daf45a90ff78d2fd66973e615faca55a717ec6b106/memory.limit_in_bytes
> 67108864
> {code}
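
The figures in the steps above follow from a simple conversion of the task's resources into cgroup values, which makes the doubling easy to check. The sketch below is arithmetic only, not the containerizer code: the 100ms period comes from the agent log, the 1024 shares per CPU is an assumption chosen because it reproduces the 204 shares in the log, and the helper names are hypothetical.

```python
CFS_PERIOD_US = 100_000      # 100ms period, as reported in the agent log
SHARES_PER_CPU = 1024        # assumption: matches 'cpu.shares' 204 for cpus 0.2
BYTES_PER_MB = 1024 * 1024


def cfs_quota_us(cpus):
    """CPU time allowed per period, in microseconds."""
    return round(cpus * CFS_PERIOD_US)


def cpu_shares(cpus):
    """Relative CPU weight, truncated to an integer."""
    return int(cpus * SHARES_PER_CPU)


def memory_limit_bytes(mem_mb):
    """Memory limit in bytes for a value given in MB."""
    return mem_mb * BYTES_PER_MB


# The task asked for cpus 0.1 and mem 32; after the agent restart the log
# reports cpus 0.2 and 64MB, i.e. twice the requested resources.
for label, cpus, mem_mb in [("requested", 0.1, 32), ("after recovery", 0.2, 64)]:
    print(label, cfs_quota_us(cpus), cpu_shares(cpus), memory_limit_bytes(mem_mb))
```

The memory values match the `memory.limit_in_bytes` readings before and after the restart, and the 0.2 CPU row matches the 20ms quota and 204 shares in the log.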





[jira] [Comment Edited] (MESOS-10117) Update the `usage()` method of containerizer to set resource limits in the `ResourceStatistics` protobuf message

2020-04-22 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088682#comment-17088682
 ] 

Qian Zhang edited comment on MESOS-10117 at 4/22/20, 8:15 AM:
--

RR:

[https://reviews.apache.org/r/72398/]

[https://reviews.apache.org/r/72399/]

[https://reviews.apache.org/r/72402/]


was (Author: qianzhang):
RR:

[https://reviews.apache.org/r/72398/]

[https://reviews.apache.org/r/72399/]

[https://reviews.apache.org/r/72400/]

[https://reviews.apache.org/r/72402/]

> Update the `usage()` method of containerizer to set resource limits in the 
> `ResourceStatistics` protobuf message
> 
>
> Key: MESOS-10117
> URL: https://issues.apache.org/jira/browse/MESOS-10117
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Major
>
> In the `ResourceStatistics` protobuf message, there are a couple of issues:
>  # There are already `cpu_limit` and `mem_limit_bytes` fields, but they 
> actually hold the CPU & memory requests when resource limits are specified for 
> a task.
>  # There is already a `mem_soft_limit_bytes` field, but it does not seem to be 
> set anywhere.
> So we need to update this protobuf message and also the related containerizer 
> code which sets the fields of this protobuf message.


