[jira] [Commented] (MESOS-10127) The sequences used in Docker volume isolator are never erased

2020-05-18 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110321#comment-17110321
 ] 

Qian Zhang commented on MESOS-10127:


It seems there is no proper place in the Docker volume isolator's code to erase 
the sequence.

If we erase the sequence right after the unmount operation is invoked (e.g., 
right after [this 
line|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L658]),
 and another container tries to use the same volume at the same time (so the 
volume needs to be mounted), then the unmount and mount operations could run 
simultaneously for the same volume, which is exactly what the sequence is 
meant to prevent.

If we erase the sequence after the unmount operation has completed (e.g., in 
[DockerVolumeIsolatorProcess::_cleanup()|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L670]),
 and another container tries to use the same volume after the unmount 
operation has completed but before the sequence is erased, then we could erase 
the sequence while the mount operation is still ongoing, which may cause the 
mount operation to be discarded.
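
To make the timing problem concrete, here is a minimal Python sketch of one 
possible approach: count the operations queued per volume and erase the entry 
only when that count drops to zero. This is an illustration, not the Mesos 
implementation or a proposed patch. Mesos uses a libprocess sequence, whereas 
this model uses asyncio, and the names KeyedSequences, run, tracked_keys and 
demo are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class KeyedSequences:
    """Runs operations for the same key one at a time and erases idle keys.

    Hypothetical model: an asyncio.Lock stands in for a per-volume sequence.
    """

    def __init__(self) -> None:
        self._locks: Dict[str, asyncio.Lock] = {}
        self._pending: Dict[str, int] = {}

    async def run(self, key: str, op: Callable[[], Awaitable[None]]) -> None:
        # Register interest before waiting, so a finishing operation can see
        # that someone else still needs this key's entry.
        lock = self._locks.setdefault(key, asyncio.Lock())
        self._pending[key] = self._pending.get(key, 0) + 1
        try:
            async with lock:
                await op()
        finally:
            self._pending[key] -= 1
            # Erase only when no operation is running or queued for the key.
            if self._pending[key] == 0:
                del self._pending[key]
                del self._locks[key]

    def tracked_keys(self) -> List[str]:
        return sorted(self._locks)


async def demo():
    seqs = KeyedSequences()
    log: List[str] = []

    async def op(name: str) -> None:
        log.append(f"{name}:start")
        await asyncio.sleep(0)  # yield so the other task gets a chance to run
        log.append(f"{name}:end")

    # An unmount and a mount for the same volume are requested together.
    await asyncio.gather(
        seqs.run("vol1", lambda: op("unmount")),
        seqs.run("vol1", lambda: op("mount")),
    )
    return log, seqs.tracked_keys()
```

In this model the entry for a volume survives while a mount is still queued 
behind an unmount, and disappears once both have finished.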

> The sequences used in Docker volume isolator are never erased
> -
>
> Key: MESOS-10127
> URL: https://issues.apache.org/jira/browse/MESOS-10127
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Qian Zhang
>Priority: Major
>
> In the Docker volume isolator, we use a 
> [sequence|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.hpp#L119:L122]
>  to make sure the mount and unmount operations for a single volume are issued 
> serially, but the sequence is never erased, which could be a memory leak.
> This issue has existed since the Mesos 1.0.0 release, when the Docker volume 
> isolator was introduced.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (MESOS-10091) CI builds on ubuntu 14.04 fail to create Java bindings

2020-05-18 Thread Benjamin Bannier (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-10091:


Assignee: (was: Benjamin Bannier)

> CI builds on ubuntu 14.04 fail to create Java bindings
> --
>
> Key: MESOS-10091
> URL: https://issues.apache.org/jira/browse/MESOS-10091
> Project: Mesos
>  Issue Type: Bug
>  Components: java api, reviewbot
>Reporter: Benjamin Bannier
>Priority: Major
>
> Builds with Java bindings enabled fail on ubuntu-14.04 (this includes, e.g., 
> reviewbot builds) with the following error:
> {noformat}
> 22:28:09 Building mesos-1.10.0.jar ...
> 22:28:09 /bin/sed -i.bak 's/mesos\.mesos_pb2/mesos_pb2/' 
> python/interface/src/mesos/v1/interface/scheduler_pb2.py && rm 
> python/interface/src/mesos/v1/interface/scheduler_pb2.py.bak
> 22:28:15 [ERROR] The build could not read 1 project -> [Help 1]
> 22:28:15 [ERROR]   
> 22:28:15 [ERROR]   The project org.apache.mesos:mesos:1.10.0 
> (/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mesos-ec2-ubuntu-14.04/mesos/build/src/java/mesos.pom)
>  has 1 error
> 22:28:15 [ERROR] Non-resolvable parent POM: Could not transfer artifact 
> org.apache:apache:pom:11 from/to central 
> (http://repo.maven.apache.org/maven2): Failed to transfer file: 
> http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom. 
> Return code is: 501 , ReasonPhrase:HTTPS Required. and 'parent.relativePath' 
> points at wrong local POM @ line 18, column 11 -> [Help 2]
> 22:28:15 [ERROR] 
> 22:28:15 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 22:28:15 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 22:28:15 [ERROR] 
> 22:28:15 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 22:28:15 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
> 22:28:15 [ERROR] [Help 2] 
> http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
> 22:28:15 make[1]: *** [java/target/mesos-1.10.0.jar] Error 1
> 22:28:15 make[1]: Leaving directory 
> `/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mesos-ec2-ubuntu-14.04/mesos/build/src'
> 22:28:15 make: *** [all-recursive] Error 1
> {noformat}
> The error seems to be caused by the Maven version in our ubuntu-14.04 CI 
> images not using HTTPS by default, [which seems to be required since 
> 2020-01-15|https://support.sonatype.com/hc/en-us/articles/360041287334]:
> {quote}
> Question
> As of January 15, 2020 I am receiving the following responses upon making 
> requests to The Central Repository:
> Requests to http://repo1.maven.org/maven2/ return a 501 HTTPS Required status 
> and a body:
> 501 HTTPS Required. 
> Use https://repo1.maven.org/maven2/
> More information at https://links.sonatype.com/central/501-https-required
> Requests to http://repo.maven.apache.org/maven2/ return a 501 HTTPS Required 
> status and a body:
> 501 HTTPS Required. 
> Use https://repo.maven.apache.org/maven2/
> More information at https://links.sonatype.com/central/501-https-required
> How do I satisfy this requirement so that I can regain access to Central?
> Answer
> Effective January 15, 2020, The Central Repository no longer supports 
> insecure communication over plain HTTP and requires that all requests to the 
> repository are encrypted over HTTPS.
> If you're receiving this error, then you need to replace all URL references 
> to Maven Central with their canonical HTTPS counterparts:
> Replace http://repo1.maven.org/maven2/ with https://repo1.maven.org/maven2/
> Replace http://repo.maven.apache.org/maven2/ with 
> https://repo.maven.apache.org/maven2/
> If for any reason your environment cannot support HTTPS, you have the option 
> of using our dedicated insecure endpoint at 
> http://insecure.repo1.maven.org/maven2/
> For further context around the move to HTTPS, please see 
> https://blog.sonatype.com/central-repository-moving-to-https.
> {quote}
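
The replacement described in the quoted answer can be sketched as a small 
Python helper that rewrites plain-HTTP references to the two Central hosts 
named above. The function name use_https_for_central is hypothetical, and the 
sketch only covers the textual substitution. Whether the Maven version in the 
CI image picks up the rewritten URL is a separate question.

```python
import re

# The two Maven Central hosts named in the quoted Sonatype notice.
_CENTRAL_HOSTS = ("repo1.maven.org", "repo.maven.apache.org")

# Match only plain-HTTP URLs whose host is exactly one of the hosts above.
_PATTERN = re.compile(
    r"http://(" + "|".join(re.escape(host) for host in _CENTRAL_HOSTS) + r")/"
)


def use_https_for_central(text: str) -> str:
    """Return `text` with http:// Maven Central URLs rewritten to https://."""
    return _PATTERN.sub(r"https://\1/", text)
```

For example, applying it to the contents of a POM or settings file leaves the 
dedicated http://insecure.repo1.maven.org/maven2/ endpoint untouched, because 
its host is not in the list.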





[jira] [Assigned] (MESOS-8464) Automate building of mesos-tidy docker image

2020-05-18 Thread Benjamin Bannier (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-8464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-8464:
---

Assignee: (was: Benjamin Bannier)

> Automate building of mesos-tidy docker image
> 
>
> Key: MESOS-8464
> URL: https://issues.apache.org/jira/browse/MESOS-8464
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benjamin Bannier
>Priority: Major
>
> The script {{support/mesos-tidy.sh}} relies on the docker image 
> {{mesos/mesos-tidy}}. This image is currently built manually from the files 
> in {{support/mesos-tidy}} and then uploaded to Docker Hub. The manual step 
> creates unnecessary friction when rolling out updates to the mesos-tidy setup; 
> while, e.g., every committer can update the setup, not all committers are able 
> to update the image.
> We should investigate how to automate creating this image whenever source 
> files under {{support/mesos-tidy/}} are updated. 





[jira] [Assigned] (MESOS-8400) Handle plugin crashes gracefully in SLRP recovery.

2020-05-18 Thread Benjamin Bannier (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-8400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-8400:
---

Assignee: (was: Benjamin Bannier)

> Handle plugin crashes gracefully in SLRP recovery.
> --
>
> Key: MESOS-8400
> URL: https://issues.apache.org/jira/browse/MESOS-8400
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Chun-Hung Hsiao
>Priority: Blocker
>  Labels: mesosphere, mesosphere-dss-post-ga, storage
>
> When a CSI plugin crashes, the container daemon in SLRP will reset its 
> corresponding {{csi::Client}} service future. However, if a CSI call races 
> with a plugin crash, the call may be issued before the service future is 
> reset, resulting in a failure for that CSI call. MESOS-9517 partly addresses 
> this for {{CreateVolume}} and {{DeleteVolume}} calls, but calls in the SLRP 
> recovery path, e.g., {{ListVolume}}, {{GetCapacity}}, {{Probe}}, could make 
> the SLRP unrecoverable.
> There are two main issues:
>  1. For {{Probe}}, we should investigate whether a few retry attempts are 
> needed. After those, we should recover from the failed attempts (e.g., kill 
> the plugin container) and make the container daemon relaunch the plugin 
> instead of failing the daemon.
>  2. For other calls in the recovery path, we should either retry the call or 
> make the local resource provider daemon able to restart the SLRP after it 
> fails.
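
The retry-then-relaunch idea in item 1 can be sketched in Python as follows. 
This is a sketch under stated assumptions, not SLRP code: probe_with_recovery, 
its parameters, and the use of ConnectionError to represent a failed CSI call 
are all hypothetical stand-ins.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def probe_with_recovery(
    probe: Callable[[], T],
    restart_plugin: Callable[[], None],
    max_attempts: int = 3,
    max_restarts: int = 1,
) -> T:
    """Retry `probe` a few times and relaunch the plugin if all attempts fail.

    ConnectionError is an assumed stand-in for a CSI call that raced with a
    plugin crash.
    """
    for restart in range(max_restarts + 1):
        for _ in range(max_attempts):
            try:
                return probe()
            except ConnectionError:
                continue  # try again, up to max_attempts per plugin instance
        if restart < max_restarts:
            # All attempts failed: kill and relaunch the plugin instead of
            # giving up on the whole daemon.
            restart_plugin()
    raise RuntimeError("plugin did not become ready after retries and restarts")
```

With the defaults, a plugin that only answers after being relaunched is probed 
three times, restarted once, and then probed successfully on the fourth call.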





[jira] [Commented] (MESOS-9630) Consider moving linter setup to pre-commit

2020-05-18 Thread Benjamin Bannier (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110274#comment-17110274
 ] 

Benjamin Bannier commented on MESOS-9630:
-

{noformat}
commit 83359534cb1b3303fcbae34af3fadd81b7c8cb85
Author: Benjamin Bannier bbann...@apache.org
Date:   Wed May 6 17:47:37 2020 +0200

Removed mesos-style transition script.
Review: https://reviews.apache.org/r/71300/
{noformat}

> Consider moving linter setup to pre-commit
> --
>
> Key: MESOS-9630
> URL: https://issues.apache.org/jira/browse/MESOS-9630
> Project: Mesos
>  Issue Type: Wish
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
> Fix For: 1.10.0
>
>
> Mesos currently uses a mix of hand-crafted git commit hooks and mesos-style 
> to perform linting. While this has served us well, our current approach also 
> has some drawbacks, e.g.,
>  * the linter setup is spread between hooks and {{support/mesos-style.py}}
>  * adding new linters can be cumbersome
>  * mesos-style.py creates a single virtualenv to install linters in, and 
> that virtualenv is tied to the source tree
>  * linter dependencies are only cached to an extent and it is easy to run 
> into a situation where one needs to update linter dependencies over the 
> network even though one has successfully linted a revision before
>  * {{support/mesos-style.py}} lacks a number of features, e.g., running over 
> only staged files, running linters in parallel for improved throughput, 
> running only specific linters or disabling certain linters, and the 
> parameterization of the linters is strongly coupled to implementation of the 
> style checker itself.
> The [pre-commit tool|https://pre-commit.com/] solves most of these issues, and 
> using it in Mesos would not only let us get rid of tooling that is hard 
> to maintain, but also unlock other features. It is licensed under an MIT 
> license. We should consider moving our linting setup over to pre-commit.





[jira] [Commented] (MESOS-5011) Support OCI image spec.

2020-05-18 Thread Stéphane Cottin (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-5011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110040#comment-17110040
 ] 

Stéphane Cottin commented on MESOS-5011:


FYI, Sonatype Nexus has supported the OCI registry format since version 3.23.0.

https://issues.sonatype.org/browse/NEXUS-21087

> Support OCI image spec.
> ---
>
> Key: MESOS-5011
> URL: https://issues.apache.org/jira/browse/MESOS-5011
> Project: Mesos
>  Issue Type: Epic
>Reporter: Guangya Liu
>Assignee: Qian Zhang
>Priority: Major
>
> The OCI image spec is approaching 1.0; we should add support for it to the 
> unified containerizer so that users can launch OCI images with it.





[jira] [Created] (MESOS-10130) Docker Manifest list support

2020-05-18 Thread Stéphane Cottin (Jira)
Stéphane Cottin created MESOS-10130:
---

 Summary: Docker Manifest list support
 Key: MESOS-10130
 URL: https://issues.apache.org/jira/browse/MESOS-10130
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Stéphane Cottin


Sonatype Nexus 3.22+, and probably other docker registry solutions, now serve 
manifest lists.

[https://issues.sonatype.org/browse/NEXUS-18546]

Apache Mesos does not yet support this part of the Image Manifest V2S2 spec.

https://docs.docker.com/registry/spec/manifest-v2-2/#manifest-list

This is not a critical issue, as Sonatype Nexus is not a dependency of Apache 
Mesos, but since we cannot use Nexus > 3.21.2, it indirectly causes security 
issues.

[https://support.sonatype.com/hc/en-us/articles/360046233714]

Apache Mesos should support the whole Image Manifest V2S2 specification.
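
For illustration, the sketch below shows in Python how a client might pick the 
platform-specific manifest out of a manifest list. The media type and the 
manifests / platform / digest fields follow the V2S2 manifest list format 
linked above. The function name select_manifest_digest is hypothetical, and 
this is not how Mesos parses registry responses.

```python
import json
from typing import Optional

# Media type that identifies a V2S2 manifest list.
MANIFEST_LIST_V2 = "application/vnd.docker.distribution.manifest.list.v2+json"


def select_manifest_digest(
    body: str, os: str = "linux", architecture: str = "amd64"
) -> Optional[str]:
    """Return the digest of the manifest matching the given platform.

    Returns None if `body` is not a manifest list or has no matching entry.
    """
    doc = json.loads(body)
    if doc.get("mediaType") != MANIFEST_LIST_V2:
        return None
    for entry in doc.get("manifests", []):
        platform = entry.get("platform", {})
        if (
            platform.get("os") == os
            and platform.get("architecture") == architecture
        ):
            return entry.get("digest")
    return None
```

A client would then fetch the image manifest identified by the returned digest 
and proceed as it does for a single-platform image.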





[jira] [Commented] (MESOS-7884) Support containerd on Mesos.

2020-05-18 Thread Gilbert Song (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109937#comment-17109937
 ] 

Gilbert Song commented on MESOS-7884:
-

cc [~qianzhang]

> Support containerd on Mesos.
> 
>
> Key: MESOS-7884
> URL: https://issues.apache.org/jira/browse/MESOS-7884
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Gilbert Song
>Priority: Major
>  Labels: containerd, containerizer
>
> containerd v1.0 is very close (v1.0.0 alpha 4 now) to the formal release. We 
> should consider supporting containerd on Mesos, either by refactoring the 
> Docker containerizer or by introducing a new containerd containerizer. Design 
> ideas and suggestions are definitely welcome.
> https://github.com/containerd/containerd


