[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-22 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021545#comment-17021545
 ] 

Eric Yang commented on YARN-9292:
-

From today's YARN Docker community meeting, we have decided to abandon this 
patch.  There is a possibility that the AM can fail over to a node which has a 
different latest tag than the previous node.  The frame of reference for the 
latest tag is relative to the node where the AM is running.  If there are 
inconsistencies in the cluster, this patch will not solve the consistency 
problem.  A newly spawned AM will use a different sha id mapped to the latest 
tag, which leads to inconsistent sha ids used by the same application.

The ideal design is to have the YARN client discover what the latest tag is 
referencing, then propagate that information to the rest of the job.  
Unfortunately, there is no connection between YARN and wherever the docker 
registry might be running.  Hence, it is not possible to implement this 
properly for the YARN and Docker integration.  The community settled on 
documenting this wrinkle and recommending avoiding the latest tag as a best 
practice.

For runc containers, it will be possible to use HDFS as the source of truth to 
look up the global hash designation for a runc container.  The YARN client can 
query HDFS for the latest tag, and it will be consistent on all nodes.  This 
will add some extra protocol interactions between the YARN client and the RM 
to solve this problem via the ideal design.
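A minimal sketch of that lookup, assuming a hypothetical cluster-wide index 
file in HDFS mapping image:tag to its hash (the index layout, file format, and 
function name here are illustrative, not from any patch):

```python
import json

def resolve_runc_tag(tag_index: dict, image: str, tag: str) -> str:
    """Resolve image:tag to a global hash via a cluster-wide index.

    tag_index is assumed to be JSON loaded from a single HDFS file (a
    hypothetical layout), so every node -- including a failed-over AM --
    resolves the same hash for the same tag.
    """
    key = f"{image}:{tag}"
    if key not in tag_index:
        raise KeyError(f"no hash recorded for {key}")
    return tag_index[key]

# Illustrative index content as it might be stored in HDFS:
index = json.loads('{"centos:latest": "sha256:aaa", "centos:7": "sha256:bbb"}')
print(resolve_runc_tag(index, "centos", "latest"))
```

Because every client reads the same file, the tag-to-hash mapping is the same 
regardless of which node the AM lands on.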

> Implement logic to keep docker image consistent in application that uses 
> :latest tag
> 
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch, 
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch, 
> YARN-9292.006.patch, YARN-9292.007.patch, YARN-9292.008.patch
>
>
> Docker images with the latest tag can run in a YARN cluster without any 
> validation in the node managers. If an image with the latest tag changes 
> during container launch, it might produce inconsistent results between 
> nodes. This surfaced toward the end of development for YARN-9184, which 
> keeps the docker image consistent within a job. One of the ideas to keep the 
> :latest tag consistent for a job is to use the docker image command to 
> figure out the image id and propagate that image id to the rest of the 
> container requests. There are some challenges to overcome:
>  # The latest tag may not exist on the node where the first container 
> starts. The first container will need to download the latest image and find 
> the image id. This can introduce lag time before other containers start.
>  # If the image id is used to start other containers, container-executor 
> may have problems checking whether the image comes from a trusted source. 
> Both the image name and id must be supplied through the .cmd file to 
> container-executor. However, an attacker could supply an incorrect image id 
> and defeat the container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker 
> image consistent within one application.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-16 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017364#comment-17017364
 ] 

Eric Yang commented on YARN-9292:
-

[~ebadger] {quote}I do have some questions on why we can't move the AM into a 
docker container though. What is it that is special about the AM that we need 
to run it directly on the host? What does it depend on the host for? We should 
be able to use the distributed cache to localize any libraries/jars that it 
needs. And as far as nscd/sssd, those can be bind-mounted into the container 
via configs. If they don't have nscd/sssd then they can bind-mount /etc/passwd. 
Since they would've been using the host anyway, this is no different.{quote}

YARN native service was a code merge from Apache Slider, and it was developed 
to run in a YARN container directly, like MapReduce tasks.  If the AM docker 
image is a mirror image of the host system, the AM can run in a docker 
container.  The AM code still depends on all Hadoop client libraries, Hadoop 
configuration, and Hadoop environment variables.

{quote}As far as the docker image itself, why does Hadoop need to provide an 
image? Everything needed can be provided via the distributed cache or 
bind-mounts, right? I don't see why we need a specialized image that is tied to 
Hadoop. You just need an image with Java and Bash.{quote}

From a 10,000-foot point of view, yes, the AM only requires Java and Bash.  If 
Hadoop provides the image, our users can deploy it without worrying about how 
to create a docker image that mirrors the host structure.  Without Hadoop 
supplying an image and an agreed-upon image format, it is up to the system 
admin's interpretation of where the Hadoop client configuration and client 
binaries are located.  He/she can run the job with ENTRYPOINT mode disabled 
and bind-mount the Hadoop configuration and binaries.  As I recall, this is 
the less secure approach to running the container, because the container is 
required to bind-mount a writable Hadoop log directory for the launcher script 
to write output.  This is a hassle with no container benefit, and this method 
still exposes the host-level environment and binaries to the container.  There 
are maybe five people on planet Earth who know how to wire this together, and 
they are unlikely to suggest this approach.




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-16 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017312#comment-17017312
 ] 

Eric Badger commented on YARN-9292:
---

bq. Very good question, and the answer is somewhat complicated. For the AM to 
run in a docker container, the AM must have identical Hadoop client bits 
(Java, Hadoop, etc) and credential mapping (nscd/sssd). Many of those pieces 
could not be moved cleanly into a Docker container in the first implementation 
of YARN native service (LLAP/Slider-like projects) because of resistance to 
building an agreeable docker image as part of the Hadoop project. The AM 
remains outside of the docker container for simplicity.

So I read your last comment and I think that everything pretty much makes sense 
if we can fix the issue of the AM not running in a Docker container. That way 
we can use YARN-9184 to pull the image and get the most up to date sha for the 
entire job to run with. And if an admin wants to do the image management 
themselves, then they don't enable YARN-9184 and are responsible for having 
the images they want on the cluster. At that point, any errors would be theirs 
to fix through their own automation.

I do have some questions on why we can't move the AM into a docker container 
though. What is it that is special about the AM that we need to run it directly 
on the host? What does it depend on the host for? We should be able to use the 
distributed cache to localize any libraries/jars that it needs. And as far as 
nscd/sssd, those can be bind-mounted into the container via configs. If they 
don't have nscd/sssd then they can bind-mount /etc/passwd. Since they would've 
been using the host anyway, this is no different. 

As far as the docker image itself, why does Hadoop need to provide an image? 
Everything needed can be provided via the distributed cache or bind-mounts, 
right? I don't see why we need a specialized image that is tied to Hadoop. You 
just need an image with Java and Bash.




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-13 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014687#comment-17014687
 ] 

Eric Yang commented on YARN-9292:
-

[~ebadger] {quote}the image wouldn't have been pulled to that node before the 
task is run, right? That's my concern here.{quote}

The concern about inconsistent docker images spread across the cluster is a 
valid one.  There are two possibilities: the docker image exists on the AM 
node, or it doesn't.
# In the case where the image exists on the AM node, launching the docker 
image using the sha from the AM node will result in a warning or failure, 
depending on the application anti-affinity policy.  The error message from 
failing to launch a docker container using the sha signature should provide 
some clues for administrators to fix the docker images on other nodes.
# If it is requesting an image that doesn't exist on the AM node, it will 
proceed with the latest tag.  The images used will be consistent if YARN-9184 
is enabled.  If YARN-9184 is turned off, it will follow the same pattern as 1.

{quote}The command you ran doesn't even work for my version of Docker.{quote}

I think my mouse cursor jumped when I copied and pasted the information.  I 
couldn't find where it changed the output.  Your syntax is the correct one to 
use.  Sorry for the confusion.

{quote}Reading around on the internet, it looks like Docker takes the manifest 
sha and then recalculates the digest with some other stuff added on (maybe the 
tag data?) to get a new digest. I'm worried that this could break if we 
randomly choose the last sha. For example, maybe centos:7 is installed 
everywhere, but centos:latest is only installed on this one node by accident. 
If we grab the centos:latest sha, it won't work on the rest of the nodes in the 
cluster because the sha won't match the tag of the image on those nodes, even 
though they have the same manifest hash. Or maybe it only does the check based 
on the manifest hash. I can't seem to reproduce this with my version of Docker, 
so I can't test out what actually happens.{quote}

When the list contains multiple entries, they point to the same image; only 
the repository id is different.  At this time, using any of the repo digest 
ids has the same outcome.  This was tested carefully before I went ahead with 
the implementation.
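The exact-name match over the repo digest list can be sketched as follows; 
this is a simplified stand-in for the patch's logic, not the actual YARN code, 
and the digest values are illustrative:

```python
def pick_repo_digest(repo_digests: list, image_name: str) -> str:
    """Pick the digest whose repository part equals image_name exactly.

    repo_digests mimics docker's .RepoDigests entries, each of the form
    "<repository>@sha256:<hash>".  Exact comparison avoids matching
    e.g. "local/centos" when "centos" was requested.
    """
    for entry in repo_digests:
        repo, _, digest = entry.partition("@")
        if repo == image_name:
            return digest
    raise ValueError("%s not found in RepoDigests" % image_name)

digests = [
    "local/centos@sha256:1111",  # similar name; must not match "centos"
    "centos@sha256:2222",
]
print(pick_repo_digest(digests, "centos"))  # -> sha256:2222
```

Since any alias digest of the same image resolves to the same content, which 
entry is picked only matters for the repository-name match, not for the 
resulting image.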

This patch matters most when the system admin does not use a docker registry 
to manage docker images and has inconsistent latest images sitting on nodes.  
They may get some extra nudge when launching an application with inconsistent 
images and an anti-affinity policy defined.  The majority of users are not 
affected by this change.  If the AM picks an image older than the latest on 
the docker registry, the application's docker images remain uniform.  There is 
a possibility that more of the same containers end up on the same node.  
However, this should be fine when the user does not specify placement policy 
rules.

I think this problem has been dissected into as small a piece as possible, and 
I haven't come up with a more elegant solution that keeps the docker image 
consistent with the latest tag and supports running both with and without a 
docker registry.  Let me know if new ideas come to mind.


[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-13 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014600#comment-17014600
 ] 

Eric Badger commented on YARN-9292:
---

{quote}
The node might have an old Docker image on it. It would be nice to get the 
image information from the registry and only fall back to the local node's 
version if the registry lookup fails. An indirect way to do this would be to 
do a {{docker pull}} before calling {{docker images}}.

The same can be argued for people who do not want automatic pulling of the 
docker image to latest. As a result, there is a flag implemented in YARN-9184. 
The flag decides whether it will be based on the local latest or the 
repository latest. This change should work in combination with YARN-9184.
{quote}
Agreed on your point with YARN-9184. I thought of that as well, but since the 
AM isn't running inside of a Docker container, the image wouldn't have been 
pulled to that node before the task is run, right? That's my concern here.

{quote}
If we can hit the docker registry directly via its REST API then we won't need 
to invoke the container-executor at all and we can avoid this problem. This 
looks like it should be fairly trivial, but I don't know how much more 
difficult secure registries would be.

We don't contact the docker registry directly, nor do we have code to connect 
to a secure docker registry. I think it is too risky to contact the registry 
directly because the registry could be a private registry defined in the 
user's docker config.json. It would be going down a rabbit hole to follow this 
path.
{quote}
I imagined that would be the answer here. Fair enough.

{quote}
Using either hash is fine; they will result in the same image. It is somewhat 
fuzzy because they are aliases of one another.
{quote}

{noformat}
[ebadger@foo bin]$ sudo docker images | grep centos
docker.io/centos   7        5e35e350aded   2 months ago   203 MB
docker.io/centos   latest   0f3e07c0138f   3 months ago   220 MB
[ebadger@foo bin]$ sudo docker inspect image centos -f "{{.RepoDigests}}"
Error: No such object: image
{noformat}
Docker must have changed a bunch since the last supported release from RedHat 
in RHEL 7 (1.13). The command you ran doesn't even work for my version of 
Docker.

{noformat}
[ebadger@foo bin]$ sudo docker images | grep centos
docker.io/centos   7        5e35e350aded   2 months ago   203 MB
docker.io/centos   latest   0f3e07c0138f   3 months ago   220 MB
[ebadger@foo bin]$ sudo docker image inspect centos -f "{{.RepoDigests}}"
[docker.io/centos@sha256:f94c1d992c193b3dc09e297ffd54d8a4f1dc946c37cbeceb26d35ce1647f88d9]
{noformat}
If I switch {{inspect image}} to {{image inspect}} I get a similar output to 
yours, but I only get a single sha. Reading around on the internet, it looks 
like Docker takes the manifest sha and then recalculates the digest with some 
other stuff added on (maybe the tag data?) to get a new digest. I'm worried 
that this could break if we randomly choose the last sha. For example, maybe 
centos:7 is installed everywhere, but centos:latest is only installed on this 
one node by accident. If we grab the centos:latest sha, it won't work on the 
rest of the nodes in the cluster because the sha won't match the tag of the 
image on those nodes, even though they have the same manifest hash. Or maybe it 
only does the check based on the manifest hash. I can't seem to reproduce this 
with my version of Docker, so I can't test out what actually happens. 

{quote}
Maybe need to upgrade the docker version. The output appears like this on my 
system:
{quote}
Yea I'm not sure how to deal with this. Docker seems to have broken things (or 
added new things). I know Docker is a fast-moving technology, but RHEL 7 is 
basically stuck on 1.13.1 at this point because of licensing issues.

{quote}
No, this rest api is secured by SPNEGO authentication, the same as the rest of 
the node manager rest api. HttpUtil.connect handles the Kerberos negotiation.
{quote}
Ok cool. Thanks for the explanation. 


[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-10 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013212#comment-17013212
 ] 

Eric Yang commented on YARN-9292:
-

Patch 008 fixes space issues suggested by [~ebadger].




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-10 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013200#comment-17013200
 ] 

Eric Yang commented on YARN-9292:
-

{quote}I see. Why isn't the AM run inside of a Docker container?{quote}

Very good question, and the answer is somewhat complicated.  For the AM to run 
in a docker container, the AM must have identical Hadoop client bits (Java, 
Hadoop, etc) and credential mapping (nscd/sssd).  Many of those pieces could 
not be moved cleanly into a Docker container in the first implementation of 
YARN native service (LLAP/Slider-like projects) because of resistance to 
building an agreeable docker image as part of the Hadoop project.  The AM 
remains outside of the docker container for simplicity.

{quote}The node might have an old Docker image on it. It would be nice to get 
the image information from the registry and only fall back to the local node's 
version if the registry lookup fails. An indirect way to do this would be to 
do a {{docker pull}} before calling {{docker images}}.{quote}

The same can be argued for people who do not want automatic pulling of the 
docker image to latest.  As a result, there is a flag implemented in 
YARN-9184.  The flag decides whether it will be based on the local latest or 
the repository latest.  This change should work in combination with YARN-9184.

{quote}If we can hit the docker registry directly via its REST API then we 
won't need to invoke the container-executor at all and we can avoid this 
problem. This looks like it should be fairly trivial, but I don't know how much 
more difficult secure registries would be.{quote}

We don't contact the docker registry directly, nor do we have code to connect 
to a secure docker registry.  I think it is too risky to contact the registry 
directly because the registry could be a private registry defined in the 
user's docker config.json.  It would be going down a rabbit hole to follow 
this path.

{quote}Do you have documentation handy for docker image inspect that talks 
about the fuzzy matching?{quote}

Sorry, don't have any, but here is an example that you can try locally:

{code}$ docker images | grep centos
centos   7        9f38484d220f   10 months ago   202MB
centos   latest   9f38484d220f   10 months ago   202MB{code}

Suppose that you have used both the centos:7 and centos:latest tags, and both 
point to the same image.  Querying the repository digests produces a different 
hash for each repository entry:

{code}$ docker inspect image centos -f "{{.RepoDigests}}"
[centos@sha256:a799dd8a2ded4a83484bbae769d97655392b3f86533ceb7dd96bbac929809f3c 
centos@sha256:b5e66c4651870a1ad435cd75922fe2cb943c9e973a9673822d1414824a1d0475]{code}

Using either hash is fine; they will result in the same image.  It is 
somewhat fuzzy because they are aliases of one another.

{quote}A space before and after != and ==. If the purpose of omitting the 
spaces is to show operation bundling, then I would just add () around the two 
separate comparisons around the &&{quote}

I see, will update accordingly.  Thanks.

{quote}Thanks for clearing up the quoting issue. But I'm still getting what 
appears to be a less than ideal result. Is this expected behavior?{quote}

Maybe you need to upgrade the docker version.  The output appears like this on 
my system:

{code}$ docker images --format="{{json .}}" --filter="dangling=false"
{"Containers":"N/A","CreatedAt":"2019-11-11 11:23:08 -0500 EST","CreatedSince":"2 months ago","Digest":"\u003cnone\u003e","ID":"7317640d555e","Repository":"prom/prometheus","SharedSize":"N/A","Size":"130MB","Tag":"latest","UniqueSize":"N/A","VirtualSize":"130.2MB"}
{"Containers":"N/A","CreatedAt":"2019-07-15 16:14:12 -0400 EDT","CreatedSince":"5 months ago","Digest":"\u003cnone\u003e","ID":"771e0613a264","Repository":"ozonesecure_kdc","SharedSize":"N/A","Size":"127MB","Tag":"latest","UniqueSize":"N/A","VirtualSize":"127.4MB"}
{"Containers":"N/A","CreatedAt":"2019-07-15 00:04:39 -0400 EDT","CreatedSince":"5 months ago","Digest":"\u003cnone\u003e","ID":"48b0eebc96f0","Repository":"jaegertracing/all-in-one","SharedSize":"N/A","Size":"48.7MB","Tag":"latest","UniqueSize":"N/A","VirtualSize":"48.71MB"}
{"Containers":"N/A","CreatedAt":"2019-07-02 14:56:10 -0400 EDT","CreatedSince":"6 months ago","Digest":"\u003cnone\u003e","ID":"f38d9c7e49be","Repository":"flokkr/hadoop","SharedSize":"N/A","Size":"503MB","Tag":"2.7.7","UniqueSize":"N/A","VirtualSize":"503.3MB"}
{"Containers":"N/A","CreatedAt":"2019-06-25 14:27:08 -0400 EDT","CreatedSince":"6 months ago","Digest":"\u003cnone\u003e","ID":"c912f3f026ed","Repository":"grafana/grafana","SharedSize":"N/A","Size":"249MB","Tag":"latest","UniqueSize":"N/A","VirtualSize":"248.5MB"}
{"Containers":"N/A","CreatedAt":"2019-06-24 19:37:36 -0400 EDT","CreatedSince":"6 months 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-10 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013167#comment-17013167
 ] 

Eric Badger commented on YARN-9292:
---

{quote}
Doesn't the container know what image it was started with in its environment?

ServiceScheduler runs as part of the application master for YARN service. The 
YARN AM is not containerized. The docker command to resolve the image digest 
id happens prior to any docker container being launched. The lookup for the 
docker image is from the node where the AM is running. We use the sha256 
digest from the AM node as the authoritative signature to give the application 
an equal chance of acquiring the docker digest id on any node manager.
{quote}
I see. Why isn't the AM run inside of a Docker container?

The node might have an old Docker image on it. It would be nice to get the 
image information from the registry and only fall back to the local node's 
version if the registry lookup fails. An indirect way to do this would be to 
do a {{docker pull}} before calling {{docker images}}.

{quote}
If we don't care about the container and just want to know what the sha of the 
image:tag is, then I agree with Chandni Singh that we don't need to use the 
containerId.

Container ID is used by the container executor to properly permission the 
working directory and generate the .cmd file for the container-executor 
binary, and all output and the exit code are stored in the container id 
directory. Without a container ID, we would need to craft a completely 
separate path to acquire privileges to launch docker commands, which is extra 
code duplication and does not follow the security practice that was baked in 
to prevent parameter hijacking. I chose to follow the existing process to 
avoid code bloat.
{quote}
If we can hit the docker registry directly via its REST API then we won't need 
to invoke the container-executor at all and we can avoid this problem. This 
looks like it should be fairly trivial, but I don't know how much more 
difficult secure registries would be. 

{quote}
But if there are many, couldn't that not be the correct one?

The output given from docker image inspect [image-id] -f ".RepoDigests" may 
contain similar names, like local/centos and centos at the same time, due to 
fuzzy matching. The for loop matches the exact name instead of doing prefix 
matching. Hence, it is always the correct one that is matched.
{quote}
Do you have documentation handy for {{docker image inspect}} that talks about 
the fuzzy matching? The [docker 
documentation|https://docs.docker.com/engine/reference/commandline/image_inspect/]
 is pretty underwhelming. From my own testing, searching for a substring of an 
image doesn't match.

{quote}
I think we should import this instead of including the full path

Sorry, can't do. There is another import that references 
org.apache.hadoop.yarn.service.component.Component, which prevents use of the 
same name.
{quote}
Ah yes. I figured there was a reason we didn't do this. Thanks for explaining. 

{quote}
Spacing issues on the operators.

Checkstyle did not find a spacing issue with the existing patch, and the issue 
is not clear to me. Care to elaborate?
{quote}
{noformat}
+  if (compSpec.getArtifact()!=null && compSpec.getArtifact()
+  .getType()==TypeEnum.DOCKER) {
{noformat}
A space before and after {{!=}} and {{==}}. If the purpose of omitting the 
spaces is to show operator grouping, then I would just add {{()}} around the 
two comparisons on either side of the {{&&}}

{quote}
The first part of both of these regexes is identical. I think we should create 
a subregex and append to it to avoid having to make changes in multiple places 
in the future. One is the image followed by a tag and the other is an image 
followed by a sha. Should be easy to do.

Sure, I will compact this to rebase this patch to trunk.
{quote}
Thanks

{quote}
The else clause syntax doesn't seem to work for me. Did I do something wrong?

Yes, unlike a C exec call, when running the docker command on the CLI, the 
template needs to be quoted to prevent shell expansion:

docker images --format="{{json .}}" --filter="dangling=false"

For clarity, we are using:

docker image inspect [image-id] -f "{{.RepoDigests}}"

to find the real digest hash due to bugs with the docker images output.
{quote}
Thanks for clearing up the quoting issue. But I'm still getting what appears to 
be a less than ideal result. Is this expected behavior?
{noformat}
[ebadger@foo ~]$ sudo docker images --format="{{json .}}" 
--filter=dangling=false
{}
{}
{}
{}
{}
{}
[ebadger@foo ~]$ docker -v
Docker version 1.13.1, build b2f74b2/1.13.1
{noformat}

{quote}
Another possible solution is to have the AM get the sha256 hash of the image 
that it is running in and then passing that sha to all of the containers that 
it starts. This would move the query into the Hadoop cluster itself.

I think the patch is implementing what you are suggesting that Hadoop query 
into the cluster itself via a node manager rest endpoint.

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013158#comment-17013158
 ] 

Hadoop QA commented on YARN-9292:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 26 unchanged - 3 fixed = 26 total (was 29) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
44s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}116m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12990545/YARN-9292.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 38acaac6c703 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c4fb43c |
| maven | version: 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-10 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013075#comment-17013075
 ] 

Eric Yang commented on YARN-9292:
-

Patch 007 rebases to current trunk with the DOCKER_IMAGE_REGEX fix suggested by 
[~ebadger].

> Implement logic to keep docker image consistent in application that uses 
> :latest tag
> 
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch, 
> YARN-9292.003.patch, YARN-9292.004.patch, YARN-9292.005.patch, 
> YARN-9292.006.patch, YARN-9292.007.patch
>
>
> Docker image with latest tag can run in YARN cluster without any validation 
> in node managers. If an image with the latest tag is changed during container 
> launch, it might produce inconsistent results between nodes. This surfaced 
> toward the end of development for YARN-9184 to keep docker image consistent 
> within a job. One of the ideas to keep the :latest tag consistent for a job 
> is to use the docker image command to figure out the image ID and propagate 
> that image ID to the rest of the container requests. There are some 
> challenges to overcome:
>  # The latest tag does not exist on the node where the first container 
> starts. The first container will need to download the latest image and find 
> the image ID. This can introduce lag time for other containers to start.
>  # If the image ID is used to start other containers, container-executor may 
> have problems checking whether the image is coming from a trusted source. 
> Both image name and ID must be supplied through the .cmd file to 
> container-executor. However, a hacker could supply an incorrect image ID and 
> defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker 
> image consistent within one application.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-09 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012300#comment-17012300
 ] 

Eric Yang commented on YARN-9292:
-

[~ebadger] Thanks for the review.  Here are my feedback:

{quote}Doesn't the container know what image it was started with in its 
environment?{quote}

ServiceScheduler runs as part of the application master for a YARN service.  
The YARN AM is not containerized.  The docker command to resolve the image 
digest ID happens before any docker container is launched.  The lookup for the 
docker image runs on the node where the AM is running.  We use the sha256 
digest from the AM node as the authoritative signature, so the application has 
an equal chance of acquiring the docker digest ID on any node manager.

{quote}If we don't care about the container and just want to know what the sha 
of the image:tag is, then I agree with Chandni Singh that we don't need to use 
the containerId.{quote}

Container ID is used by the container executor to properly permission the 
working directory and generate the .cmd file for the container-executor binary, 
and all output and exit codes are stored in the container ID directory.  
Without a container ID, we would need to craft a completely separate path to 
acquire privileges to launch docker commands, which duplicates code and does 
not follow the security practice put in place to prevent parameter hijacking.  
I chose to follow the existing process to avoid code bloat.

{quote}But if there are many, couldn't that not be the correct one?{quote}

The output given from docker image inspect [image-id] -f "{{.RepoDigests}}" may 
contain similar names, like local/centos and centos at the same time, due to 
fuzzy matching.  The for loop matches the exact name instead of a prefix match.  
Hence, the matched entry is always the correct one.
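The exact-name check being described can be sketched in plain shell against a
simulated {{.RepoDigests}} output; the image names mirror the local/centos vs
centos example above, and the digests are fabricated sample data, not real
hashes:

```shell
#!/bin/sh
# Sketch of matching the exact repository name (not a prefix) over the
# RepoDigests output; sample data stands in for a real docker inspect call.
name="centos"
# Simulated output of: docker image inspect <image> -f '{{.RepoDigests}}'
output="[local/centos@sha256:aaaa1111 centos@sha256:bbbb2222]"

result="$name"
# Strip the surrounding brackets, then compare repository names exactly.
for image in $(echo "$output" | sed 's/^\[//; s/\]$//'); do
  repo="${image%@*}"               # part before the @digest separator
  if [ "$repo" = "$name" ]; then   # exact match, never a prefix match
    result="$image"
  fi
done
echo "$result"                     # -> centos@sha256:bbbb2222
```

Even though local/centos shares the centos suffix, only the exact repository
name centos picks up the digest, which is the behavior claimed above.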

{quote}I think we should import this instead of including the full path{quote}

Sorry, can't do.  There is another import that references 
org.apache.hadoop.yarn.service.component.Component, which prevents use of the 
same name.

{quote}Spacing issues on the operators.{quote}

Checkstyle did not find a spacing issue with the existing patch, and the issue 
is not clear to me.  Care to elaborate?

{quote}The first part of both of these regexes is identical. I think we should 
create a subregex and append to it to avoid having to make changes in multiple 
places in the future. One is the image followed by a tag and the other is an 
image followed by a sha. Should be easy to do.{quote}

Sure, I will compact this to rebase this patch to trunk.

{quote}The else clause syntax doesn't seem to work for me. Did I do something 
wrong?{quote}

Yes, unlike a C exec call, when running the docker command on the CLI, the 
template needs to be quoted to prevent shell expansion:

{code}docker images --format="{{json .}}" --filter="dangling=false"{code}

For clarity, we are using:

{code}docker image inspect [image-id] -f "{{.RepoDigests}}"{code}

to find the real digest hash due to bugs with the docker images output.
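The failure mode when the quoting is dropped can be shown without docker at 
all: the space inside the Go template makes the shell split it into two words, 
so docker only ever sees --format={{json and reports an unclosed action. A 
minimal illustration of the word splitting (count_args is a throwaway helper, 
not a docker or YARN function):

```shell
#!/bin/sh
# Demonstrate how the shell splits an unquoted Go template at the space.
count_args() { echo $#; }

# Unquoted: {{json .}} splits into two words after --format=
count_args --format={{json .}} --filter=dangling=false    # -> 3

# Quoted: the template survives as a single argument
count_args --format="{{json .}}" --filter=dangling=false  # -> 2
```

With three words, the argument that reaches docker is the truncated 
--format={{json, which is exactly the "unclosed action" template error shown in 
the earlier test output.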

{quote}Another possible solution is to have the AM get the sha256 hash of the 
image that it is running in and then passing that sha to all of the containers 
that it starts. This would move the query into the Hadoop cluster itself.{quote}

I think the patch is implementing what you are suggesting that Hadoop query 
into the cluster itself via a node manager rest endpoint.
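For reference, the node manager endpoint added by the patch 
(GET /container/{id}/docker/images/{name}) would be queried roughly as below. 
The host, port, container ID, and the /ws/v1/node prefix are assumptions based 
on the usual NM web services layout, not values taken from the patch:

```shell
#!/bin/sh
# Hypothetical query against the NM REST endpoint from the patch.
# Host, port, and container ID are placeholders, not real values.
NM="http://nm-host.example.com:8042"          # assumed default NM web port
CID="container_1578000000000_0001_01_000001"  # hypothetical container id
IMAGE="centos"

URL="$NM/ws/v1/node/container/$CID/docker/images/$IMAGE"
echo "$URL"
# curl -s "$URL"   # would return the digest-qualified image name on success
```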


[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-09 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012185#comment-17012185
 ] 

Eric Badger commented on YARN-9292:
---

Hey [~eyang], thanks for the patch! It looks like this patch only applies to 
native services and that any client that wants to solve this issue will have to 
solve it themselves. I don't think we can get around this issue unless we want 
the RM to do the image sha256 hash query. And that sounds like a bad idea. But 
I think it makes sense to do this for native services at least. 

{noformat}
+
+  @GET
+  @Path("/container/{id}/docker/images/{name}")
+  @Produces({ MediaType.APPLICATION_JSON + "; " + JettyUtils.UTF_8,
+  MediaType.APPLICATION_XML + "; " + JettyUtils.UTF_8 })
+  public String getImageId(@PathParam("id") String id,
+  @PathParam("name") String name) {
+DockerImagesCommand dockerImagesCommand = new DockerImagesCommand();
+dockerImagesCommand = dockerImagesCommand.getSingleImageStatus(name);
+PrivilegedOperationExecutor privOpExecutor =
+PrivilegedOperationExecutor.getInstance(this.nmContext.getConf());
+try {
+  String output = DockerCommandExecutor.executeDockerCommand(
+  dockerImagesCommand, id, null, privOpExecutor, false, nmContext);
+  String[] ids = output.substring(1, output.length()-1).split(" ");
+  String result = name;
+  for (String image : ids) {
+String[] parts = image.split("@");
+if (parts[0].equals(name.substring(0, parts[0].length( {
+  result = image;
+}
+  }
+  return result;
+} catch (ContainerExecutionException e) {
+  return "latest";
+}
+  }
 }
{noformat}
Doesn't the container know what image it was started with in its environment? 
Why do we need to run a docker command here?  If we don't care about the 
container and just want to know what the sha of the image:tag is, then I agree 
with [~csingh] that we don't need to use the containerId. 

And if we do need to run a docker command, the for loop will give us the last 
sha256 associated with that image name. But if there are many, couldn't that 
not be the correct one?

{noformat}
+Collection
{noformat}
I think we should import this instead of including the full path

{noformat}
+  if (compSpec.getArtifact()!=null && compSpec.getArtifact()
+  .getType()==TypeEnum.DOCKER) {
{noformat}
Spacing issues on the operators.

{noformat}
+  public static final String DOCKER_IMAGE_REGEX =
+      "^(([a-zA-Z0-9.-]+)(:\\d+)?/)?([a-z0-9_./-]+)(:[\\w.-]+)?$";

+  private static final String DOCKER_IMAGE_DIGEST_REGEX =
+  "^(([a-zA-Z0-9.-]+)(:\\d+)?/)?([a-z0-9_./-]+)(@sha256:)([a-f0-9]{6,64})";
{noformat}
The first part of both of these regexes is identical. I think we should create 
a subregex and append to it to avoid having to make changes in multiple places 
in the future. One is the image followed by a tag and the other is an image 
followed by a sha. Should be easy to do.
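The factoring being suggested can be sketched in shell with grep -E (in the 
Java patch it would be string constants concatenated the same way); the regexes 
below are adapted from the quoted ones, with the Java-only \w class widened to 
an explicit ERE class:

```shell
#!/bin/sh
# Shared prefix: optional registry (with optional port) plus image name.
IMAGE_BASE='(([a-zA-Z0-9.-]+)(:[0-9]+)?/)?([a-z0-9_./-]+)'
# Append either an optional :tag or an @sha256: digest to the common base.
IMAGE_TAG_RE="^${IMAGE_BASE}(:[a-zA-Z0-9_.-]+)?$"
IMAGE_DIGEST_RE="^${IMAGE_BASE}@sha256:[a-f0-9]{6,64}$"

echo "registry.example.com:5000/centos:latest" | grep -qE "$IMAGE_TAG_RE" && echo tag-ok
echo "centos@sha256:0123456789abcdef" | grep -qE "$IMAGE_DIGEST_RE" && echo digest-ok
```

A change to the registry/name portion then only has to be made once, in the 
shared base expression.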


{noformat}
@@ -1771,11 +1779,29 @@ int get_docker_images_command(const char *command_file, 
const struct configurati
 if (ret != 0) {
   goto free_and_exit;
 }
+ret = add_to_args(args, "-f");
+if (ret != 0) {
+  goto free_and_exit;
+}
+ret = add_to_args(args, "{{.RepoDigests}}");
+if (ret != 0) {
+  goto free_and_exit;
+}
+  } else {
+ret = add_to_args(args, DOCKER_IMAGES_COMMAND);
+if (ret != 0) {
+  goto free_and_exit;
+}
+ret = add_to_args(args, "--format={{json .}}");
+if (ret != 0) {
+  goto free_and_exit;
+}
+ret = add_to_args(args, "--filter=dangling=false");
+if (ret != 0) {
+  goto free_and_exit;
+}
{noformat}

{noformat}
[ebadger@foo ~]$ sudo docker images --format={{json .}} --filter=dangling=false
Template parsing error: template: :1: unclosed action
[ebadger@foo ~]$ docker --version
Docker version 1.13.1, build 4ef4b30/1.13.1
{noformat}
The else clause syntax doesn't seem to work for me. Did I do something wrong?

This patch assumes that the client can access the Docker Registry. I'm not 
super familiar with native services, but I imagine this client runs on a 
gateway node somewhere outside of the cluster itself. With that, I imagine it 
is possible that the cluster itself can access the Docker Registry while the 
client can't. Or the Registry could require credentials to access it. Should we 
make this feature optional to get around those error cases? Another possible 
solution is to have the AM get the sha256 hash of the image that it is running 
in and then passing that sha to all of the containers that it starts. This 
would move the query into the Hadoop cluster itself. 


[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-08 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010815#comment-17010815
 ] 

Hadoop QA commented on YARN-9292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-9292 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963465/YARN-9292.006.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25349/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2020-01-08 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010811#comment-17010811
 ] 

Eric Yang commented on YARN-9292:
-

[~billie] Can you help with the review of this issue?  If I recall correctly, 
the container ID is used to determine the latest docker image tag used by the 
application.  Without the container ID, it will not compute the latest image 
correctly for the given application.  It would be nice to have this issue 
closed for the Hadoop 3.3.0 release.  Thanks




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-27 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803574#comment-16803574
 ] 

Eric Yang commented on YARN-9292:
-

{quote}For images, we probably need to write command file to a path independent 
of containers under nmPrivate directory. Our code can ensure that once the 
command is executed, the temp .cmd file is deleted.

I do think it is important that we don't expose this API with 
container/container id in it because there is no logical relation between the 
image and the container.{quote}

The cmd file is placed in the application directory, so it is removed when the 
application directory is deleted by the current logic.  There is no additional 
code to be written for cleanup.  The side benefit is that the caller needs to 
know the running application ID to generate a container ID that can call the 
docker images command.  This makes it more difficult for an external party 
without a running app to get to the docker images command.  The current code 
reduces exposure of the docker images command to unauthorized users, and is 
less likely to open a security hole in the 
PrivilegedOperation/Container-Executor flow for secure directory 
initialization and cleanup.




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-27 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803551#comment-16803551
 ] 

Chandni Singh commented on YARN-9292:
-

{quote} 
The real container ID of the application master provides the already 
initialized path, and the .cmd file is stored in the existing container 
directory. The cmd file gets cleaned up when the application is finished. Using 
a randomly generated container ID will not clean up as nicely.
{quote}
[~eyang] In patch 6, a random container id is already being created on the 
client side which is the {{ServiceScheduler}}. It is creating a container id 
from the appId and the current system time.

{code}
+  ContainerId cid = ContainerId
+  .newContainerId(ApplicationAttemptId.newInstance(appId, 1),
+  System.currentTimeMillis());
{code}
 
For images, we probably need to write command file to a path independent of 
containers under nmPrivate directory.  Our code can ensure that once the 
command is executed, the temp .cmd file is deleted.

I do think it is important that we don't expose this API with 
container/container id in it because there is no logical relation between the 
image and the container.




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-27 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803524#comment-16803524
 ] 

Eric Yang commented on YARN-9292:
-

[~csingh] Thanks for the review.  I tried using a randomly generated container 
ID, but the nmPrivate directory needs to be initialized and tracked 
separately.  The real container ID of the application master provides the 
already initialized path, and the .cmd file is stored in the existing container 
directory.  The cmd file gets cleaned up when the application is finished.  
Using a randomly generated container ID will not clean up as nicely.

I will make the logging change and add a new test for ServiceScheduler in the 
next patch.  Thanks




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-27 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803429#comment-16803429
 ] 

Chandni Singh commented on YARN-9292:
-

[~eyang] The REST API added here to find the image is independent of any 
container, so I don't think we should have the container and container id in 
the path.
{code}
  @Path("/container/{id}/docker/images/{name}")
{code}
If this is done because the DockerCommandExecutor needs a container id, we 
could change the implementation here to use a dummy container id. That 
implementation could be fixed later, but the REST API will not be affected and 
will remain unchanged.
{code}
 String output = DockerCommandExecutor.executeDockerCommand(
  dockerImagesCommand, id, null, privOpExecutor, false, nmContext);
{code}
We could generate a dummy container id here instead of doing it in every 
client.

Some other nitpicks:

1. Log statements in ServiceScheduler can be parameterized, which improves 
readability.
{code}
  LOG.info("Docker image: " + id + " maps to: " + imageId); ->
 LOG.info("Docker image: {} maps to: {}", id, imageId);
{code}

2. There aren't any tests for the new code added to {{ServiceScheduler}}. 
Would it be possible to add one?




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799489#comment-16799489
 ] 

Hadoop QA commented on YARN-9292:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 29 unchanged - 3 fixed = 29 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
18s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m  
6s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}118m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963465/YARN-9292.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 21d9e13de863 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 43e421a |
| maven | 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-22 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799450#comment-16799450
 ] 

Eric Yang commented on YARN-9292:
-

Patch 006 fixed a bug where the docker image name was not properly URL encoded.
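As a side note on the encoding: docker image names contain {{/}} and {{:}} characters, which must be percent-encoded before they can appear in a single REST path segment such as the {{/container/{id}/docker/images/{name}}} endpoint discussed earlier in this thread. A minimal, hypothetical sketch using the Python standard library (not code from the patch, which is Java):

```python
from urllib.parse import quote

def encode_image_for_path(image):
    # Percent-encode everything, including '/' and ':', so the image
    # name fits in a single REST path segment.
    return quote(image, safe="")

print(encode_image_for_path("hadoop/base:latest"))  # hadoop%2Fbase%3Alatest
```

Without {{safe=""}}, the slash in the repository name would be left unencoded and would be interpreted as a path separator by the REST router.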




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-22 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799442#comment-16799442
 ] 

Eric Yang commented on YARN-9292:
-

[~csingh] The second command's format option is missing single quotes around 
{{.RepoDigests}}.




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-22 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799438#comment-16799438
 ] 

Chandni Singh commented on YARN-9292:
-

[~eyang] I have a hadoop-build-1000:latest image locally.
{code} docker images hadoop-build-1000:latest --format='{{json .}}' {code}
gives the following info:
{code} 
{"Containers":"N/A","CreatedAt":"2018-12-18 23:08:27 -0800 
PST","CreatedSince":"3 months 
ago","Digest":"\u003cnone\u003e","ID":"c9e7cc96aa61","Repository":"hadoop-build-1000","SharedSize":"N/A","Size":"2.01GB","Tag":"latest","UniqueSize":"N/A","VirtualSize":"2.013GB"}
{code}

However,
{code} docker image inspect hadoop-build-1000:latest --format={{.RepoDigests}}  
{code}
 doesn't return anything. 
The output of this command is 
{code}
[]
{code}
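An empty {{RepoDigests}} is expected for an image that was built locally and never pushed: a repo digest only exists once a registry has assigned one. Any client that pins images therefore needs a fallback to the local image ID. A hypothetical sketch of that fallback (field names mirror the JSON output above; this is not code from the patch):

```python
import json

def image_reference(inspect_json):
    """Prefer the registry digest; fall back to the local image ID."""
    info = json.loads(inspect_json)
    digests = info.get("RepoDigests") or []
    if digests:
        return digests[0]      # e.g. "repo@sha256:..."
    return info["ID"]          # locally built, never pushed

sample = '{"RepoDigests": [], "ID": "c9e7cc96aa61"}'
print(image_reference(sample))  # c9e7cc96aa61
```

The local image ID is only meaningful on that one node, which is exactly why an empty {{RepoDigests}} undermines cluster-wide consistency.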






[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798627#comment-16798627
 ] 

Hadoop QA commented on YARN-9292:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
37s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 29 unchanged - 3 fixed = 29 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
52s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963337/YARN-9292.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux e3b7efb1e845 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 90afc9a |
| maven | 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798561#comment-16798561
 ] 

Eric Yang commented on YARN-9292:
-

If yarn.nodemanager.runtime.linux.docker.image-update is set to false:

Patch 4 will fail the application if the docker image cannot be resolved on 
the application master node.
Patch 5 will allow the application to proceed with the :latest tag without 
synchronizing it.

Patch 4's behavior is more restrictive, while patch 5 is a little more 
flexible. Patch 4's behavior seems more correct to me, but I am putting the 
patch 5 behavior out there for feedback.
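For reference, the flag under discussion would be set in yarn-site.xml. A fragment assuming the property name quoted above (the exact semantics of {{false}} depend on which patch is committed):

```xml
<property>
  <!-- When false, do not synchronize the :latest image across nodes. -->
  <name>yarn.nodemanager.runtime.linux.docker.image-update</name>
  <value>false</value>
</property>
```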




[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798356#comment-16798356
 ] 

Hadoop QA commented on YARN-9292:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 29 unchanged - 3 fixed = 29 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
47s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
44s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963300/YARN-9292.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 1c06d96bc04a 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9f1c017 |
| maven | 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797718#comment-16797718
 ] 

Hadoop QA commented on YARN-9292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  8m  5s{color} | 
{color:red} hadoop-yarn-project_hadoop-yarn generated 3 new + 0 unchanged - 0 
fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
56s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 29 unchanged - 3 fixed = 29 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m  
0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
52s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}116m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963191/YARN-9292.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 83701fb8cd0e 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797588#comment-16797588
 ] 

Hadoop QA commented on YARN-9292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  7m  8s{color} | 
{color:red} hadoop-yarn-project_hadoop-yarn generated 3 new + 0 unchanged - 0 
fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
8s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 7 new + 31 unchanged - 0 fixed = 38 total (was 31) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 29s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
14s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.webapp.TestNMWebServices |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963171/YARN-9292.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux aa9289340ee9 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797484#comment-16797484
 ] 

Eric Yang commented on YARN-9292:
-

The docker images command has a bug: when called as docker images --format 
'{{json .}}', the digest output is incorrect. Hence, we have to switch the 
lookup to docker image inspect -f '{{.RepoDigests}}' to obtain digest 
information. Patch 002 makes this adjustment to work around the docker bug.
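The workaround can be illustrated with a short sketch (the helper names are hypothetical, and the bracketed Go-template output format shown in the comment is an assumption based on how docker prints a string slice):

```python
import subprocess

def parse_repo_digests(raw):
    """Extract sha256 digests from `docker image inspect -f '{{.RepoDigests}}'`
    output, which prints a Go string slice such as
    "[nginx@sha256:aaa library/nginx@sha256:aaa]"."""
    entries = raw.strip().strip("[]").split()
    return [e.split("@", 1)[1] for e in entries if "@" in e]

def image_digests(image):
    """Resolve an image's repo digests via `docker image inspect`,
    avoiding the buggy digest column of `docker images --format`."""
    out = subprocess.run(
        ["docker", "image", "inspect", "-f", "{{.RepoDigests}}", image],
        capture_output=True, text=True, check=True).stdout
    return parse_repo_digests(out)

# The parsing step works without a docker daemon:
print(parse_repo_digests("[nginx@sha256:aaa]"))  # ['sha256:aaa']
```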

> Implement logic to keep docker image consistent in application that uses 
> :latest tag
> 
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9292.001.patch, YARN-9292.002.patch
>
>
> A docker image with the latest tag can run in a YARN cluster without any 
> validation in node managers. If an image with the latest tag changes during 
> container launch, it might produce inconsistent results between nodes. This 
> surfaced toward the end of development for YARN-9184, which aims to keep the 
> docker image consistent within a job. One idea to keep the :latest tag 
> consistent for a job is to use the docker image command to figure out the 
> image id and propagate that id to the rest of the container requests. There 
> are some challenges to overcome:
>  # The latest tag does not exist on the node where the first container 
> starts. The first container will need to download the latest image and find 
> the image ID. This can introduce lag time before other containers start.
>  # If the image id is used to start other containers, container-executor may 
> have problems checking whether the image comes from a trusted source. Both 
> the image name and ID must be supplied through the .cmd file to 
> container-executor. However, an attacker can supply an incorrect image id 
> and defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker 
> image consistent within one application.
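The propagation idea in the description can be sketched as follows (the names are illustrative, not from the patch): the first container request resolves the mutable tag to a digest, and every later request for the same application reuses the cached digest.

```python
class ImagePinner:
    """Pin a mutable tag (e.g. :latest) to one digest per application."""

    def __init__(self, resolver):
        self._resolver = resolver   # callable: image name -> digest
        self._pinned = {}           # (app_id, image) -> digest

    def image_for(self, app_id, image):
        key = (app_id, image)
        if key not in self._pinned:
            # First container for this app: resolve the tag exactly once.
            self._pinned[key] = self._resolver(image)
        # Later containers reuse the pinned digest, not the tag.
        return f"{image.split(':')[0]}@{self._pinned[key]}"

pinner = ImagePinner(lambda img: "sha256:aaa")
print(pinner.image_for("app_1", "nginx:latest"))  # nginx@sha256:aaa
```

This sidesteps the first challenge only partially: the initial resolution still pays the image download cost before other containers can start.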



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783936#comment-16783936
 ] 

Hadoop QA commented on YARN-9292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
3s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 24s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 4 new + 24 unchanged - 0 fixed = 28 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
14s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
28s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
10s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
|  |  Dead store to dockerImagesCommand in 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices.getImageId(String)
  At 
NMWebServices.java:org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices.getImageId(String)
  At NMWebServices.java:[line 718] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9292 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-9292) Implement logic to keep docker image consistent in application that uses :latest tag

2019-03-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783851#comment-16783851
 ] 

Eric Yang commented on YARN-9292:
-

Patch 001 implements:
# Modified the YARN Service AM to look up the docker image id using the node 
manager REST API.
# A Node Manager REST API to obtain the docker image id from an image name.

There is a TODO section in the patch that depends on YARN-9249 to implement a 
PrivilegedOperation that invokes the docker image command using the 
DockerImageCommand implemented in YARN-9245.
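As a rough sketch of how the AM-side lookup might be invoked (the REST path and query parameter below are hypothetical, chosen only for illustration; the patch's actual endpoint is implemented in NMWebServices.getImageId):

```python
import json
import urllib.parse
import urllib.request

def image_id_url(nm_addr, image):
    """Build a (hypothetical) NM REST URL mapping an image name to its id."""
    query = urllib.parse.urlencode({"name": image})
    return f"http://{nm_addr}/ws/v1/node/images/id?{query}"

def lookup_image_id(nm_addr, image, opener=urllib.request.urlopen):
    """Ask the node manager for the image id; `opener` is injectable for tests."""
    with opener(image_id_url(nm_addr, image)) as resp:
        return json.load(resp)["imageId"]

# URL construction alone needs no running node manager:
print(image_id_url("nm-host:8042", "nginx:latest"))
```

8042 is the default NM webapp port; the response shape (an "imageId" field) is likewise an assumption for the sketch.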

> Implement logic to keep docker image consistent in application that uses 
> :latest tag
> 
>
> Key: YARN-9292
> URL: https://issues.apache.org/jira/browse/YARN-9292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9292.001.patch
>
>
> A docker image with the latest tag can run in a YARN cluster without any 
> validation in node managers. If an image with the latest tag changes during 
> container launch, it might produce inconsistent results between nodes. This 
> surfaced toward the end of development for YARN-9184, which aims to keep the 
> docker image consistent within a job. One idea to keep the :latest tag 
> consistent for a job is to use the docker image command to figure out the 
> image id and propagate that id to the rest of the container requests. There 
> are some challenges to overcome:
>  # The latest tag does not exist on the node where the first container 
> starts. The first container will need to download the latest image and find 
> the image ID. This can introduce lag time before other containers start.
>  # If the image id is used to start other containers, container-executor may 
> have problems checking whether the image comes from a trusted source. Both 
> the image name and ID must be supplied through the .cmd file to 
> container-executor. However, an attacker can supply an incorrect image id 
> and defeat container-executor security checks.
> If we can overcome those challenges, it may be possible to keep the docker 
> image consistent within one application.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org