[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609914#comment-16609914 ]

Zian Chen commented on YARN-8523:

Discussed offline with Eric and Wangda. This feature involves creating a pipeline among the NM, container-executor, and docker exec, which requires a lot of changes to the container stack. Created umbrella Jira YARN-8762 to track progress.

> Interactive docker shell
>
> Key: YARN-8523
> URL: https://issues.apache.org/jira/browse/YARN-8523
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Zian Chen
> Priority: Major
> Labels: Docker
>
> Some applications might require interactive Unix command execution to carry
> out operations. Container-executor can interface with docker exec to debug
> or analyze docker containers while the application is running. It would be
> nice to support an API to invoke docker exec to perform Unix commands and
> report the output back to the application master. The application master can
> distribute and aggregate execution of the commands to record in the
> application master log file.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576945#comment-16576945 ]

Zian Chen commented on YARN-8523:

Makes sense. I'll work on providing an initial patch for this idea. Thanks, [~eyang].
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576801#comment-16576801 ]

Eric Yang commented on YARN-8523:

[~Zian Chen]
# Without step 2 session management, the terminal session will terminate with "Connection Closed" when the node manager restarts. The user can retry with a browser reload to obtain a new session. I think a web socket connection is reliable enough to keep the connection alive. If it drops, the user can always get a new docker exec session.
# There is nothing to handle on node manager shutdown or crash, because a "remote connection closed" message will be displayed in the browser.
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576697#comment-16576697 ]

Zian Chen commented on YARN-8523:

Good point. I think we can make this Jira focus on building this pipeline and create a second Jira for persisting docker exec state across NM restarts. Two more questions:
# Should we give the user some kind of notification while the NM restarts and we are trying to resume the docker exec session? What if we retry the reconnect several times and don't succeed? We may need to give the user a friendly reminder so that the session does not appear to be stuck for too long, right?
# How do we handle an unexpected NM shutdown (a crash, for example)?
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574174#comment-16574174 ]

Eric Yang commented on YARN-8523:

[~Zian Chen] In today's meetup, [~jlowe] suggested binding docker exec to a named pipe or socket. This would allow the node manager to restart while the docker exec state persists, and then reconnect to the named pipe to resume the docker exec session. This is one way to work around a node manager restart while maintaining a live session to docker exec. However, resuming a docker exec session may introduce a limitation: if docker exec is stuck, there is no way to start a new session.

Step 1 is to get xtermjs connected to a web socket served by the node manager that launches docker exec. Step 2 can refine the session mapping logic to create or resume the named pipe to docker exec. I think it would be reasonable to create a separate JIRA for step 2 to contain the scope of the work.
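The create/resume decision in step 2 could be sketched roughly as below. This is only an illustration under stated assumptions: the class `ExecSessionSketch`, the `.pipe` naming scheme, and the session directory are hypothetical and do not come from any patch on this issue. Creating the FIFO itself is left out, since the JDK has no call for that.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not Hadoop code: decides whether an exec session for a
// container should be created or resumed, based on a per-container pipe path.
final class ExecSessionSketch {
  enum Action { CREATE, RESUME }

  private ExecSessionSketch() {}

  // Assumed naming scheme: one pipe per container ID under a session directory.
  // A deterministic path lets a restarted node manager find the pipe again.
  static Path pipeFor(Path sessionDir, String containerId) {
    return sessionDir.resolve(containerId + ".pipe");
  }

  // If the pipe already exists, an earlier node manager instance presumably
  // left a session behind, so the caller would reattach instead of relaunching.
  static Action decide(Path pipe) {
    return Files.exists(pipe) ? Action.RESUME : Action.CREATE;
  }

  public static void main(String[] args) {
    Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "nm-exec-sessions");
    Path pipe = pipeFor(dir, "container_example_01");
    System.out.println(pipe + " -> " + decide(pipe));
  }
}
```

A real implementation would also need a way out of the stuck-session case mentioned above, for example by letting the user discard the existing pipe.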
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573946#comment-16573946 ]

Zian Chen commented on YARN-8523:

[~eyang], thanks for raising this feature. It is very useful for live debugging and container diagnosis. We can add a series of interactive commands to let users debug more effectively, like tail -f on the container log, container resource usage, etc.

To handle the nodemanager restart scenario, we can register an event listener for restart or shutdown signals on the node manager web socket and respond in the xtermjs terminal accordingly (for example, print an NM restart/shutdown message to the user), then retry the reconnect several times after the typical NM restart interval. Again, if the NM hits an unexpected issue and cannot resume its service, that is something the interactive docker shell cannot solve by itself; we should just give the user a reasonable alert message about the current situation (for example, "retry failed with timeout, please check the NM log for more information").

I think passing commands through the NM web socket and reusing the container-executor security checks would be a good prototype to build first, without taking on the burden of handling the root daemon by carving another secure channel.
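The bounded reconnect described above might look like the following minimal sketch. The thread places this logic in the xtermjs terminal in the browser; it is written in Java here only to show the retry shape. `ReconnectSketch`, its parameters, and the messages are hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: bounded reconnect attempts with a fixed delay.
final class ReconnectSketch {
  private ReconnectSketch() {}

  // Calls connect up to maxAttempts times, sleeping delayMillis after each
  // failure. Returns the 1-based attempt that succeeded, or -1 if all failed
  // or the thread was interrupted while waiting.
  static int reconnect(BooleanSupplier connect, int maxAttempts, long delayMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (connect.getAsBoolean()) {
        return attempt;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return -1;
        }
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated node manager that accepts the connection on the third try.
    int result = reconnect(() -> ++calls[0] == 3, 5, 10L);
    if (result > 0) {
      System.out.println("Reconnected on attempt " + result);
    } else {
      System.out.println("Retry failed with timeout, please check the NM log");
    }
  }
}
```

The delay would presumably be set to roughly the typical NM restart interval; a fixed delay is used here for simplicity.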
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568825#comment-16568825 ]

Eric Yang commented on YARN-8523:

{quote}
But how would that be done? Just have it constantly listening? That seems like a lot of pain and overhead for a use case that is probably rather rare.
{quote}
In the node manager, create a REST API that accepts a container ID and verifies that the user credential matches the owner of the container or a YARN admin. If access is granted, it starts a web socket thread for the conversation until the connection is interrupted or terminated. Java ProcessBuilder and the javax.websocket package make the entire process straightforward and lightweight. I think this is a very useful feature for debugging a running container without being limited to the log viewer for troubleshooting.
{quote}
I suppose it's possible to reconnect if we're using live-restore, but I don't think that is something that needs to be done in the first phase of this proposal.
{quote}
I am not betting a lot on live restore at this time. There are reports that docker exec doesn't work after live restore: https://github.com/moby/moby/issues/35873 . Time will tell if this can be improved.
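A minimal sketch of the access check and the process launch described above, under several assumptions: `ExecAccessSketch` and its methods are hypothetical, the command calls docker directly for brevity (the design in this thread routes it through container-executor), and only `-i` is passed because a process started by ProcessBuilder has no TTY on its stdin. The web socket endpoint is omitted because javax.websocket is not part of the JDK.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Hadoop code: access check plus command assembly.
final class ExecAccessSketch {
  private ExecAccessSketch() {}

  // Access is granted to the container owner or to a YARN admin.
  static boolean isAllowed(String caller, String containerOwner, Set<String> admins) {
    return caller != null && (caller.equals(containerOwner) || admins.contains(caller));
  }

  // Builds the argument list. Rejecting unexpected characters in the container
  // ID keeps it from being interpreted as an option or as anything else.
  static List<String> buildCommand(String containerId, String shell) {
    if (!containerId.matches("[A-Za-z0-9_.-]+") || containerId.startsWith("-")) {
      throw new IllegalArgumentException("invalid container id: " + containerId);
    }
    return Arrays.asList("docker", "exec", "-i", containerId, shell);
  }

  public static void main(String[] args) {
    Set<String> admins = new HashSet<>(Arrays.asList("yarn"));
    if (!isAllowed("alice", "alice", admins)) {
      throw new SecurityException("access denied");
    }
    ProcessBuilder pb = new ProcessBuilder(buildCommand("container_example_01", "bash"));
    pb.redirectErrorStream(true); // merge stderr into stdout for the relay
    // pb.start() is left out so that the sketch runs without a docker daemon.
    System.out.println(String.join(" ", pb.command()));
  }
}
```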
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568721#comment-16568721 ]

Eric Badger commented on YARN-8523:

bq. The downside is that the security model may become harder to enforce, because we dig another tunnel into a root daemon. Given that we have invested quite a bit in the security checks in container-executor, reusing that investment is probably better than carving an alternate path.

Agreed that it's good to keep the number of setuid binaries talking to the docker daemon to a minimum. The container-executor stays around until the completion of the container, so it could be used as an option. But how would that be done? Just have it constantly listening? That seems like a lot of pain and overhead for a use case that is probably rather rare.

bq. The same problem also exists if the docker daemon is restarted: it could interrupt docker exec as well. These unavoidable circumstances may not be solvable. Hence, I am OK with this drawback, but I am keeping an open mind for possible solutions.

I suppose it's possible to reconnect if we're using live-restore, but I don't think that is something that needs to be done in the first phase of this proposal.
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568397#comment-16568397 ]

Eric Yang commented on YARN-8523:

[~jlowe] Fair point. Without using the node manager path, we still need another node manager-equivalent agent to relay the commands. One alternative is to relay instructions from the application master of the YARN service directly to the docker daemon. The downside is that the security model may become harder to enforce, because we dig another tunnel into a root daemon. Given that we have invested quite a bit in the security checks in container-executor, reusing that investment is probably better than carving an alternate path.

The same problem also exists if the docker daemon is restarted: it could interrupt docker exec as well. These unavoidable circumstances may not be solvable. Hence, I am OK with this drawback, but I am keeping an open mind for possible solutions.
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568291#comment-16568291 ]

Jason Lowe commented on YARN-8523:

If the nodemanager is involved in the data path, we need to consider how to handle nodemanager restarts. Therefore it would be nice if we could avoid having the nodemanager involved in the data transfer path. There are other pros and cons to doing so, but I wanted to raise awareness of the restart feature impacting this approach.
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567619#comment-16567619 ]

Eric Yang commented on YARN-8523:

[~divayjindal] One possible solution for option 3: embed xtermjs in the node manager UI, and implement a web socket servlet to forward data between docker exec -it and the browser. The design looks like this:
{code}
xtermjs -> nodemanager web socket -> container-executor -> docker exec -it
{code}
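The forwarding step in the middle of that pipeline amounts to copying bytes between two streams in each direction. A minimal sketch of one direction follows; `StreamRelay` is a hypothetical helper, and in-memory streams stand in for the web socket and the docker exec process.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: copies bytes from one stream to
// another, standing in for the web socket <-> docker exec forwarding step.
final class StreamRelay {
  private StreamRelay() {}

  // Copies everything from in to out, flushing after each chunk so that
  // interactive output is not held back. Returns the number of bytes copied.
  static long relay(InputStream in, OutputStream out) {
    byte[] buf = new byte[4096];
    long total = 0;
    try {
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
        out.flush();
        total += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return total;
  }

  public static void main(String[] args) {
    InputStream fromBrowser =
        new ByteArrayInputStream("ls -l\n".getBytes(StandardCharsets.UTF_8));
    ByteArrayOutputStream toProcess = new ByteArrayOutputStream();
    System.out.println(relay(fromBrowser, toProcess) + " bytes relayed");
  }
}
```

An interactive session would run two such relays on separate threads, one for keystrokes going to the process and one for output coming back.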
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562260#comment-16562260 ]

Eric Yang commented on YARN-8523:

There are a few possible approaches to support docker exec on YARN:

1. System administrator approach: When a system administrator needs to perform the same operation on all instances of a component or service, it is possible to create a YARN CLI command that passes a simple docker exec command to the targeted components and services. Stdout and stderr are aggregated and reported back to the application master and the YARN CLI. This would be similar to kubectl exec on Kubernetes. There is no session persistence to remember the directory, and no hot key lookup, because no terminal is bound to the YARN CLI.

2. Developer-friendly approach: The alternative is to avoid building this into the YARN framework and instead depend on the docker container to run multiple processes. This allows sshd to run inside the docker container. Users can use ssh and pdsh to log in to the docker container. Dumb terminal and hot keys can be supported depending on the Linux bits in the docker image.

3. Build a pseudo terminal wired to the YARN UI: There is software like [ttyd|https://tsl0922.github.io/ttyd/] that offers the ability to share a docker interactive terminal over JavaScript. It might be possible to modify the code to interface with a docker exec session started by container-executor to provide a full experience.

Option 3 would be fun from a research point of view, but it is more practical to build option 1 for diagnosing production problems at scale. Which approach is most useful to the community?
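For option 1, the fan-out and aggregation could be sketched as follows. Everything here is hypothetical: `ExecFanOutSketch`, the `runner` function that stands in for executing the command in one container, and the report format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: run one command against many containers and
// aggregate the per-container output into a single report.
final class ExecFanOutSketch {
  private ExecFanOutSketch() {}

  // Applies runner to each container ID in order and keeps the outputs keyed
  // by container ID. LinkedHashMap preserves the order of the input list.
  static Map<String, String> runOnAll(List<String> containerIds,
                                      Function<String, String> runner) {
    Map<String, String> results = new LinkedHashMap<>();
    for (String id : containerIds) {
      results.put(id, runner.apply(id));
    }
    return results;
  }

  // Formats the aggregated output, one line per container.
  static String format(Map<String, String> results) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : results.entrySet()) {
      sb.append('[').append(e.getKey()).append("] ").append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // The lambda fakes the remote execution so the sketch runs anywhere.
    Map<String, String> results =
        runOnAll(Arrays.asList("c1", "c2"), id -> "ok from " + id);
    System.out.print(format(results));
  }
}
```

The loop is sequential for clarity; at scale the calls would more likely run in parallel.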
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544756#comment-16544756 ]

Eric Yang commented on YARN-8523:

You are welcome to work on this. Hadoop is built with community contributions. There are others who are also interested in working on this. This JIRA would be a great place for the collaboration. Submit your proposal and patches, and interested parties can discuss the details.
[jira] [Commented] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544356#comment-16544356 ]

Divay Jindal commented on YARN-8523:

[~eyang] Can I work on this issue?