[ 
https://issues.apache.org/jira/browse/SPARK-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjia Wang updated SPARK-3512:
--------------------------------
    Description: I believe it is a common scenario that the YARN cluster 
runs behind a firewall, while people want to run the Spark driver locally for 
the best interactive experience. Running locally gives you full control of the 
local resources the client can access (for example IPython Notebook, or 
fancier IDEs), as opposed to being limited to the spark-shell when you take 
the conventional route of SSHing to a remote host inside the firewall. 
Installing anything you want on the remote host is usually not an option. 
A potential solution is to set up a SOCKS proxy on your local machine outside 
the firewall through SSH tunneling (ssh -D <local-proxy-port> 
<user>@<remote-host>) into some workstation inside the firewall. The Spark 
yarn-client would then only need to talk to the cluster through this proxy, 
without the need to change any configuration. Does this sound feasible? Maybe 
a VPN is the right solution?  (was: I believe it is a common scenario that the 
YARN cluster runs behind a firewall, while people want to run the Spark driver 
locally for the best interactive experience. Running locally gives you full 
control of the local resources the client can access (for example IPython 
Notebook, or fancier IDEs), as opposed to being limited to the spark-shell 
when you take the conventional route of SSHing to a remote host inside the 
firewall. Installing anything you want on the remote host is usually not an 
option. A potential solution is to set up a SOCKS proxy on your local machine 
outside the firewall through SSH tunneling (ssh -D <local-proxy-port> 
<user>@<remote-host>) into some workstation inside the firewall. The Spark 
yarn-client would then only need to talk to the cluster through this proxy, 
without the need to change any configuration. Does this sound feasible?)

> yarn-client through socks proxy
> -------------------------------
>
>                 Key: SPARK-3512
>                 URL: https://issues.apache.org/jira/browse/SPARK-3512
>             Project: Spark
>          Issue Type: Wish
>          Components: YARN
>            Reporter: Yongjia Wang
>
> I believe it is a common scenario that the YARN cluster runs behind a 
> firewall, while people want to run the Spark driver locally for the best 
> interactive experience. Running locally gives you full control of the local 
> resources the client can access (for example IPython Notebook, or fancier 
> IDEs), as opposed to being limited to the spark-shell when you take the 
> conventional route of SSHing to a remote host inside the firewall. 
> Installing anything you want on the remote host is usually not an option. A 
> potential solution is to set up a SOCKS proxy on your local machine outside 
> the firewall through SSH tunneling (ssh -D <local-proxy-port> 
> <user>@<remote-host>) into some workstation inside the firewall. The Spark 
> yarn-client would then only need to talk to the cluster through this proxy, 
> without the need to change any configuration. Does this sound feasible? 
> Maybe a VPN is the right solution?
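
The tunnel-plus-proxy idea in the description could be sketched roughly as 
follows. This is a minimal illustration, not a tested recipe: the helper names, 
the user "alice", the host "gateway.example.com" and port 1080 are all 
hypothetical. It assumes the Hadoop client properties 
hadoop.rpc.socket.factory.class.default and hadoop.socks.server, passed through 
Spark's spark.hadoop.* prefix, are enough to route the client's Hadoop RPC 
calls through the proxy. Whether that suffices for yarn-client mode is exactly 
the open question here, since executors would still need to reach the local 
driver.

```python
import shlex


def ssh_socks_command(user, remote_host, local_proxy_port=1080):
    """Build the `ssh -D` command from the description (dynamic SOCKS forwarding)."""
    return ["ssh", "-D", str(local_proxy_port), f"{user}@{remote_host}"]


def socks_spark_properties(local_proxy_port=1080, proxy_host="localhost"):
    """Client-side properties pointing Hadoop RPC at the local SOCKS proxy.

    Assumption: Spark copies spark.hadoop.* entries into the Hadoop
    Configuration, and Hadoop's SocksSocketFactory then handles RPC sockets.
    """
    return {
        "spark.hadoop.hadoop.rpc.socket.factory.class.default":
            "org.apache.hadoop.net.SocksSocketFactory",
        "spark.hadoop.hadoop.socks.server": f"{proxy_host}:{local_proxy_port}",
    }


def spark_submit_conf_args(properties):
    """Turn a property dict into repeated `--conf key=value` arguments."""
    args = []
    for key, value in sorted(properties.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Hypothetical user and gateway host, for illustration only.
    print(shlex.join(ssh_socks_command("alice", "gateway.example.com")))
    print(shlex.join(spark_submit_conf_args(socks_spark_properties())))
```

The script only prints the command line and the --conf arguments; it does not 
open the tunnel or launch Spark, so it is safe to run anywhere.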



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
