[
https://issues.apache.org/jira/browse/SPARK-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yongjia Wang updated SPARK-3512:
--------------------------------
Description: I believe it is a common scenario that the YARN cluster runs
behind a firewall while people want to run the Spark driver locally for the
best interactive experience. Running locally gives you full control of the
local resources the client can access (for example, IPython Notebook or a
fancier IDE), as opposed to being limited to the spark-shell when you take the
conventional route of ssh-ing to a remote host inside the firewall. Installing
anything you want on the remote host is usually not an option.
A potential solution is to set up a SOCKS proxy on your local machine outside
the firewall through ssh tunneling (ssh -D <local-proxy-port>
<user>@<remote-host>) into some workstation inside the firewall. The Spark
yarn-client would then only need to talk to the cluster through this proxy,
without changing any configuration. Does this sound feasible? Maybe a VPN is
the right solution?
> yarn-client through socks proxy
> -------------------------------
>
> Key: SPARK-3512
> URL: https://issues.apache.org/jira/browse/SPARK-3512
> Project: Spark
> Issue Type: Wish
> Components: YARN
> Reporter: Yongjia Wang
>
> I believe it is a common scenario that the YARN cluster runs behind a
> firewall while people want to run the Spark driver locally for the best
> interactive experience. Running locally gives you full control of the local
> resources the client can access (for example, IPython Notebook or a fancier
> IDE), as opposed to being limited to the spark-shell when you take the
> conventional route of ssh-ing to a remote host inside the firewall.
> Installing anything you want on the remote host is usually not an option. A
> potential solution is to set up a SOCKS proxy on your local machine outside
> the firewall through ssh tunneling (ssh -D <local-proxy-port>
> <user>@<remote-host>) into some workstation inside the firewall. The Spark
> yarn-client would then only need to talk to the cluster through this proxy,
> without changing any configuration. Does this sound feasible?
> Maybe a VPN is the right solution?
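
The tunnel step described above could be scripted roughly as follows. This is
only a sketch of the ssh side: it builds the "ssh -D" command line and checks
that something is listening on the local proxy port. Whether the Spark
yarn-client can actually be routed through such a proxy is the open question
of this issue and is not addressed here. The helper names, port 1080, user
"alice" and host "gateway.example.com" are placeholders, not anything taken
from Spark or from the reporter's setup.

```python
import shlex
import socket


def build_tunnel_command(local_port, user, remote_host):
    """Return the argv for an ssh dynamic (SOCKS) port forward.

    Hypothetical helper: mirrors `ssh -D <local-proxy-port> <user>@<remote-host>`.
    """
    if not 1 <= local_port <= 65535:
        raise ValueError(f"invalid port: {local_port}")
    # -D opens a local SOCKS listener; -N skips running a remote command.
    return ["ssh", "-N", "-D", str(local_port), f"{user}@{remote_host}"]


def proxy_is_listening(local_port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on the proxy port."""
    try:
        with socket.create_connection((host, local_port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder values; substitute your own gateway host and user.
    cmd = build_tunnel_command(1080, "alice", "gateway.example.com")
    print("Run in another terminal:", shlex.join(cmd))
    print("Proxy listening:", proxy_is_listening(1080))
```

The check only confirms that the local port accepts connections; it does not
verify that the listener speaks SOCKS or that the cluster is reachable
through it.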
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)