Github user mattf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2444#discussion_r18027788
--- Diff: docs/spark-standalone.md ---
@@ -62,7 +62,12 @@ Finally, the following configuration options can be passed to the master and wor
# Cluster Launch Scripts
-To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory, which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. The master machine must be able to access each of the slave machines via password-less `ssh` (using a private key). For testing, you can just put `localhost` in this file.
+To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory,
+which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. If `conf/slaves`
+does not exist, the launch scripts use a list which contains single hostname `localhost`. This can be used for testing.
+The master machine must be able to access each of the slave machines via `ssh`. By default, `ssh` is executed in the background for parallel execution for each slave machine.
+If you would like to use password authentication instead of password-less(using a private key) for `ssh`, `ssh` does not work well in the background.
+To avoid this, you can set a environment variable `SPARK_SSH_FOREGROUND` to something like `yes` or `y` to execute `ssh` in the foreground.
--- End diff ---
what about -
To launch a Spark standalone cluster with the launch scripts, you should
create a file called `conf/slaves` in your Spark directory, which must contain
the hostnames of all the machines where you intend to start Spark workers, one
per line. If `conf/slaves` does not exist, the launch scripts default to a
single machine (`localhost`), which is useful for testing. Note that the master
machine accesses each of the worker machines via `ssh`. By default, `ssh` is
run in parallel and requires password-less (using a private key) access to be
set up. If you do not have a password-less setup, you can set the environment
variable `SPARK_SSH_FOREGROUND` and serially provide a password for each worker.
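
As a concrete illustration of the behavior described above, a minimal sketch (the worker hostnames are hypothetical, and the launch commands are shown as comments since they require a Spark installation and reachable workers):

```shell
# Create conf/slaves with one worker hostname per line
# (hostnames here are illustrative placeholders).
mkdir -p conf
printf 'worker1.example.com\nworker2.example.com\n' > conf/slaves

# Default behavior: workers are started over password-less ssh, in parallel:
#   ./sbin/start-all.sh
#
# With password authentication, run ssh serially in the foreground instead,
# typing a password for each worker in turn:
#   SPARK_SSH_FOREGROUND=yes ./sbin/start-all.sh
```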