Correct, what I do to start workers is the equivalent of start-slaves.sh.
It ends up running the same command on the worker servers as start-slaves
does.
It definitely uses all workers, and workers starting later pick up work
as well. If you have a long running job, you can add workers dynamically.
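For reference, the per-host command that start-slaves.sh ends up running is
essentially the one below (the master host and port are placeholders here;
in newer Spark releases the script is called start-worker.sh):

$SPARK_HOME/sbin/start-slave.sh spark://my.local.server:someport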
OK, this is basically from my notes for Spark standalone. The worker process
is the slave process.
You start worker as you showed
$SPARK_HOME/sbin/start-slaves.sh
Now that picks up the worker host names from the $SPARK_HOME/conf/slaves
file. So you still have to tell Spark...
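For example, conf/slaves is just a list of worker host names, one per line
(these names are placeholders):

# $SPARK_HOME/conf/slaves
worker-node-01
worker-node-02
worker-node-03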
On Thu, May 19, 2016 at 6:06 PM, Mathieu Longtin wrote:
> I'm looking to bypass the master entirely. I manage the workers outside of
> Spark. So I want to start the driver, then start workers that connect
> directly to the driver.
It should be possible to do that if you extend the interface I
mentioned...
I'm looking to bypass the master entirely. I manage the workers outside of
Spark. So I want to start the driver, then start workers that connect
directly to the driver.
Anyway, it looks like I will have to live with our current solution for a
while.
On Thu, May 19, 2016 at 8:32 PM Marcelo Vanzin wrote:
Okay:
host=my.local.server
port=someport
This is the spark-submit command, which runs on my local server:
$SPARK_HOME/bin/spark-submit --master spark://$host:$port \
--executor-memory 4g python-script.py with args
If I want 200 worker cores, I tell the cluster scheduler to run this
command on...
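Conceptually, that dispatch step could look something like the sketch below;
the qsub array-job flags are standard Grid Engine syntax, but the worker
command, core count, and master URL are placeholders rather than the actual
wrapper:

# ask the scheduler for 200 single-core slots, each running a worker
# process that registers with the master at spark://$host:$port
qsub -t 1-200 -b y \
$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker \
--cores 1 spark://$host:$port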
Hi Mathieu,
There's nothing like that in Spark currently. For that, you'd need a
new cluster manager implementation that knows how to start executors
in those remote machines (e.g. by running ssh or something).
In the current master there's an interface you can implement to try
that if you really
In normal operation we tell Spark which nodes the worker processes can run
on by adding the node names to conf/slaves.
It is not very clear to me; in your case, do all the jobs run locally with,
say, 100 executor cores like below:
${SPARK_HOME}/bin/spark-submit \
--master local[*] \
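For instance, to pin local mode at exactly 100 cores instead of whatever
local[*] detects, something along these lines would do (the script name is
a placeholder):

${SPARK_HOME}/bin/spark-submit \
--master local[100] \
--driver-memory 4g \
your-script.py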
Mostly, the resource management is not up to the Spark master.
We routinely start 100 executor cores for a 5-minute job, and they just quit
when they are done. Then those processor cores can do something else
entirely; they are not reserved for Spark at all.
On Thu, May 19, 2016 at 4:55 PM Mich Talebzadeh wrote:
Then in theory every user can fire multiple spark-submit jobs. Do you cap
it with settings in $SPARK_HOME/conf/spark-defaults.conf? But I guess in
reality every user submits one job only.
This is an interesting model for two reasons:
- It uses parallel processing across all the nodes or most of them...
Driver memory is the default. Executor memory depends on the job; the
caller decides how much memory to use. We don't specify --num-executors as
we want all cores assigned to the local master, since they were started by
the current user. No local executor. --master=spark://localhost:someport.
1 core per executor.
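Put together, that description suggests a submit line roughly like the one
below (a sketch only; the memory value and the spark.executor.cores setting
are assumptions based on the description, not the actual command):

$SPARK_HOME/bin/spark-submit \
--master spark://localhost:someport \
--executor-memory 4g \
--conf spark.executor.cores=1 \
python-script.py with args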
Thanks Mathieu
So it would be interesting to see what resources are allocated in your case,
especially the num-executors and executor-cores. I gather every node has
enough memory and cores.
${SPARK_HOME}/bin/spark-submit \
--master local[2] \
--driver-memory 4g \
The driver (the process started by spark-submit) runs locally. The
executors run on any of thousands of servers. So far, I haven't tried more
than 500 executors.
Right now, I run a master on the same server as the driver.
On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh wrote:
> ok so you are using...
ok so you are using some form of NFS-mounted file system shared among the
nodes, and basically you start the processes through spark-submit.
In stand-alone mode, a simple cluster manager included with Spark does the
management of resources, so it is not clear to me what you are referring
to as workers...
No master and no node manager, just the processes that do actual work.
We use the "stand alone" version because we have a shared file system and a
way of allocating computing resources already (Univa Grid Engine). If an
executor were to die, we have other ways of restarting it; we don't need
the worker...
Hi Mathieu
What does this approach provide that the norm lacks?
So basically each node has its own master in this model.
Are these supposed to be individual stand-alone servers?
Thanks
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV
First a bit of context:
We use Spark on a platform where each user starts workers as needed. This
has the advantage that all permission management is handled by the OS, so
the users can only read files they have permission to.
To do this, we have some utility that does the following:
- start a master...
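The list is cut off here, but pieced together from the rest of the thread,
such a utility presumably looks roughly like the sketch below (placeholder
names throughout; the Grid Engine step in particular is an assumption):

#!/bin/bash
# sketch only: start a per-user master, ask the site scheduler for workers,
# then run the job against that private master
host=$(hostname)
port=7077          # placeholder
num_workers=200    # placeholder

# start a master locally, owned by the current user
$SPARK_HOME/sbin/start-master.sh --host $host --port $port

# request worker slots from the scheduler (e.g. Grid Engine), each running
# a worker process that registers with this master
qsub -t 1-$num_workers -b y \
$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker \
--cores 1 spark://$host:$port

# submit the job itself
$SPARK_HOME/bin/spark-submit --master spark://$host:$port \
--executor-memory 4g python-script.py with args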