Hi, could you please share more details: the environment variable values you
set when you run the jobs, the Spark version, etc.?
By the way, you should take a look at SPARK_WORKER_INSTANCES and
SPARK_WORKER_CORES if you are using Spark 2.0.0.
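For reference, a minimal sketch of the worker settings I mean, in
conf/spark-env.sh on each worker machine (the values are examples based on
the hardware you describe, not a recommendation):

```
# conf/spark-env.sh on each worker machine (example values only)
SPARK_WORKER_INSTANCES=1   # worker processes to launch per machine
SPARK_WORKER_CORES=40      # cores each worker offers to applications
SPARK_WORKER_MEMORY=100g   # memory each worker offers to applications
```

After changing these you need to restart the workers for the new resource
offers to show up in the master UI.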
2018-04-11 4:10 GMT-05:00 宋源栋 <yuandong.s...@greatopensource.com>:
> Hi all,
> I have a standalone-mode Spark cluster without HDFS, made up of 10 machines,
> each with 40 CPU cores and 128 GB RAM.
> My application is a Spark SQL application that reads data from the database
> "tpch_100g" in MySQL and runs TPC-H queries. When loading tables from MySQL
> into Spark, I split the biggest table, "lineitem", into 600 partitions.
> When my application runs, there are only 40 executors (spark.executor.memory
> = 1g, spark.executor.cores = 1) on the executors page of the Spark
> application web UI, and all of them are on the same machine. It is too slow
> because all tasks run in parallel on only one machine.
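For context, a partitioned JDBC read like the one described above might look
as follows. This is only a sketch: the JDBC URL, the partition column
`l_orderkey`, and the bounds are assumptions I am making from the description,
not taken from the original application.

```scala
// Hypothetical sketch of a 600-partition JDBC read of "lineitem".
// URL, column name, and bounds are assumed; adjust to the real schema.
val lineitem = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://mysql-host:3306/tpch_100g")
  .option("dbtable", "lineitem")
  .option("partitionColumn", "l_orderkey") // must be numeric, date, or timestamp
  .option("lowerBound", "1")
  .option("upperBound", "600000000")
  .option("numPartitions", "600")
  .load()
```

Note that partitioning the read this way controls how many tasks the scan
produces, but where those tasks run still depends on how many executors the
cluster gives the application, which is why the worker settings above matter.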