Re: Spark is only using one worker machine when more are available

2018-04-12 Thread Gourav Sengupta
>> Dataset jdbcDF = ss.read().jdbc(this.url, dt, connProp);
>> jdbcDF.createOrReplaceTempView(tableInfo.tmp_table_name);
>> }
>> }
>>
>> // Then run a query and write the result set to mysql
>> …
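One thing worth flagging in the quoted read: the jdbc(url, table, props) overload pulls the whole table through a single JDBC connection into a single partition, which by itself can pin all downstream work to one core on one worker. A minimal sketch for checking this, assuming a SparkSession ss and connection properties connProp like those in the quote (the helper name is mine):

import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadParallelismCheck {
    // Hypothetical helper: report how many partitions a plain JDBC read yields.
    static void check(SparkSession ss, String url, String table, Properties connProp) {
        Dataset<Row> jdbcDF = ss.read().jdbc(url, table, connProp);
        // A value of 1 means every task over this data runs on a single core.
        System.out.println("partitions = " + jdbcDF.rdd().getNumPartitions());
    }
}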

Re: Spark is only using one worker machine when more are available

2018-04-12 Thread Jhon Anderson Cardenas Diaz
> connProp.put("rewriteBatchedStatements", "true");
> connProp.put("sessionVariables", "sql_log_bin=off");
> result.write().jdbc(this.dst_url, this.dst_table, connProp);
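For completeness, a runnable sketch of the write side as quoted, assuming an already-computed Dataset<Row> result; the credentials are placeholders, and note that sessionVariables=sql_log_bin=off only takes effect if the MySQL user is allowed to change that variable:

import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

public class MysqlBatchedWrite {
    static void write(Dataset<Row> result, String dstUrl, String dstTable) {
        Properties connProp = new Properties();
        connProp.put("user", "user");          // placeholder credentials
        connProp.put("password", "pass");      // placeholder credentials
        // MySQL Connector/J: rewrite row-by-row inserts into multi-row batches.
        connProp.put("rewriteBatchedStatements", "true");
        // Skip binlogging for this session to speed up the bulk load.
        connProp.put("sessionVariables", "sql_log_bin=off");
        result.write().mode(SaveMode.Append).jdbc(dstUrl, dstTable, connProp);
    }
}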

Re: Spark is only using one worker machine when more are available

2018-04-11 Thread 宋源栋
connProp.put("rewriteBatchedStatements", "true");
connProp.put("sessionVariables", "sql_log_bin=off");
result.write().jdbc(this.dst_url, this.dst_table, connProp);
--
From: Jhon Anderson Cardenas Diaz <jhond…

Re: Spark is only using one worker machine when more are available

2018-04-11 Thread Jhon Anderson Cardenas Diaz
Hi, could you please share more details: the environment variable values you set when you run the jobs, the Spark version, etc.? Btw, you should take a look at SPARK_WORKER_INSTANCES and SPARK_WORKER_CORES if you are using Spark 2.0.0.
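SPARK_WORKER_INSTANCES and SPARK_WORKER_CORES are worker-side settings that live in conf/spark-env.sh on each machine. The application-side counterparts can be set when the session is created; a minimal Java sketch of those knobs, with the master URL, app name, and core counts as illustrative assumptions:

import org.apache.spark.sql.SparkSession;

public class StandaloneCoresConfig {
    public static void main(String[] args) {
        SparkSession ss = SparkSession.builder()
                .appName("tpch-sql")                  // hypothetical app name
                .master("spark://master:7077")        // assumption: standalone master URL
                .config("spark.executor.cores", "40") // cores claimed per executor
                .config("spark.cores.max", "400")     // cap across the cluster (10 x 40)
                .getOrCreate();
        // Sanity check: with all ten workers enlisted this should be well above 40.
        System.out.println(ss.sparkContext().defaultParallelism());
    }
}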

Spark is only using one worker machine when more are available

2018-04-11 Thread 宋源栋
Hi all, I have a standalone-mode Spark cluster without HDFS, with 10 machines that each have 40 CPU cores and 128 GB RAM. My application is a Spark SQL application that reads data from the database "tpch_100g" in MySQL and runs TPC-H queries. When loading tables from MySQL into Spark, I split the
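The usual fix when a JDBC-backed job saturates only one machine is to make the read itself partitioned, so Spark opens numPartitions parallel connections and spreads the slices across executors. A minimal sketch using the partitioned jdbc() overload, with the URL, key bounds, and partition count as illustrative assumptions (l_orderkey is a numeric TPC-H column):

import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PartitionedTpchLoad {
    public static void main(String[] args) {
        SparkSession ss = SparkSession.builder().appName("tpch-load").getOrCreate();
        Properties connProp = new Properties();
        connProp.put("user", "user");      // placeholder credentials
        connProp.put("password", "pass");  // placeholder credentials
        // Spark issues 400 range-bounded queries over l_orderkey in parallel,
        // so both the load and later SQL over the view use all workers.
        Dataset<Row> lineitem = ss.read().jdbc(
                "jdbc:mysql://dbhost:3306/tpch_100g", // assumption: source URL
                "lineitem",
                "l_orderkey",                          // numeric partition column
                1L, 600_000_000L,                      // illustrative key bounds
                400,                                   // e.g. 10 machines x 40 cores
                connProp);
        lineitem.createOrReplaceTempView("lineitem");
        ss.sql("SELECT count(*) FROM lineitem").show();
    }
}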