Michael,
I don't know what your environment is, but if it's Cloudera, you should be able 
to see the link to your master in Hue.
Thanks 

    On Thursday, January 7, 2016 5:03 PM, Michael Pisula 
<michael.pis...@tngtech.com> wrote:
 

  I had tried several parameters, including --total-executor-cores, with no effect.
 As for the port, I tried 7077, but if I remember correctly I got an error 
suggesting to try 6066, with which it worked just fine (apart from the issue 
described here).
 
 Each worker has two cores. I also tried increasing the core count, again with 
no effect. I was able to increase the number of cores the job was using on one 
worker, but it would not use any other worker (and the job would not start if 
the number of cores it requested exceeded the number available on a single 
worker).
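 For what it's worth, this is roughly how the core-related flags are usually combined for a standalone cluster (same class and jar as in my submit command below, <host> placeholder unchanged; the core counts here are illustrative assumptions, not values confirmed in this thread):

```shell
# Sketch only: flag values are illustrative. 7077 is the legacy standalone
# submission port; 6066 is the REST submission endpoint.
spark/bin/spark-submit \
  --class demo.spark.StaticDataAnalysis \
  --master spark://<host>:7077 \
  --deploy-mode cluster \
  --total-executor-cores 6 \
  --executor-cores 2 \
  demo/Demo-1.0-SNAPSHOT-all.jar
```

 With three two-core workers, --total-executor-cores 6 combined with --executor-cores 2 should, in principle, place one two-core executor on each worker, though that of course depends on the scheduler actually accepting the request.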
 
 On 07.01.2016 22:51, Igor Berman wrote:
  
 Read about --total-executor-cores. Not sure why you specify port 6066 in the 
master URL; usually it's 7077.
 Verify in the master UI (usually on port 8080) how many cores are there (this 
depends on other configs, but usually workers connect to the master with all 
their cores).
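 A quick way to check the same thing from a shell, assuming the standalone master's web UI is reachable on its default port 8080 (the /json endpoint of the master UI returns worker and core counts; <master-host> is a placeholder):

```shell
# Query the standalone master's status endpoint; the JSON response lists
# each registered worker with its cores and coresused fields.
curl -s http://<master-host>:8080/json/ | python -m json.tool
```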
 On 7 January 2016 at 23:46, Michael Pisula <michael.pis...@tngtech.com> wrote:
 
  Hi,
 
 I start the cluster using the spark-ec2 scripts, so the cluster is in 
standalone mode.
 Here is how I submit my job:
 spark/bin/spark-submit --class demo.spark.StaticDataAnalysis --master 
spark://<host>:6066 --deploy-mode cluster demo/Demo-1.0-SNAPSHOT-all.jar
 
 Cheers,
 Michael  
 
 On 07.01.2016 22:41, Igor Berman wrote:
  
 Share how you submit your job, and what cluster you use (YARN, standalone).  
 On 7 January 2016 at 23:24, Michael Pisula <michael.pis...@tngtech.com> wrote:
 
Hi there,
 
 I ran a simple batch application on a Spark cluster on EC2. Despite having 3
 worker nodes, I could not get the application processed on more than one
 node, regardless of whether I submitted the application in cluster or client
 mode. I also tried manually increasing the number of partitions in the code,
 with no effect. I also pass the master into the application.
 I verified on the nodes themselves that only one node was active while the
 job was running.
 I pass enough data to make the job take 6 minutes to process.
 The job is simple enough, reading data from two S3 files, joining records on
 a shared field, filtering out some records and writing the result back to
 S3.
 
 I have tried all kinds of things, but could not make it work. I did find
 similar questions, but had already tried the solutions that worked in those
 cases. I would be really happy about any pointers.
 
 Cheers,
 Michael
 
 
 
 --
 View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-job-uses-only-one-Worker-tp25909.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.
 
---------------------------------------------------------------------
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org
 
 
  
  
 
    -- 
Michael Pisula * michael.pis...@tngtech.com * +49-174-3180084
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
  
  
  
 