Spark 0.8.1 has been released now; do you mean we can share cached RDDs with 
this version?
  

 

 ------------------ Original ------------------
  From:  "Sriram Ramachandrasekaran"<[email protected]>;
 Date:  Jan 2, 2014
 To:  "user"<[email protected]>; 
 
 Subject:  Re: Reply: Reply: Any best practice for hardware configuration 
for the master server in standalone cluster mode?

 

 Yes, the driver runs on the machine from which you launch your Spark job. 
As for sharing cached RDDs, I don't think it's possible up to and including 
0.8.1: RDDs are not available across SparkContexts, if my understanding is right.

 If you still want to share RDDs, you might have to write a single service 
that maintains the cached RDD, and have the various other apps that want to 
access that RDD talk to that service. If I understand right, Shark handles 
SQL queries this way; a rough sketch of the pattern is below.
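
 A hedged sketch of that pattern in Scala (the service object, port 9999, 
and HDFS path are all made up for illustration, not from this thread): one 
long-lived app owns the SparkContext and the cached RDD, and other apps send 
it requests over a socket instead of creating their own contexts.

    import org.apache.spark.SparkContext

    // Hypothetical single service that owns the only SparkContext and
    // answers queries against an RDD it keeps cached.
    object CachedRddService {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("spark://master:7077", "CachedRddService")

        // Build and cache the shared RDD once; count() forces it to be
        // materialized in the executors' caches.
        val shared = sc.textFile("hdfs:///data/events").cache()
        shared.count()

        // Serve requests on a plain socket: each request is a keyword,
        // answered by running a job over the already-cached RDD.
        val server = new java.net.ServerSocket(9999)
        while (true) {
          val socket = server.accept()
          val in = new java.io.BufferedReader(
            new java.io.InputStreamReader(socket.getInputStream))
          val out = new java.io.PrintWriter(socket.getOutputStream, true)
          val keyword = in.readLine()
          out.println(shared.filter(_.contains(keyword)).count())
          socket.close()
        }
      }
    }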

 

 On Tue, Dec 31, 2013 at 7:46 PM, guxiaobo1982 <[email protected]> wrote:
  We have different developers sharing a Spark cluster, and we don't let 
developers touch the master server. Each developer will submit their 
application from their own desktop; does each driver then run on their desktop?
 
 By the way, can developers share cached RDDs?
 
 
  

 

 ------------------ Original ------------------
  Sender: "Mayur Rustagi"<[email protected]>;
 Send time: Tuesday, Dec 31, 2013 10:11 PM
 To: "user"<[email protected]>; 
 
 Subject:  Re: Reply: Any best practice for hardware configuration for 
the master server in standalone cluster mode?

 

 The driver is the process that manages execution across the cluster. Say 
your application is a SQL query: the system spawns a Shark CLI driver that 
uses the Spark framework, HDFS, etc. to execute the query and deliver the 
result. All of this happens automatically, so as a user of the Spark/Shark 
frameworks you don't need to worry about it. Just go for a bigger machine 
for the master.
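
 To make the driver's role concrete, here is a minimal standalone-cluster 
app (the hostname and HDFS path are placeholders): the JVM that runs main() 
is the driver; the workers execute the tasks and the result comes back to it.

    import org.apache.spark.SparkContext

    object ErrorCount {
      def main(args: Array[String]): Unit = {
        // This process, on whichever machine launches it, is the driver.
        val sc = new SparkContext("spark://master:7077", "ErrorCount")
        val lines = sc.textFile("hdfs:///logs/access.log")
        // filter() runs distributed on the workers; count() brings the
        // per-partition tallies back to this driver process.
        println("errors: " + lines.filter(_.contains("ERROR")).count())
        sc.stop()
      }
    }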


 
  Mayur Rustagi
Ph: +919632149971  http://www.sigmoidanalytics.com
  https://twitter.com/mayur_rustagi
 






 On Tue, Dec 31, 2013 at 7:01 PM, guxiaobo1982 <[email protected]> wrote:
  Thanks for your reply. I am a new hand at Spark; does "driver" mean the 
server from which user applications are submitted?
 

  

 

 ------------------ Original ------------------
  Sender: "Mayur Rustagi"<[email protected]>;
 Send time: Tuesday, Dec 31, 2013 9:55 PM
 To: "user"<[email protected]>; 
 
 Subject:  Re: Any best practice for hardware configuration for the master 
server in standalone cluster mode?

 

 The master server needs to be a little beefy, as the driver runs on it. We 
ran into some scaling issues caused by the master servers. If you offload 
the drivers to workers or other machines, the master server can be smaller.

Regards,
Mayur

 
  Mayur Rustagi
Ph: +919632149971  http://www.sigmoidanalytics.com
  https://twitter.com/mayur_rustagi
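
 To illustrate the offloading point (a sketch; the hostname is a 
placeholder): nothing ties the driver to the master machine. The driver 
lives in whichever JVM launches the application, so launching from a worker 
or a separate client machine keeps the load off the master.

    import org.apache.spark.SparkContext

    object OffloadedDriver {
      def main(args: Array[String]): Unit = {
        // "spark://master:7077" points at the standalone master, but this
        // process does not need to run on that machine; wherever you
        // launch it, that machine hosts the driver.
        val sc = new SparkContext("spark://master:7077", "OffloadedDriver")
        println(sc.parallelize(1 to 1000000).reduce(_ + _))
        sc.stop()
      }
    }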
 






 On Tue, Dec 31, 2013 at 6:48 PM, guxiaobo1982 <[email protected]> wrote:
  Hi,
 

 I read the following article regarding hardware configuration for the 
worker servers in standalone cluster mode, but what about the master server?
 

 http://spark.incubator.apache.org/docs/latest/hardware-provisioning.html
 

 

 Regards,
 

 Xiaobo Gu
 








 




 

-- 
It's just about how deep your longing is!
