One can take a look into the Tachyon project to share RDDs across various Spark contexts.

On Jan 1, 2014 10:55 PM, "jasonliu" <[email protected]> wrote:
> Actually, we can't share them even in 0.8.1.
>
> From: guxiaobo1982 [mailto:[email protected]]
> Sent: January 2, 2014 12:51
> To: user
> Subject: Re: Reply: Reply: Any best practice for hardware configuration for the master server in standalone cluster mode?
>
> 0.8.1 of Spark is released now; do you mean we can share cached RDDs using this version now?
>
> ------------------ Original ------------------
> From: "Sriram Ramachandrasekaran" <[email protected]>
> Date: Jan 2, 2014
> To: "user" <[email protected]>
> Subject: Re: Reply: Reply: Any best practice for hardware configuration for the master server in standalone cluster mode?
>
> Yes, the driver would run on the machine from which you launch your Spark job. As for sharing cached RDDs, I don't think it's possible up until 0.8.1. The RDDs are not available across Spark contexts, if my understanding is right.
>
> If you still want to share RDDs, then you might have to write a single service that maintains the cached RDD, and the various other apps that want to access that RDD talk to that service. If I understand right, Shark handles SQL queries like this.
>
> On Tue, Dec 31, 2013 at 7:46 PM, guxiaobo1982 <[email protected]> wrote:
>
> We have different developers sharing a Spark cluster, and we don't let developers touch the master server. Each of the developers will submit their application from their desktop; does each driver then run on their desktop?
>
> By the way, can developers share cached RDDs?
>
> ------------------ Original ------------------
> Sender: "Mayur Rustagi" <[email protected]>
> Send time: Tuesday, Dec 31, 2013 10:11 PM
> To: "user" <[email protected]>
> Subject: Re: Reply: Any best practice for hardware configuration for the master server in standalone cluster mode?
>
> The driver is the process that manages the execution across the cluster.
> So say your application is a SQL query; then the system spawns a shark-cli driver that uses the Spark framework, HDFS, etc. to execute the query and deliver the result. All this happens automatically, so you don't need to worry about it as a user of the Spark/Shark framework. Just go for a bigger machine for the master.
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi
>
> On Tue, Dec 31, 2013 at 7:01 PM, guxiaobo1982 <[email protected]> wrote:
>
> Thanks for your reply. I am new to Spark; does "driver" mean the server where user applications are submitted?
>
> ------------------ Original ------------------
> Sender: "Mayur Rustagi" <[email protected]>
> Send time: Tuesday, Dec 31, 2013 9:55 PM
> To: "user" <[email protected]>
> Subject: Re: Any best practice for hardware configuration for the master server in standalone cluster mode?
>
> The master server needs to be a little beefy, as the driver runs on it. We ran into some issues around scaling due to master servers. You can offload the drivers to workers or other machines; then the master server can be smaller.
>
> Regards,
> Mayur
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi
>
> On Tue, Dec 31, 2013 at 6:48 PM, guxiaobo1982 <[email protected]> wrote:
>
> Hi,
>
> I read the following article regarding hardware configurations for the worker servers in standalone cluster mode, but what about the master server?
>
> http://spark.incubator.apache.org/docs/latest/hardware-provisioning.html
>
> Regards,
> Xiaobo Gu
>
> --
> It's just about how deep your longing is!
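[Editor's note] The pattern suggested in the thread (one long-running service owns the cached data, and the other applications query that service instead of each building their own RDD) can be sketched without Spark itself. The sketch below is a minimal in-process stand-in: `SharedCacheService` and `expensive_compute` are hypothetical names standing in for a real Spark job, and a real deployment would expose the service over RPC or HTTP rather than direct method calls.

```python
import threading

class SharedCacheService:
    """Single owner of an expensive result; clients share it instead of each
    recomputing it (the role a cached RDD plays inside one SparkContext)."""

    def __init__(self, compute):
        self._compute = compute          # stand-in for a Spark job materializing an RDD
        self._lock = threading.Lock()
        self._cache = None
        self.compute_calls = 0           # how many times the expensive job actually ran

    def query(self):
        # First caller pays the cost; every later caller reuses the cached result.
        with self._lock:
            if self._cache is None:
                self.compute_calls += 1
                self._cache = self._compute()
            return self._cache

# Hypothetical stand-in for the expensive computation behind the RDD.
def expensive_compute():
    return [x * x for x in range(5)]

service = SharedCacheService(expensive_compute)
app_a = service.query()   # "application A" asks the service for the data
app_b = service.query()   # "application B" reuses the same cached data
print(app_a, service.compute_calls)
```

The lock makes the compute-once guarantee hold even when several client applications call in concurrently, which is the property the thread is after: the data is built once and shared, rather than once per Spark context.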
