Will Job server work here? http://engineering.ooyala.com/blog/open-sourcing-our-spark-job-server
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi


On Sat, Jan 25, 2014 at 10:46 PM, Kapil Malik <[email protected]> wrote:

> Thanks a lot Mark and Christopher for your prompt replies and
> clarification.
>
> Regards,
>
> Kapil Malik | [email protected]
>
> *From:* Christopher Nguyen [mailto:[email protected]]
> *Sent:* 25 January 2014 22:34
> *To:* [email protected]
> *Subject:* RE: Can I share the RDD between multiprocess
>
> Kapil, that's right, your #2 is the pattern I was referring to. Of course
> it could be Tomcat or something even lighter weight, as long as you define
> some suitable client/server protocol.
>
> Sent while mobile. Pls excuse typos etc.
>
> On Jan 25, 2014 6:03 AM, "Kapil Malik" <[email protected]> wrote:
>
> Hi Christopher,
>
> "make a 'server' out of that JVM, and serve up (via HTTP/THRIFT, etc.)
> some kind of reference to those RDDs to multiple clients of that server"
>
> Can you kindly hint at any starting points regarding your suggestion?
>
> In my understanding, the SparkContext constructor creates an Akka actor
> system and starts a Jetty UI server. So can we somehow use / tweak the
> same to serve multiple clients? Or can we simply construct a SparkContext
> inside a Java server (like Tomcat)?
>
> Regards,
>
> Kapil Malik | [email protected] | 33430 / 8800836581
>
> *From:* Christopher Nguyen [mailto:[email protected]]
> *Sent:* 25 January 2014 12:00
> *To:* [email protected]
> *Subject:* Re: Can I share the RDD between multiprocess
>
> D.Y., it depends on what you mean by "multiprocess".
>
> RDD lifecycles are currently limited to a single SparkContext. So to
> "share" RDDs you need to somehow access the same SparkContext.
>
> This means one way to share RDDs is to make sure your accessors are in
> the same JVM that started the SparkContext.
>
> Another is to make a "server" out of that JVM, and serve up (via
> HTTP/THRIFT, etc.) some kind of reference to those RDDs to multiple
> clients of that server, even though there is only one SparkContext (held
> by the server). We have built a server product using this pattern, so I
> know it can work well.
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>
> linkedin.com/in/ctnguyen
>
>
> On Fri, Jan 24, 2014 at 6:06 PM, D.Y Feng <[email protected]> wrote:
>
> How can I share the RDD between multiprocess?
>
> --
>
> DY.Feng(叶毅锋)
> yyfeng88625@twitter
> Department of Applied Mathematics
> Guangzhou University, China
> [email protected]
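
For reference, a minimal sketch of the pattern Christopher describes above: one long-running JVM owns the single SparkContext, caches an RDD once, and serves lookups to many clients over HTTP. This is not the Ooyala job server or Adatao's product; the port, the /count endpoint, the input path, and the word-count RDD are all made up for illustration.

  // One JVM holds the only SparkContext; clients share its cached RDDs over HTTP.
  import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
  import java.net.InetSocketAddress
  import org.apache.spark.SparkContext
  import org.apache.spark.SparkContext._   // pair-RDD functions (reduceByKey, lookup)

  object RddServer {
    def main(args: Array[String]): Unit = {
      // Single SparkContext for the whole process.
      val sc = new SparkContext("local[4]", "rdd-server")

      // Build and cache an RDD once; it lives as long as this JVM does.
      // The path is hypothetical.
      val counts = sc.textFile("hdfs:///data/events.log")
                     .flatMap(_.split("\\s+"))
                     .map(w => (w, 1))
                     .reduceByKey(_ + _)
                     .cache()

      // Lightweight HTTP front end using the JDK's built-in server;
      // Tomcat or Thrift would play the same role.
      val http = HttpServer.create(new InetSocketAddress(8090), 0)
      http.createContext("/count", new HttpHandler {
        def handle(ex: HttpExchange): Unit = {
          // e.g. GET /count?spark  -> "spark<TAB>42"
          val word = Option(ex.getRequestURI.getQuery).getOrElse("")
          val n = counts.lookup(word).headOption.getOrElse(0)
          val body = s"$word\t$n\n".getBytes("UTF-8")
          ex.sendResponseHeaders(200, body.length)
          ex.getResponseBody.write(body)
          ex.close()
        }
      })
      http.start()
    }
  }

Every client request reuses the RDD already cached in the server's SparkContext instead of spinning up a new context; the Ooyala job server linked above generalizes roughly this idea with named RDDs and a REST API for submitting jobs.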
