Thanks a lot Mark and Christopher for your prompt replies and clarification.
Regards,

Kapil Malik | [email protected]

From: Christopher Nguyen [mailto:[email protected]]
Sent: 25 January 2014 22:34
To: [email protected]
Subject: RE: Can I share the RDD between multiprocess

Kapil, that's right, your #2 is the pattern I was referring to. Of course it could be Tomcat or something even lighter weight as long as you define some suitable client/server protocol.

Sent while mobile. Pls excuse typos etc.

On Jan 25, 2014 6:03 AM, "Kapil Malik" <[email protected]> wrote:

Hi Christopher,

“make a "server" out of that JVM, and serve up (via HTTP/THRIFT, etc.) some kind of reference to those RDDs to multiple clients of that server”

Can you kindly hint at any starting points regarding your suggestion?
In my understanding, SparkContext constructor creates an Akka actor system and starts a jetty UI server. So can we somehow use / tweak the same to serve to multiple clients? Or can we simply construct a spark context inside a Java server (like Tomcat)?

Regards,

Kapil Malik | [email protected] | 33430 / 8800836581

From: Christopher Nguyen [mailto:[email protected]]
Sent: 25 January 2014 12:00
To: [email protected]
Subject: Re: Can I share the RDD between multiprocess

D.Y., it depends on what you mean by "multiprocess".

RDD lifecycles are currently limited to a single SparkContext. So to "share" RDDs you need to somehow access the same SparkContext.

This means one way to share RDDs is to make sure your accessors are in the same JVM that started the SparkContext.

Another is to make a "server" out of that JVM, and serve up (via HTTP/THRIFT, etc.) some kind of reference to those RDDs to multiple clients of that server, even though there is only one SparkContext (held by the server).

We have built a server product using this pattern so I know it can work well.

--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen

On Fri, Jan 24, 2014 at 6:06 PM, D.Y Feng <[email protected]> wrote:

How can I share the RDD between multiprocess?

--
DY.Feng(叶毅锋)
yyfeng88625@twitter
Department of Applied Mathematics
Guangzhou University, China
[email protected]
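
For anyone looking for a starting point on the "server" pattern Christopher describes, here is a rough sketch in Python of what it might look like: one process owns the datasets and clients only pass names over HTTP. It is not the Adatao product or any Spark API. The names RddRegistry, FakeRDD, start_server, fetch and the /rdd/... routes are all made up for illustration. FakeRDD is a stand-in so the sketch runs without Spark; it only mimics count() and take(n), which are real RDD method names, so in principle a real RDD created from the single SparkContext could be registered instead.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeRDD:
    """Hypothetical stand-in for an RDD so the sketch runs without Spark."""

    def __init__(self, items):
        self._items = list(items)

    def count(self):
        return len(self._items)

    def take(self, n):
        return self._items[:n]


class RddRegistry:
    """Lives in the one process that holds the SparkContext.

    Clients never receive the RDD itself, only its name.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._rdds = {}

    def register(self, name, rdd):
        with self._lock:
            self._rdds[name] = rdd

    def get(self, name):
        with self._lock:
            return self._rdds[name]  # raises KeyError for unknown names


def make_handler(registry):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Assumed routes: /rdd/<name>/count and /rdd/<name>/take/<n>
            parts = self.path.strip("/").split("/")
            try:
                if len(parts) == 3 and parts[0] == "rdd" and parts[2] == "count":
                    body = {"name": parts[1],
                            "count": registry.get(parts[1]).count()}
                elif len(parts) == 4 and parts[0] == "rdd" and parts[2] == "take":
                    body = {"name": parts[1],
                            "rows": registry.get(parts[1]).take(int(parts[3]))}
                else:
                    self.send_error(404, "unknown route")
                    return
            except KeyError:
                self.send_error(404, "unknown RDD name")
                return
            except ValueError:
                self.send_error(400, "n must be an integer")
                return
            payload = json.dumps(body).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return Handler


def start_server(registry, port=0):
    """Start the HTTP front end on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), make_handler(registry))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, path):
    """Minimal client: any process that can reach the server can call this."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read().decode("utf-8")) if resp.status == 200 else None
    finally:
        conn.close()
```

To try it, register a dataset under a name, call start_server(registry), read the chosen port from server.server_address[1], and have each client call fetch(port, "/rdd/<name>/count"). The point of the design is that the client/server protocol carries only names and small results, while the data stays with the single owner process, which is the constraint described in the thread.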
