Hi,

I am new to Spark and just going through the different features and
integration projects, so this could be a very naive question.

I have a requirement where I want to access data stored in another
application. It would be nice if I could share the Spark worker node
inside the same JVM. One of the docs pages (
https://spark.apache.org/docs/latest/job-scheduling.html) mentions that
this is not possible and lists different alternatives.

*Note that none of the modes currently provide memory sharing across
applications. If you would like to share data this way, we recommend
running a single server application that can serve multiple requests by
querying the same RDDs. For example, the Shark
<http://shark.cs.berkeley.edu> JDBC server works this way for SQL queries.
In future releases, in-memory storage systems such as Tachyon
<http://tachyon-project.org> will provide another approach to share RDDs.*
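
If I understand the recommendation correctly, the "single server
application" pattern would look roughly like the sketch below (the paths
and names are placeholders I made up, not an actual implementation):

    import org.apache.spark.{SparkConf, SparkContext}

    object SharedRddServer {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("shared-rdd-server"))

        // Load and cache the shared data set once; the path is a placeholder.
        val shared = sc.textFile("hdfs:///data/other-app/records")
          .map(line => (line.split(",")(0), line))
          .cache()

        // Each incoming request runs a job against the same cached RDD,
        // instead of creating a new SparkContext per request.
        def handleRequest(key: String): Seq[String] =
          shared.filter(_._1 == key).map(_._2).collect().toSeq

        // ... wire handleRequest into whatever RPC / JDBC front end the server exposes
      }
    }

Is that the intended reading of the docs?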

So I have the following questions:

1. Can Spark re-use JVMs, i.e., keep long-lived worker JVMs with cached data
around to run Spark tasks originating from different SparkContexts?
2. Can I dictate RDD partitioning so that I can ensure data locality when an
RDD from Spark is joined with the local data? (A rough sketch of what I mean
follows this list.)
3. Can a worker node be embedded inside an existing JVM?
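
For question 2, this is roughly what I have in mind: a minimal sketch
where a custom Partitioner mirrors how the other application lays out its
data, so matching keys from both sides land in the same partition. The
partitioner, paths, and key extraction are placeholders, not real code
from either application:

    import org.apache.spark.{Partitioner, SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    // Made-up partitioner standing in for the other application's placement rule.
    class MyAppPartitioner(numParts: Int) extends Partitioner {
      override def numPartitions: Int = numParts
      override def getPartition(key: Any): Int =
        math.abs(key.hashCode % numParts)
    }

    object PartitioningSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("partitioning-sketch"))

        val sparkSide = sc.textFile("hdfs:///data/spark-side")   // placeholder path
          .map(line => (line.split(",")(0), line))
          .partitionBy(new MyAppPartitioner(16))                 // dictate the layout
          .cache()

        val otherSide = sc.textFile("hdfs:///data/other-app")    // placeholder path
          .map(line => (line.split(",")(0), line))
          .partitionBy(new MyAppPartitioner(16))

        // Because both RDDs use the same partitioner, the join does not need a full shuffle.
        val joined = sparkSide.join(otherSide)
        println(joined.count())
      }
    }

Is something along these lines possible when one side of the join lives
outside Spark?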

Thanks,
Regards,
Tushar
