Re: spark shared RDD

Calvin Jia Wed, 09 Dec 2015 14:46:13 -0800

Hi Ben,

Tachyon can be used to share data between Spark jobs. If you specify the
input to your jobs as a Tachyon path, reads are served from Tachyon's
memory-centric storage, which improves performance when the same dataset
is read multiple times. The examples on this page may be helpful:
http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html
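
As a rough illustration of the idea (a sketch, not taken verbatim from that
page), here is how a PySpark job might read its input from a Tachyon path.
It assumes a Tachyon master on localhost at port 19998 and that Spark is set
up to reach Tachyon as the page describes. The helper names and the dataset
path /data/events.txt are hypothetical.

```python
def tachyon_uri(path, host="localhost", port=19998):
    """Build a tachyon:// URI for a file path on the given Tachyon master.

    The host and port defaults are assumptions; adjust them to your cluster.
    """
    return "tachyon://{}:{}/{}".format(host, port, path.lstrip("/"))


def count_lines(sc, uri):
    """Read a text file through Spark from the given URI and count its lines."""
    return sc.textFile(uri).count()


def run_example():
    """Run the read against a live cluster (needs pyspark and Tachyon)."""
    # Imported here so the helpers above stay usable without pyspark.
    from pyspark import SparkContext

    sc = SparkContext(appName="tachyon-read-sketch")
    try:
        uri = tachyon_uri("/data/events.txt")  # hypothetical dataset path
        print(count_lines(sc, uri))
    finally:
        sc.stop()
```

Any other Spark job that passes the same URI to sc.textFile reads the same
file from Tachyon, so the dataset does not need to be cached separately
inside each job. Call run_example() only where pyspark and a Tachyon master
are actually available.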


Hope this helps,
Calvin

On Tue, Nov 10, 2015 at 2:24 AM, Ben <laurent...@gmail.com> wrote:

> Hi,
> After reading some documentation about Spark and Ignite,
> I am wondering if shared RDDs from Ignite can be used to share data in
> memory without any duplication between multiple Spark jobs.
> Running on Mesos I can collocate them, but will this be enough to avoid
> memory duplication or not?
> I am also confused by how Tachyon's usage compares to Apache Ignite's,
> since the two seem to overlap at some points.
> Thanks for your help
> Regards
>
