What storage level are you using for persist() or cache()?
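One thing to check: plain cache() is equivalent to persist(StorageLevel.MEMORY_ONLY), which silently skips any partition that does not fit in the storage memory (in Spark 1.x, only a fraction of the executor heap, spark.storage.memoryFraction, 0.6 by default, is used for caching, and the deserialized in-memory size of an RDD is usually larger than its on-disk size). A minimal Scala sketch of the alternatives, assuming an existing SparkContext `sc` (e.g. in spark-shell) and a placeholder input path:

import org.apache.spark.storage.StorageLevel

// the input path is a placeholder
val rdd = sc.textFile("hdfs:///some/input")

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY): partitions
// that do not fit in storage memory are silently dropped and recomputed
// each time they are needed, with no error in the driver log.
rdd.persist(StorageLevel.MEMORY_ONLY)

// MEMORY_AND_DISK spills partitions that do not fit to local disk instead
// of dropping them:
// rdd.persist(StorageLevel.MEMORY_AND_DISK)

// MEMORY_ONLY_SER stores partitions as serialized bytes, typically several
// times smaller than deserialized Java objects, at some CPU cost:
// rdd.persist(StorageLevel.MEMORY_ONLY_SER)

// an action is needed to actually materialize the cache
rdd.count()

// check which level took effect
println(rdd.getStorageLevel)

You can also verify what was actually cached, and at what size, in the Storage tab of the web UI. The full list of levels is described here: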
http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence

On Tue, Oct 28, 2014 at 6:18 PM, shahab <[email protected]> wrote:
> Hi,
>
> I have a standalone Spark cluster where each executor is set to 6.3 G of
> memory; I am using two workers, so in total there is 12.6 G of memory and
> 4 cores.
>
> I am trying to cache an RDD of approximately 3.2 G, but apparently it is
> not cached: I neither see "BlockManagerMasterActor: Added rdd_XX in
> memory" in the logs nor any improvement in task performance.
>
> But why is it not cached when there is enough storage memory?
>
> I tried with smaller RDDs, 1 or 2 G, and it works; at least I could see
> "BlockManagerMasterActor: Added rdd_0_1 in memory" and an improvement in
> the results.
>
> Any idea what I am missing in my settings, or... ?
>
> thanks,
> /Shahab
