unpersist works on storage memory, not execution memory. So I do not think
you can flush an RDD out of memory if you have not cached it first, using
cache() or persist().
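For instance, a minimal sketch (the file path and the SparkContext `sc` are assumptions; this needs a running Spark session):

```scala
import org.apache.spark.storage.StorageLevel

val rdd = sc.textFile("/user/emp.txt")
rdd.persist(StorageLevel.MEMORY_ONLY)   // equivalent to rdd.cache()

rdd.count()       // an action materialises the RDD in storage memory
rdd.unpersist()   // releases the cached blocks; the lineage is kept,
                  // so the RDD can still be recomputed on demand
```

Without the persist() call there are no cached blocks, so unpersist() has nothing to release.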
I believe recent versions of Spark use a Least Recently Used (LRU) policy
to evict unused blocks from memory, much like RDBMS cache management. I
know LLDAP does that.
Dr Mich Talebzadeh
On 22 September 2016 at 18:09, Hanumath Rao Maduri <hanu....@gmail.com>
wrote:
> Hello Aditya,
> After an intermediate action has been applied, you might want to call
> rdd.unpersist() to let Spark know that this RDD is no longer required.
> On Thu, Sep 22, 2016 at 7:54 AM, Aditya
> <aditya.calangutkar@augmentiq.co.in> wrote:
>> Suppose I have two RDDs:
>> val textFile = sc.textFile("/user/emp.txt")
>> val textFile1 = sc.textFile("/user/emp1.txt")
>> Later I perform a join operation on the above two RDDs:
>> val join = textFile.join(textFile1)
>> There are subsequent transformations that do not use textFile or
>> textFile1 further, and an action to start the execution.
>> When the action is called, textFile and textFile1 will be loaded into
>> memory first. Then the join will be performed and kept in memory.
>> My question is: once the join is in memory and used for subsequent
>> execution, what happens to the textFile and textFile1 RDDs? Are they
>> still kept in memory until the full lineage graph is completed, or are
>> they destroyed once their use is over? If they are kept in memory, is
>> there any way I can explicitly remove them to free the memory?
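
Putting the two replies together, a sketch of how the question's example could cache and then explicitly release the inputs (the field layout is an assumption, and the original join on RDD[String] is schematic, since join needs key/value pair RDDs; keying on the first comma-separated field is for illustration only):

```scala
// Assumes an existing SparkContext "sc" and comma-separated emp files
// whose first field is the join key.
val textFile  = sc.textFile("/user/emp.txt").keyBy(_.split(",")(0)).cache()
val textFile1 = sc.textFile("/user/emp1.txt").keyBy(_.split(",")(0)).cache()

val joined = textFile.join(textFile1).cache()
joined.count()          // action: materialises the inputs and the join

// Once the join itself is cached, the inputs can be released explicitly:
textFile.unpersist()
textFile1.unpersist()
```

Note that without cache(), the input RDDs are not kept until the lineage completes anyway: their partitions exist only while a stage computes them, and are otherwise recomputed from lineage on demand.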