@Song, I did call an action, but it still did not cache in memory, as you can
see in the screenshot attached to my original email. It was cached to disk but
not to memory.
@Chin Wei Low, I have 15GB of memory allocated, which is more than the dataset
size. Any other suggestions, please?
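For reference, in Spark 1.6's unified memory model only part of the executor heap is available for cached blocks. A rough sketch of the arithmetic, assuming the defaults for spark.memory.fraction (0.75) and spark.memory.storageFraction (0.5) plus the fixed 300 MB Spark reserves for itself (these are assumptions; check spark-defaults.conf for overrides):

```python
# Sketch of Spark 1.6 unified memory sizing for one executor.
# Assumes default spark.memory.fraction=0.75, spark.memory.storageFraction=0.5,
# and the fixed 300 MB of reserved memory described in the Spark tuning docs.

RESERVED_MB = 300
MEMORY_FRACTION = 0.75      # spark.memory.fraction default in 1.6
STORAGE_FRACTION = 0.5      # spark.memory.storageFraction default in 1.6

def storage_pool_mb(executor_heap_mb):
    """Return (unified_pool_mb, guaranteed_storage_mb) for one executor."""
    usable = executor_heap_mb - RESERVED_MB
    unified = usable * MEMORY_FRACTION     # shared by storage and execution
    storage = unified * STORAGE_FRACTION   # storage's protected share
    return unified, storage

unified, storage = storage_pool_mb(15 * 1024)  # 15 GB executor heap
print(round(unified))   # ~11295 MB unified pool
print(round(storage))   # ~5648 MB guaranteed for storage
```

On those defaults a 15 GB heap leaves roughly an 11 GB unified pool, so a 5 GB RDD should fit; the Executors tab in the Spark UI shows the actual per-executor storage memory.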
On 11 October 2016 at 03:34, Chin Wei Low <lowchin...@gmail.com> wrote:
> Your RDD is 5GB, perhaps it is too large to fit into executor's storage
> memory. You can refer to the Executors tab in Spark UI to check the
> available memory for storage for each of the executor.
> Chin Wei
> On Tue, Oct 11, 2016 at 6:14 AM, diplomatic Guru <diplomaticg...@gmail.com
> > wrote:
>> Hello team,
>> Spark version: 1.6.0
>> I'm trying to persist done data into memory for reusing them. However,
>> when I call rdd.cache() OR rdd.persist(StorageLevel.MEMORY_ONLY()) it
>> does not store the data as I can not see any rdd information under WebUI
>> (Storage Tab).
>> Therefore I tried rdd.persist(StorageLevel.MEMORY_AND_DISK()), which
>> stored the data on disk only, as shown in the screenshot below:
>> [image: Inline images 2]
>> Do you know why the memory is not being used?
>> Is there a cluster-level configuration that stops jobs from storing data
>> in memory altogether?
>> Please let me know.
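On the cluster-level question: these are the Spark 1.6 memory properties worth checking for overrides. A sketch only; the values shown are the documented defaults, not a recommendation:

```
# spark-defaults.conf (Spark 1.6 defaults; check your cluster for overrides)
spark.memory.fraction          0.75
spark.memory.storageFraction   0.5
# If legacy mode is enabled, the pre-1.6 setting applies instead:
spark.memory.useLegacyMode     false
spark.storage.memoryFraction   0.6
```

If spark.memory.fraction or spark.memory.storageFraction has been set very low cluster-wide, cached blocks can be evicted to disk even when the heap looks large.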