Hi,
I am trying to figure out a way to find the size of persisted DataFrames
using sparkContext.getRDDStorageInfo().
Each RDDStorageInfo object has information about the number of bytes stored
in memory and on disk.
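
For instance, something like this is how I am reading those numbers (a
minimal sketch; the field names come from org.apache.spark.storage.RDDInfo,
which is the type getRDDStorageInfo actually returns):

// Print memory/disk usage for every persisted RDD the context knows about.
spark.sparkContext.getRDDStorageInfo.foreach { info =>
  println(s"id=${info.id} name=${info.name} " +
    s"memSize=${info.memSize} diskSize=${info.diskSize}")
}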

For example, I have 3 DataFrames which I have cached:
df1.cache()
df2.cache()
df3.cache()
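
(Note: cache() is lazy, so each of these is followed by an action, e.g.
df1.count(), to actually materialize the cached data; until then the
DataFrame does not show up in the storage info, at least in the versions
I have tried.)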

The call to sparkContext.getRDDStorageInfo() then returns an array of 3
RDDStorageInfo objects, since I have cached 3 DataFrames. But I can see no
way to map df1, df2, and df3 to their corresponding RDDStorageInfo objects.
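
The closest workaround I have come up with (a sketch only, assuming Spark
2.x, and relying on the observation that a table cached through the catalog
gets a backing RDD named "In-memory table <name>", which I have not verified
across versions) is to cache via a temp view instead of cache():

// Cache through the catalog so the view name is embedded in the cached
// RDD's name, then match on that name to find the size.
df1.createOrReplaceTempView("df1")
spark.catalog.cacheTable("df1")
df1.count() // materialize the cache

val df1Bytes = spark.sparkContext.getRDDStorageInfo
  .filter(_.name.contains("df1"))
  .map(info => info.memSize + info.diskSize)
  .sum

But this feels fragile, since it depends on how the in-memory relation
happens to name its RDD.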

Is there a better way to do this?

Thanks,
Baahu
