Hi, I am trying to figure out a way to find the size of *persisted* DataFrames using *sparkContext.getRDDStorageInfo()*. The returned RDDStorageInfo objects contain, among other things, the number of bytes stored in memory and on disk.
For example, I have cached 3 DataFrames: df1.cache(), df2.cache(), df3.cache(). The call to sparkContext.getRDDStorageInfo() then returns an array of 3 RDDStorageInfo objects, one per cached DataFrame. But there is no way to map df1, df2, df3 to the individual RDDStorageInfo objects. Is there a better way? Thanks, Baahu
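For concreteness, a minimal sketch of the setup I mean (Scala API; the local master, sample data, and column name are just illustrative):

```scala
import org.apache.spark.sql.SparkSession

object CacheSizeDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-size-demo")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq(1, 2, 3).toDF("a")
    val df2 = Seq(4, 5, 6).toDF("a")
    val df3 = Seq(7, 8, 9).toDF("a")

    // cache() is lazy; an action is needed to actually materialize the blocks
    df1.cache(); df1.count()
    df2.cache(); df2.count()
    df3.cache(); df3.count()

    // Each RDDInfo exposes the sizes, but its id/name refer to Spark's
    // internal cached RDD, so nothing here says which entry is df1 vs df2 vs df3.
    spark.sparkContext.getRDDStorageInfo.foreach { info =>
      println(s"id=${info.id} name=${info.name} " +
        s"memSize=${info.memSize} diskSize=${info.diskSize}")
    }

    spark.stop()
  }
}
```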