As far as I know you basically have two options: let partitions be
recomputed (possibly caching / persisting in memory only), or persist to disk
(and memory) and suffer the cost of writing to disk. The question is which
will be more expensive in your case. My experience is you're better off
letting

In my problem I have a number of intermediate JavaRDDs and would like to
be able to look at their sizes without destroying the RDD for subsequent
processing. persist will do this, but these are big, persist seems
expensive, and I am unsure which StorageLevel is needed. Is there a way
to
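
To make the trade-off in the reply concrete, here is a minimal sketch in plain Java, not Spark code. LazyDataset, persist(), count() and collect() below are hypothetical stand-ins that only model the behaviour being discussed: an action such as count() does not destroy the dataset, but unless the data has been persisted, each later action re-runs the whole lineage. In Spark itself the in-memory choice would roughly correspond to persist(StorageLevel.MEMORY_ONLY()) and the disk-backed one to persist(StorageLevel.MEMORY_AND_DISK()). The sketch deliberately ignores the cost of writing to disk, which is the other half of the comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RecomputeVsCache {

    /** Hypothetical stand-in for an RDD: a recipe that is re-run on every action unless persisted. */
    static final class LazyDataset<T> {
        private final Supplier<List<T>> lineage; // how to rebuild the data from scratch
        private boolean persistRequested = false;
        private List<T> cached;                  // filled by the first action after persist()

        LazyDataset(Supplier<List<T>> lineage) {
            this.lineage = lineage;
        }

        /** Marks the dataset to be kept after its first materialization (lazy, like the real call). */
        LazyDataset<T> persist() {
            persistRequested = true;
            return this;
        }

        private List<T> materialize() {
            if (cached != null) {
                return cached;                   // reuse the stored copy, no recomputation
            }
            List<T> data = lineage.get();        // re-run the whole lineage
            if (persistRequested) {
                cached = data;
            }
            return data;
        }

        /** Action: size of the dataset. The dataset stays usable afterwards. */
        long count() {
            return materialize().size();
        }

        /** Action: a copy of the elements, standing in for the subsequent processing. */
        List<T> collect() {
            return new ArrayList<>(materialize());
        }
    }

    /** Runs count() and then collect(), returning how many times the lineage was evaluated. */
    static int lineageRuns(boolean persist) {
        AtomicInteger runs = new AtomicInteger();
        LazyDataset<Integer> ds = new LazyDataset<>(() -> {
            runs.incrementAndGet();
            List<Integer> out = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                out.add(i * 2);
            }
            return out;
        });
        if (persist) {
            ds.persist();
        }
        ds.count();   // look at the size
        ds.collect(); // then carry on processing
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("without persist: " + lineageRuns(false) + " lineage runs"); // 2
        System.out.println("with persist:    " + lineageRuns(true) + " lineage runs");  // 1
    }
}
```

Whether the extra lineage run is cheaper than storing the data depends on how expensive the upstream transformations are compared with the memory or disk the stored copy would need, which is the question the reply raises.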