Never mind, I think it's just the GC taking its time while I have many gigabytes of unused cached RDDs that I cannot easily get rid of.

On Jul 26, 2014 4:44 PM, "Koert Kuipers" <ko...@tresata.com> wrote:
> I have GraphX queries running inside a service where I collect the results
> to the driver and do not hold any references to the RDDs involved in the
> queries. My assumption was that with the references gone, Spark would go and
> remove the cached RDDs from memory (note: I did not cache them, GraphX did).
>
> Yet they hang around...
>
> Is my understanding of how the ContextCleaner works incorrect? Or could it
> be that GraphX holds some references internally to RDDs, preventing garbage
> collection? Maybe even circular references?
>
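
For anyone hitting the same thing: the ContextCleaner only acts once the driver JVM has actually garbage-collected the RDD objects, so unreferenced cached RDDs can linger for a while. A minimal sketch of one workaround, assuming it is acceptable to drop every cached RDD on the context (the object name, app name, local master, and the toy RDD are placeholders for illustration, not part of the original setup):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone demo; in a real service you would reuse the existing SparkContext.
object UnpersistLeftovers {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("unpersist-leftovers").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for RDDs that a library cached internally during a query.
      val cached = sc.parallelize(1 to 1000).cache()
      cached.count() // materialize the cache

      println(s"persistent RDDs before: ${sc.getPersistentRDDs.size}")

      // getPersistentRDDs lists every RDD currently marked as persistent on this context.
      // Unpersisting them explicitly does not wait for the driver GC to notice they are unreachable.
      sc.getPersistentRDDs.values.foreach(_.unpersist(blocking = true))

      println(s"persistent RDDs after: ${sc.getPersistentRDDs.size}")
    } finally {
      sc.stop()
    }
  }
}
```

Note that this drops all cached RDDs on the context, including any that other queries in the same service may still be using, so it only fits between queries or when nothing else depends on the cache.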