Hey Roman,

It looks like that pull request was never migrated to the Apache GitHub, but I like the idea. If you migrate it over, we can merge in something like this. In terms of the API, I’d just add an unpersist() method on each Broadcast object.
Matei

On Dec 3, 2013, at 6:00 AM, Roman Pastukhov <[email protected]> wrote:

> Hi,
>
> In iterative processes that use broadcasts, the broadcasts seem to cause
> memory usage problems because they are left in memory. Unfortunately, the
> only way to remove them now requires reflection hacks.
>
> TTL-based cleaning would also remove JobConf broadcasts; moreover, it requires
> each iteration to complete within some predefined time frame, so it does not
> seem like a good option.
>
> So I was wondering what happened to https://github.com/mesos/spark/pull/771
> and whether it makes sense to submit a similar pull request?
>
> PS. TTL cleanup also removes broadcast files on disk. Does this mean that if
> some RDD partition that used an old broadcast needs to be recalculated because
> of a lost executor, the recalculation will fail?
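
For illustration only, here is a toy sketch in Python of how a per-object unpersist() call, as suggested above, might fit into an iterative job. It is not Spark code: BlockStore, Broadcast, and run_iterations are hypothetical names made up for this example, and the "executors" are plain in-memory dictionaries. It only shows the memory-growth problem described in the quoted message and how an explicit release at the end of each iteration would avoid it.

```python
# Toy model, NOT Spark's implementation. All names below are hypothetical.

class BlockStore:
    """Stands in for one executor's in-memory block cache."""

    def __init__(self):
        self.blocks = {}


class Broadcast:
    """A value copied to every store, with an explicit release method."""

    _next_id = 0

    def __init__(self, value, stores):
        self.id = Broadcast._next_id
        Broadcast._next_id += 1
        self._value = value
        self._stores = stores
        # Copy the value to every store, keyed by this broadcast's id.
        for store in stores:
            store.blocks[self.id] = value

    @property
    def value(self):
        return self._value

    def unpersist(self):
        """Remove this broadcast's cached copies from every store."""
        for store in self._stores:
            store.blocks.pop(self.id, None)


def run_iterations(n, stores, cleanup):
    """Run n iterations, broadcasting fresh state each time.

    With cleanup=False the old broadcasts pile up in every store;
    with cleanup=True each one is released once its iteration is done.
    """
    weights = [0.0]
    for _ in range(n):
        bc = Broadcast(list(weights), stores)
        weights = [w + 1.0 for w in bc.value]  # stand-in for real work
        if cleanup:
            bc.unpersist()
    return weights
```

Running this with cleanup=False leaves one cached block per iteration in every store, while cleanup=True leaves none. The sketch does not address the PS question: whether a released broadcast can be fetched again when a lost partition is recomputed is a separate design decision.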
