In ALS, I guess each iteration's RDDs are referenced by the next
iteration's RDDs, so none of the shuffle data will be deleted until the ALS job
finishes.

I guess checkpointing could solve my problem. Do you know about checkpointing?
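
To make my guess concrete, here is a toy sketch in plain Python. It is not Spark code and not how Spark is implemented; ToyRDD, ToyCleaner and run_iterations are hypothetical names I made up for illustration. It only models the idea that each iteration's RDD keeps a reference to the previous one, so a reference-based cleaner cannot delete anything, and that cutting the lineage (which in Spark I would expect to do with SparkContext.setCheckpointDir plus RDD.checkpoint()) lets the older entries be released.

```python
import gc
import weakref


class ToyRDD:
    """Toy stand-in for an RDD: holds a strong reference to its parent (lineage)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # strong reference keeps the parent reachable

    def checkpoint(self):
        # Hypothetical analogue of checkpointing: pretend the data was saved
        # elsewhere, then drop the parent link to truncate the lineage.
        self.parent = None


class ToyCleaner:
    """Toy reference-tracking cleaner (an assumption, not Spark's actual code)."""

    def __init__(self):
        self.shuffle_files = set()

    def register(self, rdd):
        self.shuffle_files.add(rdd.name)
        # The callback runs once `rdd` has been garbage collected.
        weakref.finalize(rdd, self.shuffle_files.discard, rdd.name)


def run_iterations(n, checkpoint_every=None):
    """Build a chain of n toy RDDs and report which shuffle entries remain."""
    cleaner = ToyCleaner()
    rdd = None
    for i in range(n):
        rdd = ToyRDD(f"iter-{i}", parent=rdd)  # new RDD references the old one
        cleaner.register(rdd)
        if checkpoint_every and (i + 1) % checkpoint_every == 0:
            rdd.checkpoint()
    gc.collect()  # make sure unreachable objects are finalized
    return rdd, sorted(cleaner.shuffle_files)
```

In this toy model, run_iterations(5) leaves all five entries (iter-0 to iter-4) in place, while run_iterations(5, checkpoint_every=2) leaves only iter-3 and iter-4, because everything before the last checkpoint is no longer reachable.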

> On March 3, 2015, at 4:18 PM, nitin [via Apache Spark User List]
> <ml-node+s1001560n21889...@n3.nabble.com> wrote:
> 
> Shuffle write will be cleaned if it is not referenced by any object
> directly/indirectly. There is a garbage collector written inside Spark which
> periodically checks for weak references to RDDs/shuffle write/broadcast and
> deletes them.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/how-to-clean-shuffle-write-each-iteration-tp21886p21890.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
