For the last question: you can trigger a GC in the JVM from Python with sc._jvm.System.gc()
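For example, a minimal sketch of how that could be wired into a training loop like the one described below - the ratings path, the parsing, the rank/iterations values and the number of runs are just placeholders, not taken from this thread:

    from pyspark import SparkContext
    from pyspark.mllib.recommendation import ALS, Rating

    sc = SparkContext(appName="als-loop")

    # Hypothetical input; the thread does not show how the ratings are loaded.
    ratings = sc.textFile("hdfs:///path/to/ratings") \
                .map(lambda line: line.split(",")) \
                .map(lambda f: Rating(int(f[0]), int(f[1]), float(f[2]))) \
                .cache()

    for run in range(5):  # several trainImplicit() runs within one session
        model = ALS.trainImplicit(ratings, rank=50, iterations=10)
        # ... evaluate or save the model here ...

        # Drop the driver-side reference so the finished model's RDDs and
        # broadcasts become unreachable, then force a GC in the driver JVM;
        # the ContextCleaner can then remove the corresponding shuffle and
        # broadcast files from the workers' appcache directories.
        del model
        sc._jvm.System.gc()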
On Mon, Feb 16, 2015 at 4:08 PM, Antony Mayi <antonym...@yahoo.com.invalid> wrote:

> thanks, that looks promising but I can't find any reference giving me more
> details - can you please point me to something? Also, is it possible to
> force GC from pyspark (as I am using pyspark)?
>
> thanks,
> Antony.
>
> On Monday, 16 February 2015, 21:05, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>
> > Correct, brute-force cleanup is not useful. Since Spark 1.0, Spark can do
> > automatic cleanup of files based on which RDDs are used/garbage collected
> > by the JVM. That would be the best way, but it depends on the JVM GC
> > characteristics. If you force a GC periodically in the driver, that might
> > help you get rid of files in the workers that are no longer needed.
> >
> > TD
> >
> > On Mon, Feb 16, 2015 at 12:27 AM, Antony Mayi <antonym...@yahoo.com.invalid> wrote:
> >
> > > spark.cleaner.ttl is not the right way - it seems to be really designed
> > > for streaming. Although it keeps the disk usage under control, it also
> > > causes loss of RDDs and broadcasts that are required later, leading to
> > > a crash.
> > >
> > > is there any other way?
> > > thanks,
> > > Antony.
> > >
> > > On Sunday, 15 February 2015, 21:42, Antony Mayi <antonym...@yahoo.com> wrote:
> > >
> > > > spark.cleaner.ttl ?
> > > >
> > > > On Sunday, 15 February 2015, 18:23, Antony Mayi <antonym...@yahoo.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am running a bigger ALS job on Spark 1.2.0 on YARN (CDH 5.3.0) -
> > > > > the ALS is using about 3 billion ratings and I am doing several
> > > > > trainImplicit() runs in a loop within one Spark session. I have a
> > > > > four-node cluster with 3TB of disk space on each. Before starting
> > > > > the job, less than 8% of the disk space is used. While the ALS is
> > > > > running, I can see the disk usage growing rapidly, mainly because
> > > > > of files being stored under
> > > > > yarn/local/usercache/user/appcache/application_XXX_YYY/spark-local-ZZZ-AAA.
> > > > > After about 10 hours the disk usage hits 90% and YARN kills the
> > > > > particular containers.
> > > > >
> > > > > Am I missing some cleanup somewhere while looping over the several
> > > > > trainImplicit() calls? Taking 4*3TB of disk space seems immense.
> > > > >
> > > > > thanks for any help,
> > > > > Antony.