Is there a suggested workaround?  For example, splitting the array myself
instead of using DArrays, or shutting down the workers and restarting them
every iteration?
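
To make the second option concrete, here is a rough sketch of what I have in mind. It is written against a recent Julia; in v0.3.5 these functions live in Base and there is no `using Distributed`. The helper name `process_in_fresh_workers` and the squaring function are made-up placeholders, not code from gtrain.jl, and I have not verified that this avoids the leak. It can only help if the memory is being held by the worker processes.

```julia
using Distributed

# Hypothetical helper: run one iteration's pmap on freshly started workers,
# then remove them so that whatever memory they hold is released on exit.
function process_in_fresh_workers(f, items; n=2)
    pids = addprocs(n)          # start n new local worker processes
    try
        return pmap(f, items)   # distribute the work over the new workers
    finally
        rmprocs(pids)           # shut the workers down, even if pmap throws
    end
end

# An anonymous function is serialized to the workers along with the call.
# A named function would first have to be defined on them with @everywhere.
for iter in 1:3
    results = process_in_fresh_workers(x -> x^2, 1:10)
    println("iteration ", iter, ": ", length(results), " results")
end
```

Starting and stopping workers adds overhead every iteration, so this only makes sense when each iteration is long, as it is here.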

best,
deniz


On Mon, Mar 16, 2015 at 9:39 PM, Jake Bolewski wrote:

> I suspect you are running into
> https://github.com/JuliaLang/julia/issues/8912.
>
> Best,
> Jake
>
>
> On Monday, March 16, 2015 at 2:53:25 PM UTC-4, Deniz Yuret wrote:
>
>> I am stuck trying to debug a memory leak.  What is the best way to find
>> out what gc is doing?
>>
>> My program generates and processes 10GB of data every iteration.  I set
>> the data variables to "nothing" and explicitly call gc() every time to
>> make sure the space is reclaimed.  However, the memory usage keeps growing
>> and ends up crashing the machine (unfortunately several hours into the
>> run).  The table below is the output of 'ps aux' at every iteration.  The
>> growth is irregular, as can be seen from the RSS column.  If I weren't
>> cleaning up properly, I would expect more regular growth every iteration.
>> The changes in RSS seem to be around 10GB, so I suspect Julia is sometimes
>> failing to reclaim the memory from previous iterations.  The program is
>> also multithreaded: it uses pmap to process data on multiple cores, and I
>> suspect gc() may have an issue with multiple threads.  The Julia version
>> is v0.3.5.  Any pointers would be appreciated.
>>
>> best,
>> deniz
>>
>>
>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> dyuret   16245 79.7  9.1 176446240 12077528 ?  R<Ll 20:41   4:49 julia3 gtrain.jl
>> dyuret   16245 74.8 17.2 187169612 22796900 ?  S<Ll 20:41   9:10 julia3 gtrain.jl
>> dyuret   16245 73.4 25.0 197398544 33027964 ?  S<Ll 20:41  13:30 julia3 gtrain.jl
>> dyuret   16245 72.8 17.7 187786336 23415824 ?  S<Ll 20:41  17:52 julia3 gtrain.jl
>> dyuret   16245 72.4 17.6 187726364 23358316 ?  S<Ll 20:41  22:13 julia3 gtrain.jl
>> dyuret   16245 72.0 25.0 197479144 33111112 ?  S<Ll 20:41  26:34 julia3 gtrain.jl
>> dyuret   16245 71.8 25.1 197594204 33222044 ?  S<Ll 20:41  30:55 julia3 gtrain.jl
>> dyuret   16245 71.6 32.3 207126888 42755400 ?  S<Ll 20:41  35:16 julia3 gtrain.jl
>> dyuret   16245 71.5 32.1 206849040 42472956 ?  S<Ll 20:41  39:37 julia3 gtrain.jl
>> dyuret   16245 71.4 25.5 198175184 33806360 ?  S<Ll 20:41  43:58 julia3 gtrain.jl
>> dyuret   16245 71.3 25.5 198175184 33806740 ?  S<Ll 20:41  48:20 julia3 gtrain.jl
>> dyuret   16245 71.2 25.6 198179280 33811084 ?  S<Ll 20:41  52:41 julia3 gtrain.jl
>> dyuret   16245 71.2 33.0 208054076 43685884 ?  S<Ll 20:41  57:02 julia3 gtrain.jl
>> dyuret   16245 71.2 25.8 198521392 34153384 ?  S<Ll 20:41  61:24 julia3 gtrain.jl
>> dyuret   16245 71.1 32.6 207497648 43129636 ?  S<Ll 20:41  65:44 julia3 gtrain.jl
>> dyuret   16245 71.1 32.1 206831844 42463844 ?  S<Ll 20:41  70:06 julia3 gtrain.jl
>> dyuret   16245 71.0 39.8 217030332 52662460 ?  S<Ll 20:41  74:27 julia3 gtrain.jl
>> dyuret   16245 71.0 32.6 207497648 43129780 ?  S<Ll 20:41  78:48 julia3 gtrain.jl
>> dyuret   16245 71.0 33.4 208505564 44137820 ?  S<Ll 20:41  83:10 julia3 gtrain.jl
>> dyuret   16245 71.0 32.6 207497648 43129904 ?  S<Ll 20:41  87:32 julia3 gtrain.jl
>> dyuret   16245 70.9 33.0 208059856 43692112 ?  S<Ll 20:41  91:53 julia3 gtrain.jl
>> dyuret   16245 70.9 40.8 218361940 53994196 ?  S<Ll 20:41  96:14 julia3 gtrain.jl
>> dyuret   16245 70.9 47.5 227228820 62861076 ?  S<Ll 20:41 100:35 julia3 gtrain.jl
>> dyuret   16245 70.9 47.5 227228820 62861076 ?
>>
>> ...
>
>
