I also found this related post (seems to be the same problem):
https://groups.google.com/forum/m/#!msg/julia-users/q39vyGQF4Fs/pmbBdsiP6kQJ
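
As a rough check of the "around 10GB" estimate in the quoted message, here is a minimal Python sketch that computes the change in RSS between consecutive samples. The four values are copied from the first rows of the ps table quoted below; the name rss_deltas_gib is only illustrative, and I am assuming RSS is reported in KiB, as ps does on Linux.

```python
# RSS values (assumed to be KiB, as `ps aux` reports on Linux), copied
# from the first four samples of the quoted table.
rss_kib = [12077528, 22796900, 33027964, 23415824]

KIB_PER_GIB = 1024 * 1024


def rss_deltas_gib(samples):
    """Return the change in RSS between consecutive samples, in GiB."""
    return [(b - a) / KIB_PER_GIB for a, b in zip(samples, samples[1:])]


deltas = rss_deltas_gib(rss_kib)
for i, d in enumerate(deltas, start=1):
    # A positive value means RSS grew; a negative value means memory was returned.
    print(f"sample {i} -> {i + 1}: {d:+.2f} GiB")
```

For these samples the steps come out close to 10 GiB in either direction, which fits the idea that a whole iteration's worth of data is sometimes kept and sometimes released.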


On Monday, March 16, 2015, Jake Bolewski <[email protected]> wrote:

> Unfortunately, for right now, shutting down the processes at each iteration
> will probably be the simplest fix.  The underlying issue might be
> https://github.com/JuliaLang/julia/issues/6597, so splitting up the work
> yourself would not help there.  Whatever you try, it would be good to
> mention the workaround in the linked issue if it helps to narrow down the
> problem.
>
> Best,
> Jake
>
> On Monday, March 16, 2015 at 4:03:34 PM UTC-4, Deniz Yuret wrote:
>>
>> Is there a suggested workaround?  For example, splitting the array myself
>> instead of using DArrays, or shutting down the workers and restarting them
>> every iteration?
>>
>> best,
>> deniz
>>
>>
>> On Mon, Mar 16, 2015 at 9:39 PM, Jake Bolewski <[email protected]>
>> wrote:
>>
>>> I suspect you are running into
>>> https://github.com/JuliaLang/julia/issues/8912.
>>>
>>> Best,
>>> Jake
>>>
>>>
>>> On Monday, March 16, 2015 at 2:53:25 PM UTC-4, Deniz Yuret wrote:
>>>
>>>> I am stuck trying to debug a memory leak issue.  What is the best way
>>>> to find out what gc is doing?
>>>>
>>>> My program generates and processes 10GB of data every iteration.  I set
>>>> the data variables to "nothing" and explicitly call gc() every time to make
>>>> sure the space is reclaimed.  However, the memory usage keeps growing and
>>>> ends up crashing the machine (unfortunately several hours into the run).
>>>> The table below is the output of 'ps aux' at every iteration.  The growth is
>>>> irregular, as can be seen from the RSS column below.  If I weren't cleaning
>>>> up properly, I would expect a more regular growth every iteration.  The
>>>> changes in RSS seem to be around 10GB, so I suspect Julia is sometimes
>>>> failing to reclaim the memory from previous iterations.  The program is also
>>>> parallel: it uses pmap to process data on multiple cores, so I also
>>>> suspect gc() may have an issue with multiple worker processes.  The Julia
>>>> version is v0.3.5.  Any pointers would be appreciated.
>>>>
>>>> best,
>>>> deniz
>>>>
>>>>
>>>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>>>> dyuret   16245 79.7  9.1 176446240 12077528 ?  R<Ll 20:41   4:49 julia3 gtrain.jl
>>>> dyuret   16245 74.8 17.2 187169612 22796900 ?  S<Ll 20:41   9:10 julia3 gtrain.jl
>>>> dyuret   16245 73.4 25.0 197398544 33027964 ?  S<Ll 20:41  13:30 julia3 gtrain.jl
>>>> dyuret   16245 72.8 17.7 187786336 23415824 ?  S<Ll 20:41  17:52 julia3 gtrain.jl
>>>> dyuret   16245 72.4 17.6 187726364 23358316 ?  S<Ll 20:41  22:13 julia3 gtrain.jl
>>>> dyuret   16245 72.0 25.0 197479144 33111112 ?  S<Ll 20:41  26:34 julia3 gtrain.jl
>>>> dyuret   16245 71.8 25.1 197594204 33222044 ?  S<Ll 20:41  30:55 julia3 gtrain.jl
>>>> dyuret   16245 71.6 32.3 207126888 42755400 ?  S<Ll 20:41  35:16 julia3 gtrain.jl
>>>> dyuret   16245 71.5 32.1 206849040 42472956 ?  S<Ll 20:41  39:37 julia3 gtrain.jl
>>>> dyuret   16245 71.4 25.5 198175184 33806360 ?  S<Ll 20:41  43:58 julia3 gtrain.jl
>>>> dyuret   16245 71.3 25.5 198175184 33806740 ?  S<Ll 20:41  48:20 julia3 gtrain.jl
>>>> dyuret   16245 71.2 25.6 198179280 33811084 ?  S<Ll 20:41  52:41 julia3 gtrain.jl
>>>> dyuret   16245 71.2 33.0 208054076 43685884 ?  S<Ll 20:41  57:02 julia3 gtrain.jl
>>>> dyuret   16245 71.2 25.8 198521392 34153384 ?  S<Ll 20:41  61:24 julia3 gtrain.jl
>>>> dyuret   16245 71.1 32.6 207497648 43129636 ?  S<Ll 20:41  65:44 julia3 gtrain.jl
>>>> dyuret   16245 71.1 32.1 206831844 42463844 ?  S<Ll 20:41  70:06 julia3 gtrain.jl
>>>> dyuret   16245 71.0 39.8 217030332 52662460 ?  S<Ll 20:41  74:27 julia3 gtrain.jl
>>>> dyuret   16245 71.0 32.6 207497648 43129780 ?  S<Ll 20:41  78:48 julia3 gtrain.jl
>>>> dyuret   16245 71.0 33.4 208505564 44137820 ?  S<Ll 20:41  83:10 julia3 gtrain.jl
>>>> dyuret   16245 71.0 32.6 207497648 43129904 ?  S<Ll 20:41  87:32 julia3 gtrain.jl
>>>> dyuret   16245 70.9 33.0 208059856 43692112 ?  S<Ll 20:41  91:53 julia3 gtrain.jl
>>>> dyuret   16245 70.9 40.8 218361940 53994196 ?  S<Ll 20:41  96:14 julia3 gtrain.jl
>>>> dyuret   16245 70.9 47.5 227228820 62861076 ?  S<Ll 20:41 100:35 julia3 gtrain.jl
>>>> dyuret   16245 70.9 47.5 227228820 62861076 ?
>>>>
>>>> ...
>>>
>>>
>>
