I am stuck trying to debug a memory leak issue.  What is the best way to 
find out what gc is doing?

My program generates and processes 10GB of data every iteration. I set the
data variables to "nothing" and explicitly call gc() each time to make
sure the space is reclaimed. However, the memory usage keeps growing and
eventually crashes the machine (unfortunately several hours into the run).
The table below is the output of 'ps aux' at every iteration. The growth is
irregular, as can be seen from the RSS column. If I weren't cleaning
up properly, I would expect more regular growth every iteration. The
changes in RSS seem to be around 10GB, so I suspect Julia sometimes fails to
reclaim the memory from previous iterations. The program is also
multithreaded (it uses pmap to process data on multiple cores), and I
suspect gc() may have an issue with multiple threads. The Julia version is
v0.3.5. Any pointers would be appreciated.

best,
deniz


USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
dyuret   16245 79.7  9.1 176446240 12077528 ?  R<Ll 20:41   4:49 julia3 gtrain.jl
dyuret   16245 74.8 17.2 187169612 22796900 ?  S<Ll 20:41   9:10 julia3 gtrain.jl
dyuret   16245 73.4 25.0 197398544 33027964 ?  S<Ll 20:41  13:30 julia3 gtrain.jl
dyuret   16245 72.8 17.7 187786336 23415824 ?  S<Ll 20:41  17:52 julia3 gtrain.jl
dyuret   16245 72.4 17.6 187726364 23358316 ?  S<Ll 20:41  22:13 julia3 gtrain.jl
dyuret   16245 72.0 25.0 197479144 33111112 ?  S<Ll 20:41  26:34 julia3 gtrain.jl
dyuret   16245 71.8 25.1 197594204 33222044 ?  S<Ll 20:41  30:55 julia3 gtrain.jl
dyuret   16245 71.6 32.3 207126888 42755400 ?  S<Ll 20:41  35:16 julia3 gtrain.jl
dyuret   16245 71.5 32.1 206849040 42472956 ?  S<Ll 20:41  39:37 julia3 gtrain.jl
dyuret   16245 71.4 25.5 198175184 33806360 ?  S<Ll 20:41  43:58 julia3 gtrain.jl
dyuret   16245 71.3 25.5 198175184 33806740 ?  S<Ll 20:41  48:20 julia3 gtrain.jl
dyuret   16245 71.2 25.6 198179280 33811084 ?  S<Ll 20:41  52:41 julia3 gtrain.jl
dyuret   16245 71.2 33.0 208054076 43685884 ?  S<Ll 20:41  57:02 julia3 gtrain.jl
dyuret   16245 71.2 25.8 198521392 34153384 ?  S<Ll 20:41  61:24 julia3 gtrain.jl
dyuret   16245 71.1 32.6 207497648 43129636 ?  S<Ll 20:41  65:44 julia3 gtrain.jl
dyuret   16245 71.1 32.1 206831844 42463844 ?  S<Ll 20:41  70:06 julia3 gtrain.jl
dyuret   16245 71.0 39.8 217030332 52662460 ?  S<Ll 20:41  74:27 julia3 gtrain.jl
dyuret   16245 71.0 32.6 207497648 43129780 ?  S<Ll 20:41  78:48 julia3 gtrain.jl
dyuret   16245 71.0 33.4 208505564 44137820 ?  S<Ll 20:41  83:10 julia3 gtrain.jl
dyuret   16245 71.0 32.6 207497648 43129904 ?  S<Ll 20:41  87:32 julia3 gtrain.jl
dyuret   16245 70.9 33.0 208059856 43692112 ?  S<Ll 20:41  91:53 julia3 gtrain.jl
dyuret   16245 70.9 40.8 218361940 53994196 ?  S<Ll 20:41  96:14 julia3 gtrain.jl
dyuret   16245 70.9 47.5 227228820 62861076 ?  S<Ll 20:41 100:35 julia3 gtrain.jl
dyuret   16245 70.9 47.5 227228820 62861076 ?  S<Ll 20:41 104:57 julia3 gtrain.jl
dyuret   16245 70.9 47.0 226459420 62091680 ?  S<Ll 20:41 109:18 julia3 gtrain.jl
dyuret   16245 70.9 47.0 226563016 62195276 ?  S<Ll 20:41 113:39 julia3 gtrain.jl
dyuret   16245 70.9 47.0 226563016 62195276 ?  S<Ll 20:41 118:01 julia3 gtrain.jl
dyuret   16245 70.8 54.5 236437812 72070076 ?  S<Ll 20:41 122:21 julia3 gtrain.jl
dyuret   16245 70.8 54.3 236095700 71727964 ?  S<Ll 20:41 126:43 julia3 gtrain.jl
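
As a rough check of the "around 10GB" estimate, the change in RSS between consecutive samples can be computed with a short script. This is only a minimal Python sketch, separate from gtrain.jl. It assumes ps reports RSS in KiB (the usual Linux behaviour), and the names rss_kib and rss_deltas_gib are made up for illustration. The values are the first six RSS entries from the table above.

```python
# RSS values in KiB, copied from the first six rows of the ps output above.
rss_kib = [12077528, 22796900, 33027964, 23415824, 23358316, 33111112]


def rss_deltas_gib(samples_kib):
    """Return the change in RSS between consecutive samples, in GiB."""
    kib_per_gib = 1024 ** 2
    return [(b - a) / kib_per_gib for a, b in zip(samples_kib, samples_kib[1:])]


# Print one signed delta per pair of consecutive samples.
for i, delta in enumerate(rss_deltas_gib(rss_kib), start=2):
    print("sample %d: %+.1f GiB" % (i, delta))
```

A positive jump of roughly one iteration's worth of data that is not followed by a matching drop would be consistent with one iteration's data not being reclaimed.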
