Is there a suggested workaround? For example, splitting the array yourself instead of using DArrays, or shutting down the workers and restarting them every iteration?
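To make the second option concrete, this is roughly what restarting the workers every iteration would look like. It is only a sketch under a few assumptions: `run_iteration` is a hypothetical helper, `sum` stands in for the real per-chunk work, and the code targets a recent Julia where these functions come from the `Distributed` standard library (on 0.3 they are in Base, so the `using` line would be dropped).

```julia
using Distributed  # assumption: recent Julia; on 0.3 these functions are in Base

# Hypothetical helper: run one iteration on a fresh pool of worker processes,
# then remove those workers so the OS gets back whatever memory they held.
function run_iteration(f, chunks, nworkers)
    pids = addprocs(nworkers)      # start fresh workers for this iteration
    try
        return pmap(f, chunks)     # distribute the chunks over the workers
    finally
        rmprocs(pids)              # shut the workers down even if f throws
    end
end

# Example use: `sum` is defined in Base, so it is available on every worker.
chunks = [rand(1000) for _ in 1:4]
results = run_iteration(sum, chunks, 2)
```

The cost of this approach is that anything the workers need (user-defined functions, loaded packages) has to be set up again with `@everywhere` after each `addprocs`, and the startup time of the workers is paid on every iteration.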
best,
deniz

On Mon, Mar 16, 2015 at 9:39 PM, Jake Bolewski <[email protected]> wrote:

> I suspect you are running into
> https://github.com/JuliaLang/julia/issues/8912.
>
> Best,
> Jake
>
>
> On Monday, March 16, 2015 at 2:53:25 PM UTC-4, Deniz Yuret wrote:
>
>> I am stuck trying to debug a memory leak issue. What is the best way to
>> find out what gc is doing?
>>
>> My program generates and processes 10GB data every iteration. I set the
>> data variables to "nothing" and explicitly call gc() every time to make
>> sure the space is reclaimed. However the memory usage keeps growing and
>> ends up crashing the machine (unfortunately several hours into the run).
>> The table below is the output of 'ps aux' at every iteration. The growth is
>> irregular as can be seen from the RSS column below. If I wasn't cleaning
>> up properly I would expect a more regular growth every iteration. The
>> changes in RSS seem to be around 10GB, so I suspect Julia is failing to
>> reclaim the memory from previous iterations sometimes. The program is also
>> multithreaded, it uses pmap to process data on multiple cores, I also
>> suspect gc() may have an issue with multiple threads. The Julia version is
>> v0.3.5. Any pointers would be appreciated.
>>
>> best,
>> deniz
>>
>>
>> USER     PID %CPU %MEM       VSZ      RSS TTY STAT START   TIME COMMAND
>> dyuret 16245 79.7  9.1 176446240 12077528 ?   R<Ll 20:41   4:49 julia3 gtrain.jl
>> dyuret 16245 74.8 17.2 187169612 22796900 ?   S<Ll 20:41   9:10 julia3 gtrain.jl
>> dyuret 16245 73.4 25.0 197398544 33027964 ?   S<Ll 20:41  13:30 julia3 gtrain.jl
>> dyuret 16245 72.8 17.7 187786336 23415824 ?   S<Ll 20:41  17:52 julia3 gtrain.jl
>> dyuret 16245 72.4 17.6 187726364 23358316 ?   S<Ll 20:41  22:13 julia3 gtrain.jl
>> dyuret 16245 72.0 25.0 197479144 33111112 ?   S<Ll 20:41  26:34 julia3 gtrain.jl
>> dyuret 16245 71.8 25.1 197594204 33222044 ?   S<Ll 20:41  30:55 julia3 gtrain.jl
>> dyuret 16245 71.6 32.3 207126888 42755400 ?   S<Ll 20:41  35:16 julia3 gtrain.jl
>> dyuret 16245 71.5 32.1 206849040 42472956 ?   S<Ll 20:41  39:37 julia3 gtrain.jl
>> dyuret 16245 71.4 25.5 198175184 33806360 ?   S<Ll 20:41  43:58 julia3 gtrain.jl
>> dyuret 16245 71.3 25.5 198175184 33806740 ?   S<Ll 20:41  48:20 julia3 gtrain.jl
>> dyuret 16245 71.2 25.6 198179280 33811084 ?   S<Ll 20:41  52:41 julia3 gtrain.jl
>> dyuret 16245 71.2 33.0 208054076 43685884 ?   S<Ll 20:41  57:02 julia3 gtrain.jl
>> dyuret 16245 71.2 25.8 198521392 34153384 ?   S<Ll 20:41  61:24 julia3 gtrain.jl
>> dyuret 16245 71.1 32.6 207497648 43129636 ?   S<Ll 20:41  65:44 julia3 gtrain.jl
>> dyuret 16245 71.1 32.1 206831844 42463844 ?   S<Ll 20:41  70:06 julia3 gtrain.jl
>> dyuret 16245 71.0 39.8 217030332 52662460 ?   S<Ll 20:41  74:27 julia3 gtrain.jl
>> dyuret 16245 71.0 32.6 207497648 43129780 ?   S<Ll 20:41  78:48 julia3 gtrain.jl
>> dyuret 16245 71.0 33.4 208505564 44137820 ?   S<Ll 20:41  83:10 julia3 gtrain.jl
>> dyuret 16245 71.0 32.6 207497648 43129904 ?   S<Ll 20:41  87:32 julia3 gtrain.jl
>> dyuret 16245 70.9 33.0 208059856 43692112 ?   S<Ll 20:41  91:53 julia3 gtrain.jl
>> dyuret 16245 70.9 40.8 218361940 53994196 ?   S<Ll 20:41  96:14 julia3 gtrain.jl
>> dyuret 16245 70.9 47.5 227228820 62861076 ?   S<Ll 20:41 100:35 julia3 gtrain.jl
>> dyuret 16245 70.9 47.5 227228820 62861076 ?
>>
>> ...
>
>
