>>>>> Gergely Daróczi <daroc...@rapporter.net> >>>>> on Thu, 10 Nov 2016 16:48:12 +0100 writes:
    > Dear All,

    > I'm developing an R application running inside of a Java daemon on
    > multiple threads, and interacting with the parent daemon via stdin
    > and stdout.

    > Everything works perfectly fine except for having a memory leak
    > somewhere. Simplified version of the R app:

    > while (TRUE) {
    >     con  <- file('stdin', open = 'r', blocking = TRUE)
    >     line <- scan(con, what = character(0), nlines = 1, quiet = TRUE)
    >     close(con)
    > }

    > This loop uses more and more RAM as time passes (see more on this
    > below), not sure why, and I have no idea currently on how to debug
    > this further. Can someone please try to reproduce it and give me
    > some hints on what the problem is?

    > Sample bash script to trigger an R process with such a memory leak:

    > Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" | Rscript --vanilla -e "cat(Sys.getpid(),'\n');while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);gc()}"

    > Maybe you have to escape '\n' depending on your shell.

    > Thanks for reading this; any hints would be highly appreciated!

I have no hints, sorry... but I can give some more "data":

I've changed the above to *print* the gc() result every 1000th
iteration, and after 100'000 iterations there is still no memory
increase from the point of view of R itself.

However, monitoring the process (via 'htop', e.g.) shows about 1 MB
per second increase in the memory footprint of the process.

One could argue that the error is with the OS / pipe / bash rather
than with R itself... but I'm not expert enough to argue that here.
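One variation worth trying (a sketch only, not tested against the
original Java-daemon setup): open the connection once, outside the
loop, and read one line per iteration, so that no connection object is
created and destroyed per message. The `read_loop` helper below is
hypothetical; in the real app the connection would be
`file('stdin', open = 'r', blocking = TRUE)`, demonstrated here with a
textConnection standing in for the pipe:

```r
## Sketch: reuse one connection for the whole read loop instead of
## re-opening 'stdin' on every iteration.
read_loop <- function(con) {
    n <- 0L
    while (TRUE) {
        line <- readLines(con, n = 1)
        if (length(line) == 0L) break  # EOF: the writer went away
        ## ... process 'line' here ...
        n <- n + 1L
    }
    n
}

## In the real app: con <- file('stdin', open = 'r', blocking = TRUE)
con <- textConnection(c("0.1", "0.2", "0.3"))
cat(read_loop(con), "lines read\n")
close(con)
```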
Here's my version of your sample bash script and its output:

$ Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" | Rscript --vanilla -e "cat(Sys.getpid(),'\n');i <- 0; while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);a <- gc(); i <- i+1; if(i %% 1000 == 1) {cat('i=',i,'\\n'); print(a)} }"
11059
i= 1
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83216  4.5   10000000 534.1   213529 11.5
Vcells  172923  1.4   16777216 128.0   562476  4.3
i= 1001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172958  1.4   16777216 128.0   562476  4.3
...............................................
i= 80001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172958  1.4   16777216 128.0   562476  4.3
i= 81001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172959  1.4   16777216 128.0   562476  4.3
i= 82001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172959  1.4   16777216 128.0   562476  4.3
i= 83001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172958  1.4   16777216 128.0   562476  4.3
i= 84001
          used (Mb) gc trigger  (Mb) max used (Mb)
Ncells   83255  4.5   10000000 534.1   213529 11.5
Vcells  172958  1.4   16777216 128.0   562476  4.3

    > Best,
    > Gergely

    > PS1 see the image posted at
    > http://stackoverflow.com/questions/40522584/memory-leak-with-closed-connections
    > on memory usage over time

    > PS2 the issue doesn't seem to be due to writing more data in the
    > first R app than the second R app can handle, as I tried the same
    > with adding a Sys.sleep(0.01) in the first app, and it's not an
    > issue at all in the real application

    > PS3 I also tried using stdin() instead of file('stdin'), but that
    > did not work well for the stream
    > running on multiple threads started by the same parent Java daemon

    > PS4 I've tried this on Linux using R 3.2.3 and 3.3.2

For me, it's Linux, too (Fedora 24), using 'R 3.3.2 patched'.

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
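The OS-level growth seen in htop can also be logged from inside the
loop itself, which makes it easier to correlate the footprint with the
iteration count. A Linux-only sketch (the `rss_kb` helper is
hypothetical, not part of the scripts above; VmRSS in
/proc/self/status is the resident set size the kernel reports):

```r
## Linux-only sketch: report this process's resident set size (VmRSS)
## in kB, i.e. the number that htop shows growing.
rss_kb <- function() {
    status <- readLines("/proc/self/status")
    line   <- grep("^VmRSS:", status, value = TRUE)
    as.numeric(gsub("[^0-9]", "", line))  # e.g. "VmRSS:  1234 kB" -> 1234
}

cat("RSS now:", rss_kb(), "kB\n")
```

Calling this every 1000th iteration alongside print(gc()) would show
the R-level and OS-level numbers diverging in a single log.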