Luca,

On Oct 13, 2011, at 5:45 AM, Luca Meyer wrote:

> Hi,
> 
> I have to load data from more than 200 separate Excel sheets and I am using 
> the read.xlsx function in the xlsx package. Each sheet is a complex page 
> (made of more than one table plus extra data) and I need to load the different 
> elements I find on each page into an R data frame. This procedure needs 
> to be repeated several times, i.e. on more than one Excel file.
> 
> Initially I had no problems and the script ran just fine, producing the 
> desired data frame. After processing a few hundred sheets, I now get this error 
> after I load each page:
> 
> Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 
>  java.lang.OutOfMemoryError: Java heap space
> 
> Has this to do with some sort of cache that is being filled up? Can anyone 
> suggest a solution to this error?
> 

Well, you're running out of memory on the Java side. First, did you run gc()? 
You should make sure that you don't keep any references to Java objects around 
unnecessarily, since the Java memory behind them cannot be released until R's 
garbage collector has collected the references.

I did run a quick test with rJava memory profiling enabled and I see no leaks 
in rJava:

ginaz:~$ R -e 'library(xlsx); d=read.xlsx("/Users/urbanek/Downloads/Acme Coffin Company.xls",1); gc()' | ./mem.match
Loading required package: xlsxjars
Loading required package: rJava

SUM[Leaked objects]: 0
SUM[Used objects]: 253

So if you see any issues even after running gc(), it will be hard to trace. It 
could be in the Java code used by xlsx (which I don't know how to trace) or in 
theory some local places in the infrastructure that are not covered by the 
memory traces (the above counts Java objects referenced from R). You may have 
some luck using java.lang.management to look into Java usage. Also you can 
force Java to run its own garbage collector using
.jcall("java/lang/System",,"gc")
but I doubt that it will help, since Java would have run it before running out 
of memory.
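
As a rough sketch of the java.lang.management suggestion, the following reads 
the JVM's heap usage from R through rJava, so you can watch whether it keeps 
growing from sheet to sheet. The helper name jvm_heap_mb is made up for 
illustration; the Java classes and methods it calls are standard JDK ones.

```r
library(rJava)
.jinit()  # starts the JVM if it is not already running

# Hypothetical helper: report the JVM heap usage in megabytes.
jvm_heap_mb <- function() {
  mx <- .jcall("java/lang/management/ManagementFactory",
               "Ljava/lang/management/MemoryMXBean;",
               "getMemoryMXBean")
  heap <- .jcall(mx, "Ljava/lang/management/MemoryUsage;",
                 "getHeapMemoryUsage")
  # Java long values come back as R numerics (bytes); convert to MB.
  c(used      = .jcall(heap, "J", "getUsed"),
    committed = .jcall(heap, "J", "getCommitted"),
    max       = .jcall(heap, "J", "getMax")) / 2^20
}

print(jvm_heap_mb())
```

Calling jvm_heap_mb() before and after each read.xlsx call (and again after 
gc()) should show whether the "used" figure drops back or only ever climbs.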

Finally, you could increase the Java heap if you have enough memory - for 
example to use 2 GB:

library(rJava)
.jinit(parameters="-Xmx2g")

but that's just delaying the problem.
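
One caveat worth spelling out: the heap size is fixed when the JVM starts, so 
the parameters passed to .jinit have no effect if the JVM is already running, 
and loading xlsx starts it. A commonly used alternative, sketched here under 
the assumption of a fresh R session, is to set the java.parameters option 
(which .jinit uses as its default) before loading the package, and then check 
the limit the JVM actually picked up:

```r
# Must run in a fresh R session, before anything has started the JVM.
options(java.parameters = "-Xmx2g")
library(xlsx)   # loading xlsx starts the JVM with the option above

# Ask the JVM for its maximum heap size and report it in MB.
rt <- rJava::.jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
rJava::.jcall(rt, "J", "maxMemory") / 2^20
```

If the reported value does not reflect the requested size, the JVM was most 
likely started earlier in the session.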

So, again, make sure you run gc() after you're done with each worksheet and see 
if the problem persists.
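
A minimal sketch of what that could look like in a loop; this is not your 
script, just an illustration. The function name read_sheets and its extract 
argument are hypothetical, and read.xlsx is called with only its file, 
sheetIndex and header arguments.

```r
library(xlsx)

# Hypothetical helper: read the given sheets of one workbook one at a time,
# keep only the extracted result, and collect garbage after every sheet.
read_sheets <- function(file, sheets, extract = identity) {
  out <- vector("list", length(sheets))
  for (i in seq_along(sheets)) {
    raw <- read.xlsx(file, sheetIndex = sheets[i], header = FALSE)
    out[[i]] <- extract(raw)  # keep only what is needed from this sheet
    rm(raw)                   # drop the intermediate object
    gc()                      # lets R release unreferenced Java objects
    rJava::.jcall("java/lang/System", , "gc")  # optional: nudge the JVM too
  }
  out
}
```

Something like read_sheets(your_file, 1:200) would then return one element per 
sheet; pass your own function as extract to pull the individual tables out of 
each page.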

Cheers,
Simon

_______________________________________________
R-SIG-Mac mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mac
