Jorge,

What you propose is not possible because the size of the output is not known 
in advance. That is why a dynamically growing PStream buffer is used: the 
result cannot be pre-allocated.
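
To illustrate the pattern, here is a minimal sketch. It is not the code in 
serialize.c: membuf, membuf_write, membuf_close and write_doubles are made-up 
names, and doubling the capacity is an assumed growth policy. The writer 
appends pieces without ever announcing a total, so the buffer has to grow on 
demand, and the exactly sized result can only be allocated at close, when the 
scratch buffer and the copy exist side by side.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an in-memory output stream; not R's actual code. */
typedef struct {
    unsigned char *buf;  /* growing scratch buffer */
    size_t count;        /* bytes written so far */
    size_t size;         /* bytes currently allocated */
} membuf;

/* Append n bytes, doubling the capacity (assumed policy) when it runs out.
   Returns 0 on success, -1 if the buffer could not be grown. */
int membuf_write(membuf *mb, const void *src, size_t n)
{
    if (mb->count + n > mb->size) {
        size_t newsize = mb->size ? mb->size : 64;
        while (newsize < mb->count + n)
            newsize *= 2;
        unsigned char *tmp = realloc(mb->buf, newsize);
        if (tmp == NULL)
            return -1;
        mb->buf = tmp;
        mb->size = newsize;
    }
    memcpy(mb->buf + mb->count, src, n);
    mb->count += n;
    return 0;
}

/* Copy the payload into an exactly sized result, then free the scratch
   buffer. *peak records the bytes held while both allocations are alive. */
unsigned char *membuf_close(membuf *mb, size_t *len, size_t *peak)
{
    assert(mb->count <= mb->size);
    unsigned char *val = malloc(mb->count ? mb->count : 1);
    if (val != NULL) {
        if (mb->count > 0)
            memcpy(val, mb->buf, mb->count);
        *len = mb->count;
        *peak = mb->size + mb->count;
    }
    free(mb->buf);
    mb->buf = NULL;
    mb->count = mb->size = 0;
    return val;
}

/* Write n doubles one element at a time, the way a serializer that does not
   know its total output size would. Returns NULL on allocation failure. */
unsigned char *write_doubles(const double *x, size_t n,
                             size_t *len, size_t *peak)
{
    membuf mb = {NULL, 0, 0};
    for (size_t i = 0; i < n; i++) {
        if (membuf_write(&mb, &x[i], sizeof x[i]) != 0) {
            free(mb.buf);
            return NULL;
        }
    }
    return membuf_close(&mb, len, peak);
}
```

Calling write_doubles() on an array and comparing *peak with *len shows that 
the transient footprint of the output is at least twice the payload in this 
sketch, on top of the input object itself.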

Cheers,
Simon


> On Mar 17, 2015, at 1:37 PM, Martinez de Salinas, Jorge 
> <jorge.martinez-de-sali...@hp.com> wrote:
> 
> Hi,
> 
> I've been doing some tests using serialize() to a raw vector:
> 
>       df <- data.frame(runif(50e6,1,10))
>       ser <- serialize(df,NULL)
> 
> In this example the data frame and the serialized raw vector occupy ~400MB 
> each, for a total of ~800MB. However, the memory peak during serialize() is 
> ~1.2GB:
> 
>       $ cat /proc/15155/status |grep Vm
>       ...
>       VmHWM:   1207792 kB
>       VmRSS:    817272 kB
> 
> We work with very large data frames and in many cases this is killing R with 
> an "out of memory" error.
> 
> This is the relevant code in R 3.1.3, at src/main/serialize.c:2494:
> 
>       InitMemOutPStream(&out, &mbs, type, version, hook, fun);
>       R_Serialize(object, &out);
>       val = CloseMemOutPStream(&out);
> 
> The serialized object is stored in a buffer pointed to by out.data. Then, 
> in CloseMemOutPStream(), R copies the whole buffer to a newly allocated SEXP 
> object (the raw vector that stores the final result):
> 
>       PROTECT(val = allocVector(RAWSXP, mb->count));
>       memcpy(RAW(val), mb->buf, mb->count);
>       free_mem_buffer(mb);
>       UNPROTECT(1);
> 
> Before calling free_mem_buffer(), the process is using ~1.2GB (the original 
> data frame + the serialization buffer + the final serialized raw vector).
> 
> One possible solution would be to allocate a buffer for the final raw vector 
> and store the serialization result directly into that buffer. This would 
> bring the memory peak down from ~1.2GB to ~800MB.
> 
> Thanks,
> -Jorge
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
