Hi Varadharajan,

Linux uses copy-on-write for the memory image of forked processes. Thus, you may also get significant memory savings by launching a single R process, loading your large data object, and then using fork::fork() to split off the other worker processes.
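For example, a rough sketch of that idea, using parallel::mcparallel()/mccollect() (which fork the parent on Unix-alikes) in place of the fork package; the file name and column names (large_dataset.csv, env_id, value, group) are just placeholders:

library(data.table)
library(parallel)   # mcparallel()/mccollect() fork the parent on Unix-alikes

## Load the large object once, in the parent process (path is a placeholder).
dt <- fread("large_dataset.csv")

## Fork one worker per testing environment; each child shares the parent's
## pages copy-on-write, so read-only scans should add little extra RSS.
jobs <- lapply(c("env_a", "env_b"), function(env) {
  mcparallel({
    ## Read-only filtered aggregate; nothing here modifies dt, so the
    ## shared pages are (mostly) not duplicated.
    dt[env_id == env, .(total = sum(value)), by = group]
  })
})

results <- mccollect(jobs)

As long as the children never modify the shared object the kernel keeps most pages shared, though R's garbage collector may still touch (and hence copy) some pages over time, so the savings are not perfect.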
-Greg

Sent from my iPad

> On Jul 16, 2014, at 5:07 AM, Varadharajan Mukundan <srinath...@gmail.com> wrote:
>
> [Sending it again in plain text mode]
>
> Greetings,
>
> We have a fairly large dataset (around 60GB) to be loaded and crunched
> in real time. The operations performed on this data are simple read-only
> aggregates after filtering the data.table instance on parameters that are
> passed in at query time. We need more than one such R process running to
> serve different testing environments (each testing environment has a
> nearly identical dataset, with only a *small amount of changes*). As we
> all know, data.table loads the entire dataset into memory for processing,
> so we face a constraint on the number of such processes we can run on the
> machine. On a 128GB RAM machine, we are looking for ways to reduce the
> memory footprint so that we can spawn more instances and use the
> resources efficiently. One approach we tried was memory de-duplication
> using UKSM (http://kerneldedup.org/en/projects/uksm/introduction), given
> that we had a few idle CPU cores. The outcome of the experiment was quite
> impressive, considering that the setup effort was small and the approach
> treats the application layer as a black box.
>
> Quick snapshot of the results:
> 1 instance (without UKSM): ~60GB RAM in use
> 1 instance (with UKSM): ~53GB RAM in use
>
> 2 instances (without UKSM): ~125GB RAM in use
> 2 instances (with UKSM): ~81GB RAM in use
>
> Around 44GB of RAM was saved after UKSM merged similar pages, at the
> cost of 1 CPU core on a 48-core machine. We did not notice any
> degradation in performance, because the data is refreshed by a batch job
> only once every morning; UKSM steps in at that point and performs the
> page merging, and for the rest of the day it is just read-only analysis.
> The queries we fire on the dataset scan at most 2-3GB of it, so the
> memory spike from query subsets was low as well.
>
> We're interested in knowing whether this is a plausible solution to the
> problem. Are there any other points/solutions we should be considering?
>
> --
> Thanks,
> M. Varadharajan
>
> ------------------------------------------------
>
> "Experience is what you get when you didn't get what you wanted"
> -By Prof. Randy Pausch in "The Last Lecture"
>
> My Journal :- www.thinkasgeek.wordpress.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel