>>>>> Zilefac Elvis <[email protected]>
>>>>> on Mon, 12 May 2014 15:01:49 -0700 writes:
> Hi,
> I would like to free up memory in R and make my program run faster. At
the moment my program is very slow, probably due to memory issues. Here is
sample data (R code is at the end) from one simulation (I have 1000 such
files to process):
> list(c("1971 1 1GGG1 0.00 -3.68 -0.29", "1971 1 1GGG2 0.00 -8.31 0.81",
> "1971 1 1GGG3 0.00-10.69 5.69", "1971 1 1GGG4 1.78 -6.96 -2.20",
> "1971 1 1GGG5 2.64 -9.48 9.20", "1971 1 1GGG6 0.00 -9.74 3.73",
> "1971 1 1GGG7 0.00 -8.49 3.58", "1971 1 1GGG8 0.00 -2.78 -2.92",
> "1971 1 1GGG9 0.00 -9.30 0.63", "1971 1 1GG10 4.87 -5.59 3.11",
> "1971 1 1GG11 0.10-12.04 10.80", "1971 1 1GG12 0.00 -5.24 -0.43",
> "1971 1 1GG13 0.00 -8.82 2.88", "1971 1 1GG14 0.00-11.10 14.50",
> "1971 1 1GG15 0.00 -5.54 10.12", "1971 1 1GG16 0.00 -4.54 10.48",
> "1971 1 1GG17 0.00 1.68 17.28", "1971 1 1GG18 0.00 -5.79 6.64",
> "1971 1 1GG19 0.00 -5.27 14.29", "1971 1 1GG20 0.00 -8.93 9.60",
> "1971 1 1GG21 5.29 1.30 15.62", "1971 1 1GG22 0.00 -2.50 19.20",
> "1971 1 1GG23 0.00 -7.04 15.73", "1971 1 1GG24 0.00 -8.53 11.60",
> "1971 1 1GG25 0.00 -0.82 10.33", "1971 1 1GG26 0.00 -6.28 21.58",
[.............]
> ))
> Here is the code for processing a thousand of these files:
>
#===================================================================================================================
library(stringr)  # for str_trim()

lst1Sub <- data_above
lst2 <- lapply(lst1Sub, function(x) {
    dateSite <- gsub("(.*G.{3}).*", "\\1", x)
    dat1 <- data.frame(Year  = as.numeric(substr(dateSite, 1, 4)),
                       Month = as.numeric(substr(dateSite, 5, 6)),
                       Day   = as.numeric(substr(dateSite, 7, 8)),
                       Site  = substr(dateSite, 9, 12),
                       stringsAsFactors = FALSE)
    Sims <- str_trim(gsub(".*G.{3}\\s?(.*)", "\\1", x))
    ## re-insert the space lost when a value runs into a negative number,
    ## e.g. "0.00-10.69" -> "0.00 -10.69"
    Sims[grep("\\d+-", Sims)] <-
        gsub("(.*)([- ][0-9]+\\.[0-9]+)", "\\1 \\2",
             gsub("^([0-9]+\\.[0-9]+)(.*)", "\\1 \\2",
                  Sims[grep("\\d+-", Sims)]))
    Sims1 <- read.table(text = Sims, header = FALSE)
    names(Sims1) <- c("Precipitation", "Tmin", "Tmax")
    dat2 <- cbind(dat1, Sims1)
})
#=========================================================================================================================
> 1) How can I free up memory with this code? I am working with 1000
files, so speed matters.
> 2) Is there a faster way of doing the same task as above? My data files
are simulated in FORTRAN and read as:
> data.frame(Day=as.numeric(substr(rain.data,7,8)),
> Month=as.numeric(substr(rain.data,5,6)),
> Year=as.numeric(substr(rain.data,1,4)),
> Site=substr(rain.data,9,12),
> Precip=as.numeric(substr(rain.data,13,18)),
> Tmin=as.numeric(substr(rain.data,19,24)),
> Tmax=as.numeric(substr(rain.data,25,30)))
> #### Day occupies positions 7-8, Month positions 5-6, Year positions 1-4,
#### and so on for Site, Precip, Tmin and Tmax
Given that your data file has fields "at position <n>", I think you
should use
read.fwf() instead of read.table()
and then you might not need all the string manipulation you do
above.
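A minimal sketch of that approach, assuming the 30-character layout you
describe (Year in positions 1-4, Month 5-6, Day 7-8, Site 9-12, then
three 6-character numeric fields). The two sample lines are taken from
your data above, with their fixed-width spacing restored; for the real
files you would pass a file name instead of a textConnection().

```r
## Widths of the seven fixed-width fields, as described in your post:
## Year (4), Month (2), Day (2), Site (4), Precip (6), Tmin (6), Tmax (6).
widths <- c(4, 2, 2, 4, 6, 6, 6)

## Two lines from the sample data.  Note "0.00-10.69": read.fwf() splits
## by position, so the missing space between fields is no problem.
sample.lines <- c("1971 1 1GGG1  0.00 -3.68 -0.29",
                  "1971 1 1GGG3  0.00-10.69  5.69")

dat <- read.fwf(textConnection(sample.lines),
                widths     = widths,
                col.names  = c("Year", "Month", "Day", "Site",
                               "Precipitation", "Tmin", "Tmax"),
                colClasses = c("numeric", "numeric", "numeric",
                               "character",
                               "numeric", "numeric", "numeric"),
                strip.white = TRUE)
```

For the 1000 files you could then do something like
lapply(list.files("simdir", full.names = TRUE), read.fwf, widths = widths, ...),
which reads one file at a time instead of holding all the raw strings and
intermediate copies in memory at once (the directory name "simdir" is
only a placeholder).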
Martin
> Thanks for your great help.
> Zilefac.
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.