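See ?save and ?load. The ... arguments to save() are the *names* of the objects, given either as symbols or as character strings, so save(d, file = ...) and save("d", file = ...) both save d. The problem is in the load step: load() restores the objects under the names they were saved with and returns only a character vector of those names. After load("compact_d.Rdata") your data frame is simply available as d again; newdata just holds the string "d", which is why attach(newdata) goes looking for a file called 'd'.

Also, don't use attach and detach. And read this about factors, which applies if your factor has many levels but can be ignored if not: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg92970.html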
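A minimal sketch of the save/load round trip, in case it helps. It assumes a small made-up data frame called dat (the column names follow your post, the values are invented) and a temporary file, and it changes the columns with $ rather than attach():

```r
# Hypothetical toy data standing in for the real 2M-row CSV.
dat <- data.frame(
  job_id        = 1:6,
  sector_id     = c(6, 11, 12, 6, 11, 11),
  sqft          = c(0, 3, 4, 192, 0, 17),
  building_type = c(1, 2, 2, 2, 1, 2)
)

# Modify the columns in place instead of attach()-ing the data frame.
dat$sqft[dat$sqft < 1] <- NA
dat$sector_id     <- factor(dat$sector_id)
dat$building_type <- factor(dat$building_type)

# save() takes the object as a symbol or as a quoted name.
rdata_file <- tempfile(fileext = ".RData")
save(dat, file = rdata_file)   # same effect as save("dat", file = rdata_file)

# load() recreates the object under its saved name and returns
# only a character vector of the names it restored.
rm(dat)
loaded_names <- load(rdata_file)
print(loaded_names)
str(dat)

# If you want the data under a different name, look it up by name.
newdata <- get(loaded_names)
```

For a single object, saveRDS() and readRDS() are an alternative that returns the object itself, so newdata <- readRDS(file) does what you expected load() to do.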
On 8/21/07, Jessica Z <[EMAIL PROTECTED]> wrote:
> Hello List, i have been agonizing over this for days, any reply would be
> greatly appreciated!
>
> Situation:___________________________________
> My original dataset is a .csv dataset (w/ 2M records) with 4 variables:
> job_id (primary key, won't be used for analysis, just used to join tables),
> sector_id (categorical variable, for 19 industry sectors),
> sqft (continuous variable for square footage),
> building_type (categorical, for 2 building types)
> some values of sqft were entered wrong, so i'd like to set sqft<1 to "NA"
> and then use aregImpute() to impute those NAs.
>
> Problem: the original dataset (.csv format) is too large. though i could read
> that dataset into R, i could not get aregImpute() to run even when i set the
> memory limit to 3G! (yes, i did the switch in windows to reach 3G rather
> than 2G)
>
> Goal: try to find a way to slim down my dataset so as to get aregImpute()
> running.
>
> What i did:________________________________
> i searched the archive, and found someone said, as R tends to inflate
> memory, it is a good idea to first read the original dataset into R --> then
> save it as a more compact binary file using save() --> and then reload the
> compact binary file back into R using load(). this way would reduce the
> memory allocation.
>
> HOWEVER, after i saved my original dataset into a compact binary file using
> save(), and used load("filename.Rdata") to reload the new compact data
> format into R, I could not figure out how to retrieve all my variables!!! R
> shows the new dataset is not a list, nor a matrix, nor a dataframe, but just a
> character with length 1!!! and there is no way i could do attach().
>
> i generated a 1K-row subset out of my original dataset to illustrate my
> problem (does anyone know how to get my four variables back from this
> "compact binary" new dataset? what did i do wrong?):
>
> > data <- read.table (file.choose(),header=T,sep=",")
> > summary(data)
>      job_id         sector_id           sqft        building_type
>  Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
>  1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
>  Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
>  Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
>  3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
>  Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000
> >
> > attach(data)
> > sqft[sqft<1] <- NA
> > sector.f <- as.factor(sector_id)
> > building_type.f <- as.factor (building_type)
> > d <- data.frame(job_id,sector.f,sqft, building_type.f)
> > summary (d)
>      job_id        sector.f       sqft        building_type.f
>  Min.   :   1.0   6 :340   Min.   :  3.00   1:  4
>  1st Qu.: 250.8   11:505   1st Qu.:  4.00   2:996
>  Median : 500.5   12:155   Median :  4.00
>  Mean   : 500.5            Mean   : 14.16
>  3rd Qu.: 750.3            3rd Qu.: 17.00
>  Max.   :1000.0            Max.   :192.00
>                            NA's   :118.00
> > save (d, file="compact_d.Rdata", ascii=FALSE)
> >
> > newdata <- load ("compact_d.Rdata")
> >
> > summary(newdata)
>    Length     Class      Mode
>         1 character character
> > attach(newdata)
> Error in attach(newdata) : file 'd' not found
> > is.data.frame (newdata)
> [1] FALSE
> > is.list (newdata)
> [1] FALSE
> > is.matrix (newdata)
> [1] FALSE
> >
> _________________________________
> btw, i also tried to just save (into compact binary) and reload (the new
> compact binary data format) (as i could do the "NA" stuff in sql anyhow).
> however, i still got stuck at the same spot:
>
> > data <- read.table (file.choose(),header=T,sep=",")
> > summary(data)
>      job_id         sector_id           sqft        building_type
>  Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
>  1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
>  Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
>  Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
>  3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
>  Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000
> > save (data, file="compact_data.Rdata", ascii=FALSE)
> > newdata <- load ("compact_data.Rdata")
> > summary(newdata)
>    Length     Class      Mode
>         1 character character
> > attach(newdata)
> Error: restore file may be empty -- no data loaded
> In addition: Warning message:
> file 'data' has magic number ''
>    Use of save versions prior to 2 is deprecated
> > is.data.frame (newdata)
> [1] FALSE
> > is.list (newdata)
> [1] FALSE
> > is.matrix (newdata)
> [1] FALSE
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.