[R] anova or likelihood ratio test from biglm output
(Sorry if this is a repost, I got a bounce reply from the r-help server.)

Hi,

I'm using the biglm() function to fit some linear models to a very large data set that lm() can't fit due to memory issues (the problem is with the number of interactions; I can fit the main effects model).

I need to determine whether the 2-way interactions are necessary. Ideally I'd like to use anova() to get an ANOVA table and a p-value for the interactions, however it appears that anova() is not supported for biglm objects.

So my next idea was to compare the main effects model with the 2-way interaction model using a likelihood ratio test. I seem to be able to get the deviance and residual DF from a biglm object, so I think I should be able to calculate the LRT and get my p-value if I assume a chi-squared distribution.

I was wondering if anyone sees any problems with this approach (or would be kind enough to confirm it)? Or has any better suggestions, ideas or comments?

Thank you

Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation, Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax) +612 4782 9023
ch...@trickysolutions.com.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
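A note on the comparison described in this message, offered as a sketch rather than a definitive answer. For a Gaussian linear model the deviance is the residual sum of squares, so the raw difference in deviances is not chi-squared distributed unless it is scaled by an estimate of the error variance; the usual nested-model comparison is the F test. The helper `nested_f_test()` below is a hypothetical name, and the example uses lm() on the built-in mtcars data only so that it runs anywhere. With biglm fits the same four inputs would presumably come from the deviance and residual DF the poster mentions, which is an assumption to verify against the biglm documentation.

```r
# F test for two nested linear models, computed only from each model's
# residual sum of squares (RSS) and residual degrees of freedom.
nested_f_test <- function(rss_small, df_small, rss_big, df_big) {
  df_num <- df_small - df_big                       # extra parameters in the bigger model
  f_stat <- ((rss_small - rss_big) / df_num) / (rss_big / df_big)
  p_value <- pf(f_stat, df_num, df_big, lower.tail = FALSE)
  c(F = f_stat, df_num = df_num, df_den = df_big, p = p_value)
}

# Illustration with lm(), where deviance() is the RSS.
fit0 <- lm(mpg ~ wt + hp, data = mtcars)   # main effects only
fit1 <- lm(mpg ~ wt * hp, data = mtcars)   # adds the 2-way interaction
res <- nested_f_test(deviance(fit0), df.residual(fit0),
                     deviance(fit1), df.residual(fit1))
print(res)
```

With very large residual degrees of freedom the F test and the variance-scaled chi-squared test give nearly the same p-value, so the distinction matters most for the scaling step rather than for the choice of reference distribution.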
Re: [R] Can R handle a matrix with 8 billion entries?
Thanks Corey,

I've looked into them before and I don't think they can help me with this problem.

The Big functions are great for handling and analysing data sets that are too big for R to store in memory. However I believe my problem goes one step beyond that, in that my distance matrix has more entries than R's architecture can address as a single object, even if I had enough memory to store it.

Again, I'm no expert in this so I may be wrong.

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and Innovation, Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax / office)
ch...@trickysolutions.com.au

Disclaimer: The information in this email and any attachments to it are confidential and may contain legally privileged information. If you are not the named or intended recipient, please delete this communication and contact us immediately. Please note you are not authorised to copy, use or disclose this communication or any attachments without our consent. Although this email has been checked by anti-virus software, there is a risk that email messages may be corrupted or infected by viruses or other interferences. No responsibility is accepted for such interference. Unless expressly stated, the views of the writer are not those of the company. Tricky Solutions always does our best to provide accurate forecasts and analyses based on the data supplied, however it is possible that some important predictors were not included in the data sent to us. Information provided by us should not be solely relied upon when making decisions and clients should use their own judgement.

*From:* Corey Dow-Hygelund [mailto:godelsthe...@gmail.com]
*Sent:* Thursday, 11 August 2011 3:00 AM
*To:* Chris Howden
*Cc:* r-help@r-project.org
*Subject:* Re: [R] Can R handle a matrix with 8 billion entries?

You might want to look into the packages bigmemory and biganalytics.
Corey
[R] Can R handle a matrix with 8 billion entries?
Hi,

I’m trying to do a hierarchical cluster analysis in R with a Big Data set. I’m running into problems using the dist() function.

I’ve been looking at a few threads about R’s memory and have read the memory limits section in R help. However I’m no computer expert so I’m hoping I’ve misunderstood something and R can handle my Big Data set, somehow. At the moment I think my dataset is simply too big and there is no way around it, but I’d like to be proved wrong!

My data set has 90523 rows of data and 24 columns. My understanding is that this means the distance matrix has a minimum of 90523^2 elements, which is 8194413529. That roughly translates to 8GB of memory being required (if I assume each entry requires 1 byte). I only have 4GB on a 32-bit build of Windows and R, so there is no way that’s going to work.

So then I thought of getting access to a more powerful computer, and maybe using cloud computing. However the R memory limit help mentions “On all builds of R, the maximum length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9”. Now as the distance matrix I require has more elements than this, does this mean it’s too big for R no matter what I do?

Any ideas would be welcome.

Thanks.

Chris Howden
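The sizes discussed in this message can be checked with a few lines of arithmetic. This is a sketch under two stated assumptions: that distances are stored as double-precision numbers (8 bytes each), and that only the lower triangle, n(n-1)/2 values, needs to be kept, which is how a "dist" object is laid out. The variable names are illustrative only.

```r
n <- 90523                       # rows in the data set

full_entries <- n^2              # entries in a full n x n matrix
lower_tri    <- n * (n - 1) / 2  # entries a "dist" object would hold

bytes_per_double <- 8            # assumption: double-precision storage
to_gib <- function(bytes) bytes / 2^30

cat("full matrix entries:   ", format(full_entries, big.mark = ","), "\n")
cat("lower-triangle entries:", format(lower_tri, big.mark = ","), "\n")
cat("full matrix, GiB:      ", round(to_gib(full_entries * bytes_per_double), 1), "\n")
cat("lower triangle, GiB:   ", round(to_gib(lower_tri * bytes_per_double), 1), "\n")

# Does even the lower triangle exceed the 2^31 - 1 element limit quoted
# from the R help page?
exceeds_limit <- lower_tri > .Machine$integer.max
cat("exceeds 2^31 - 1 elements:", exceeds_limit, "\n")
```

The quoted 2^31 - 1 limit describes R as it was when this thread was written; since R 3.0.0, 64-bit builds support longer vectors, although whether a particular function such as dist() accepts them is a separate question to check in its documentation.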
Re: [R] Is there a better way to parse strings than this?
Thanks for the explanation, I think I understand it now.

So to paraphrase all your explanations:

To match a literal "." the regular expression engine needs to receive \.\.\. (the backslash escapes the special meaning of "."). But in order to get each \ into the string being passed to the function I also need to escape its special meaning inside an R string, so I need to write "\\.\\.\\."

Chris Howden

-----Original Message-----
From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of Hadley Wickham
Sent: Friday, 15 April 2011 11:07 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] Is there a better way to parse strings than this?

> I was trying strsplit(string,"\.\.\.") as per the suggestion in Venables
> and Ripley's book to "(use '\.' to match '.')", which is in the Regular
> expressions section.
>
> I noticed that in the suggestions sent to me people used:
> strsplit(test,"\\.\\.\\.")
>
> Could anyone please explain why I should have used "\\.\\.\\." rather than
> "\.\.\."?

Basically,

* you want to match .
* so the regular expression you need is \.
* and the way you represent that in a string in R is \\.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
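The two levels of escaping summarised in this exchange can be seen directly in a short session. This is a minimal sketch using one of the variable names from the thread; the object names `x`, `pattern`, `parts` and `parts_fixed` are illustrative only.

```r
x <- "A5.Brands.bought...Dulux"

# The R string literal "\\.\\.\\." holds six characters: backslash, dot,
# three times over. That six-character text is what the regex engine sees.
pattern <- "\\.\\.\\."
nchar(pattern)
cat(pattern, "\n")   # prints the pattern as the regex engine receives it

# Split on three literal dots using the escaped regular expression.
parts <- strsplit(x, pattern)[[1]]

# Alternative: skip regular expressions entirely and match the text "..."
# as-is, so no escaping is needed.
parts_fixed <- strsplit(x, "...", fixed = TRUE)[[1]]

print(parts)
print(parts_fixed)
```

Using fixed = TRUE is often the simpler choice when the separator is a known literal string, since it avoids both layers of escaping.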
Re: [R] Is there a better way to parse strings than this?
Thanks for the suggestions, they were all exactly what I was looking for. (I knew there had to be a more elegant way than my brute force method.)

One question though. I was playing around with strsplit() but couldn't get it to work. I realised my problem was that I was using "." as the string.

I was trying strsplit(string,"\.\.\.") as per the suggestion in Venables and Ripley's book to "(use '\.' to match '.')", which is in the Regular expressions section.

I noticed that in the suggestions sent to me people used: strsplit(test,"\\.\\.\\.")

Could anyone please explain why I should have used "\\.\\.\\." rather than "\.\.\."?

Chris Howden

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Wednesday, 13 April 2011 10:55 PM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] Is there a better way to parse strings than this?

Try this:

> x <- c("A5.Brands.bought...Dulux", "A5.Brands.bought...Haymes",
+ "A5.Brands.bought...Solver")
>
> do.call(rbind, strsplit(x, "...", fixed = TRUE))
     [,1]               [,2]
[1,] "A5.Brands.bought" "Dulux"
[2,] "A5.Brands.bought" "Haymes"
[3,] "A5.Brands.bought" "Solver"
>
> # or
> xa <- sub("...", "\1", x, fixed = TRUE)
> read.table(textConnection(xa), sep = "\1", as.is = TRUE)
                V1     V2
1 A5.Brands.bought  Dulux
2 A5.Brands.bought Haymes
3 A5.Brands.bought Solver

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] Is there a better way to parse strings than this?
Hi Everyone,

I needed to parse some strings recently. The code I've wound up using seems rather clunky, and I was wondering if anyone had any suggestions on a better way?

Basically I do the following:

1) Use substr() to do the parsing
2) Use regexpr() to find the location of the string I want to parse on, which I then pass on to substr()
3) Use nchar() as the stop input to substr() where necessary

I've got a simple example of the parsing code I used below. It takes questionnaire variable names that include the question and the brand it was answered for, and then parses them so the variable name and the brand are in separate columns. I then use this to restructure the data from unstacked to stacked, but that's another story.

> # this is the data set
> test
[1] "A5.Brands.bought...Dulux"
[2] "A5.Brands.bought...Haymes"
[3] "A5.Brands.bought...Solver"
[4] "A5.Brands.bought...Taubmans.or.Bristol"
[5] "A5.Brands.bought...Wattyl"
[6] "A5.Brands.bought...Other"

> # Where do I want to parse?
> break1 <- regexpr('...',test, fixed=TRUE)
> break1
[1] 17 17 17 17 17 17
attr(,"match.length")
[1] 3 3 3 3 3 3

> # Put Variable name in a variable
> str1 <- substr(test,1,break1-1)
> str1
[1] "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought" "A5.Brands.bought"
[5] "A5.Brands.bought" "A5.Brands.bought"

> # Put Brand name in a variable
> str2 <- substr(test,break1+3, nchar(test))
> str2
[1] "Dulux"               "Haymes"              "Solver"
[4] "Taubmans.or.Bristol" "Wattyl"              "Other"

Thanks for any and all suggestions

Chris Howden
Re: [R] Memory issues
Hi Emmanuel,

Try the following:

1) Remove unnecessary programs from memory; this might give you a larger contiguous memory block for R.
2) Remove unnecessary data from R's memory: any preceding data sets you no longer need can be removed with rm(). You might need to run gc() afterwards to ensure the freed memory is available.
3) Make sure you've assigned as much memory to R as possible using memory.limit() (on Windows).

And make sure you have R's

Chris Howden

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Emmanuel Bellity
Sent: Monday, 17 January 2011 4:53 AM
To: r-help@r-project.org
Subject: [R] Memory issues

Hi,

I have read several threads about memory issues in R and I can't seem to find a solution to my problem.

I am running a sort of LASSO regression on several subsets of a big dataset. For some subsets it works well, and for some bigger subsets it does not work, with errors of type "cannot allocate vector of size 1.6Gb". The error occurs at this line of the code:

example <- cv.glmnet(x=bigmatrix, y=price, nfolds=3)

It also depends on the number of variables that were included in "bigmatrix".

I tried on R and R64 for both Mac and R for PC but recently went onto a faster virtual machine on Linux thinking I would avoid any memory issues. It was better but still had some limits, even though memory.limit indicates "Inf".

Is there any way to make this work or do I have to cut a few variables in the matrix or take a smaller subset of data? I have read that R is looking for some contiguous bits of memory and that maybe I should pre-allocate the matrix? Any idea?

Many thanks

Emmanuel
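The suggestion above to drop objects with rm() and then call gc() can be sketched as follows. The object name `big` and its dimensions are made up for illustration; how much memory is actually returned to the operating system depends on the platform, so this shows the mechanics rather than a guaranteed saving.

```r
# Create a throwaway object large enough to notice (about 32 MB of doubles).
big <- matrix(0, nrow = 2000, ncol = 2000)
size_before <- object.size(big)
print(size_before, units = "MB")

# List the largest objects in the workspace to decide what can go.
sizes <- sapply(ls(), function(nm) as.numeric(object.size(get(nm))))
print(head(sort(sizes, decreasing = TRUE), 5))

# Remove the object, then ask R to run a garbage collection so the freed
# memory becomes available for later allocations.
rm(big)
invisible(gc())

still_there <- exists("big", inherits = FALSE)
cat("'big' still in workspace:", still_there, "\n")
```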
Re: [R] is there a way to update both packages if they occur in 2 libraries?
Thanks for the explanation Brian,

I used summary(packageStatus()) to have a look at what was available and what was in each library, and then deleted all packages that ship with R 2.12.0 from my personal library.

And everything now works.

Chris Howden

-----Original Message-----
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: Wednesday, 20 October 2010 10:11 PM
To: Uwe Ligges
Cc: Chris Howden; r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?

On Wed, 20 Oct 2010, Uwe Ligges wrote:

> Ah, I haven't read your original message carefully enough: Package foreign is
> a base package. Base packages should only be in the R base library, not in
> any other library. They cannot be updated via update.packages().

Not quite (and it is a recommended not a base package). They can be updated *if updates are available*.

update.packages(checkBuilt=TRUE) cannot update packages that are not currently on the selected repositories. This includes

* For Windows binaries, the recommended packages (which you have anyway in .Library) until later versions than those in 2.12.0 are available.
* Any packages which have been withdrawn.
* Any packages for which binaries are not available for R 2.12.x (and there are few, see http://cran.r-project.org/bin/windows/contrib/2.12/ReadMe and those with ERROR on http://cran.r-project.org/bin/windows/contrib/checkSummaryWin.html).

A useful check is to run

summary(packageStatus())

which reports packages which are unavailable in each library, directly via (for the jth library)

summary(packageStatus())$Libs[[j]]$unavailable
Re: [R] is there a way to update both packages if they occur in 2 libraries?
Thanks Uwe,

I was wondering if it was something like that. I'll delete the base packages from my personal library.

And just as a comment: although I'm a rather new user of R (as you may have guessed), I gather that every now and then popular and necessary packages are added to base R. So I'm guessing the problem I was having would occur whenever this happens and people have the old package in their personal libraries (which would likely be the case if it's considered good enough to add to base R).

Not really a 'bug' in R. But something I'll remember!

Thanks again.

Chris Howden

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Wednesday, 20 October 2010 9:38 PM
To: Chris Howden
Cc: r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?

On 20.10.2010 13:59, Chris Howden wrote:
> However when I try to load the foreign package I get an error message
> telling me "package 'foreign' was built before R 2.10.0: please re-install
> it".

Ah, I haven't read your original message carefully enough: Package foreign is a base package. Base packages should only be in the R base library, not in any other library. They cannot be updated via update.packages().

Best wishes,
Uwe
Re: [R] is there a way to update both packages if they occur in 2 libraries?
Thanks Uwe,

It may operate like that on most people's machines, but either it's not operating like that on mine, or I have another problem :-(

As you can see from my code below, I've run update.packages(checkBuilt=TRUE) and my 'private' library is in my .libPaths().

However when I try to load the foreign package I get an error message telling me "package 'foreign' was built before R 2.10.0: please re-install it". But then if I remove my private library from the search path I can load foreign, so this suggests the problem is with the foreign package in my 'private' library.

Furthermore, if I look at the DESCRIPTION file for foreign it claims to have been built under R 2.9.2 (I've copied it below).

I'm concluding the issue is with the foreign package in my private library, since it claims to have been built under R 2.9.2 and I can get the package to load if I remove my private library from the library search path and load the foreign package from the base library. I'm then concluding the problem is due to it not updating, since the DESCRIPTION file claims it was built under R version 2.9.2, and due to the error message I'm getting, i.e. "'foreign' was built before R 2.10.0: please re-install it".

BUT I'm happy to be proven wrong... I just can't think of what else the problem could be?

FOREIGN DESCRIPTION FILE

Package: foreign
Priority: recommended
Version: 0.8-39
Date: 2010-01-03
Title: Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, ...
Depends: R (>= 2.6.0), stats
Imports: methods, utils
Maintainer: R-core
Author: R-core members, Saikat DebRoy, Roger Bivand and others: see COPYRIGHTS file in the sources.
Description: Functions for reading and writing data stored by statistical packages such as Minitab, S, SAS, SPSS, Stata, Systat, ..., and for reading and writing .dbf (dBase) files.
LazyLoad: yes
License: GPL (>= 2)
BugReports: http://bugs.r-project.org
Packaged: 2010-01-03 10:24:13 UTC; ripley
Repository: CRAN
Date/Publication: 2010-01-03 14:06:04
Built: R 2.9.2; i386-pc-mingw32; 2010-01-03 23:21:40 UTC; windows

Chris Howden

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Wednesday, 20 October 2010 9:15 PM
To: Chris Howden
Cc: r-help
Subject: Re: [R] is there a way to update both packages if they occur in 2 libraries?

update.packages() updates all packages in all libraries listed in .libPaths() unless you specify an explicit library.

It may happen that the version number has not changed and you just want to reinstall for your upgraded R. In that case use:

update.packages(checkBuilt=TRUE)

Best,
Uwe Ligges
[R] is there a way to update both packages if they occur in 2 libraries?
Hi everyone,

I've recently added a private library as a way to manage my R packages. I did this by simply copying my old library to a new folder and then linking this to R by setting my R_LIBS environment variable in .Renviron.

However, I have run into a problem. When I update my packages, it is not updating those that are current in the base R library. This means I can't load packages that are included in base R, since R looks in my private library first, and when it finds the package there it tries to use it. But it's an outdated version.

The easiest solution I can think of is to update both libraries, but when I run update.packages(lib.loc="private library location", ask = FALSE, checkBuilt=TRUE) it's not updating them.

So I was wondering if there is a way to update all packages that occur in all libraries?

(Note that I can think of other solutions to my problem, but they are all time consuming and defeat the purpose of having a private library, i.e. that it makes updating R easier since I don't need to copy over the library folder each time nor update any environment variables. So far the best alternative I've come up with is to delete all the duplicate base R packages from my private library.)

If anyone is interested, the code I used to understand my problem is below.
Thanks everyone

> update.packages(lib.loc="C:\\Program Files\\R\\library", ask = FALSE, checkBuilt=TRUE)
--- Please select a CRAN mirror for use in this session ---
> update.packages(ask = FALSE, checkBuilt = TRUE)

Foreign package won't load:

> library(foreign)
Error: package 'foreign' was built before R 2.10.0: please re-install it
> .libPaths()
[1] "C:\\Program Files\\R\\library" "C:/PROGRA~1/R/R-212~1.0/library"
> .libPaths("new")
> .libPaths()
[1] "C:/PROGRA~1/R/R-212~1.0/library"

Foreign package will load:

> library(foreign)

Chris Howden Founding Partner Tricky Solutions Tricky Solutions 4 Tricky Problems Evidence Based Strategic Development, IP development, Data Analysis, Modelling, and Training (mobile) 0410 689 945 (fax / office) (+618) 8952 7878 ch...@trickysolutions.com.au

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
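To see which packages are installed in more than one library (the situation described in this post), something along the following lines may help. It is only a sketch using base R's installed.packages(); the object names dup_names and dups are illustrative and not part of the original thread.

```r
# List every package that is installed in more than one library on .libPaths().
# installed.packages() scans all libraries unless lib.loc is given.
ip <- installed.packages()

# Package names that occur more than once across the libraries
dup_names <- unique(ip[duplicated(ip[, "Package"]), "Package"])

# Keep only the columns needed to compare the copies
dups <- ip[ip[, "Package"] %in% dup_names,
           c("Package", "LibPath", "Version", "Built"),
           drop = FALSE]
dups <- dups[order(dups[, "Package"], dups[, "LibPath"]), , drop = FALSE]

print(.libPaths())          # search order: the first library wins
print(dups, quote = FALSE)  # one row per installed copy
```

Copies whose Built column shows an older R version are candidates for re-installation with update.packages(checkBuilt = TRUE), or for removal from the private library with remove.packages() if the copy in the base library should be the one that loads.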
Re: [R] can't find and install reshape2??
Thanks for the ideas,

Just wanted to say that it was because I was using an old version of R (as you suggested). I have now updated to R 2.12.0 and I can see and load reshape2.

(And I agree with Hadley that it would be nice if there were some way of getting a more informative error message. However, thanks to the helpful R community I now know what to do if I have a similar problem in the future.)

Chris Howden

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bernardo Rangel Tura
Sent: Tuesday, 12 October 2010 6:53 PM
To: r-help
Subject: Re: [R] can't find and install reshape2??

On Mon, 2010-10-04 at 10:27 +0930, Chris Howden wrote:
> Hi everyone,
>
> I'm trying to install reshape2.
>
> But when I click on install package it's not coming up!?!?! I'm getting reshape, but no reshape2?
>
> I've also tried download.packages(reshape2, destdir="c:\\") & download.packages(Reshape2, destdir="c:\\") but no luck!!!
>
> Does anyone have any ideas what could be going on?
>
> Chris Howden

Hi Chris,

I have two guesses:

1- You don't have the 'stringr' package installed
2- Your R is outdated

Try these two things and after that mail me.

-- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil
Re: [R] merging and working with BIG data sets. Is sqldf the best way??
Thanks for the advice Gabor,

I was indeed not starting and finishing with sqldf(), which was why it was not working for me. Please forgive a blatantly obvious mistake.

I have tried what you suggested and unfortunately R is still having problems doing the join. The problem seems to be one of memory, since I receive the error message below when I run the natural join using sqldf.

Error: cannot allocate vector of size 250.0 Mb
Timing stopped at: 32.8 0.48 34.79

I have tried it on a subset of the data and it works. So I think it's a memory issue, caused by my very large dataset (11 million rows and 2 columns).

I think I may have to admit that R cannot do this (on my machine), and try doing it in a full-blown database such as PostgreSQL. Unless you (or anyone else) have any other suggestions?

Thanks again for your help.

For anyone who's interested, here's all my code and R log.

##
# Info on input data
##
> class(A)
[1] "data.frame"
> class(B)
[1] "data.frame"
> names(A)
[1] "POINTID" "alistair"
> names(B)
[1] "POINTID" "alistair_range"
> dim(A)
[1] 11048592 2
> dim(B)
[1] 11048592 2

##
# Tried the join with an index on the entire data set
##
> sqldf()
> system.time(sqldf("create index ai1 on A(POINTID, alistair)"))
   user  system elapsed
  76.85    0.34   79.67
> system.time(sqldf("create index ai2 on B(POINTID, alistair_range)"))
   user  system elapsed
  75.43    0.43   77.16
> system.time(sqldf("select * from main.A natural join main.B"))
Error: cannot allocate vector of size 250.0 Mb
Timing stopped at: 32.8 0.48 34.79
> sqldf()
Error in sqliteCloseConnection(conn, ...) :
  RS-DBI driver: (close the pending result sets before closing this connection)

##
# Also tried the join with an index built from only the variable I intend to merge on, since I wasn't exactly sure which index was correct.
##
> sqldf()
> system.time(sqldf("create index ai1 on A(POINTID)"))
   user  system elapsed
  66.67    0.44   69.28
> system.time(sqldf("create index ai2 on B(POINTID)"))
   user  system elapsed
  68.18    0.31   68.73
> system.time(sqldf("select * from main.A natural join main.B"))
Error: cannot allocate vector of size 31.2 Mb
Timing stopped at: 10.56 0.04 10.87
> sqldf()
Error in sqliteCloseConnection(conn, ...) :
  RS-DBI driver: (close the pending result sets before closing this connection)

##
# and some memory info
##
> memory.size()
[1] 412.6
> memory.size(NA)
[1] 4095

Chris Howden

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Friday, 15 October 2010 1:03 PM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] merging and working with BIG data sets. Is sqldf the best way??

On Thu, Oct 14, 2010 at 10:56 PM, Chris Howden wrote:
> Thanks for the suggestion and code Gabor,
>
> I've tried creating 2 indices:
>
> 1) just for the variable I intend to merge on
> 2) on the entire data set I am merging (which I think is the one I should be using??)
>
> However neither seemed to work. The first was still going after 2 hours, and the second after 12 hours, so I stopped the join.
>
> If it's not too much bother I was wondering if U could let me know which index I should be using?
> > > Or in other words since I plan to merge using POINTID do I create an index > on > > system.time(sqldf("create index ai1 on A(POINTID)")) > system.time(sqldf("create index ai2 on B(POINTID)")) > > or > > system.time(sqldf("create index ai1 on A(POINTID,alistair)")) > system.time(sqldf("create index ai2 on B(POINTID, alistair_range)") > > > > I'm now using the following join statement > system.time(data2 <- sqldf("select * from A natural join B")) > If you only ran the three sqldf statements you mentioned in your post (thereby omitting 2 of the 5 sqldf calls in example 4i): sqldf("create...") sqldf("create...") sqldf("select...") then what you are doing is to create a database, upload your data to it, create an index on it, destroy the database, then create a second database, upload the data to this second database, create an second index and then destroy that database too and then finally create a third database, upload the data to it and then do a join without any indexes. You must bracket all this with empty sqldf calls as shown in 4i to force persistence: sqldf() sqldf("create...") s
Re: [R] merging and working with BIG data sets. Is sqldf the best way??
Thanks for the suggestion and code Gabor,

I've tried creating 2 indices:

1) just for the variable I intend to merge on
2) on the entire data set I am merging (which I think is the one I should be using??)

However, neither seemed to work. The first join was still going after 2 hours, and the second after 12 hours, so I stopped it.

If it's not too much bother, I was wondering if you could let me know which index I should be using? In other words, since I plan to merge using POINTID, do I create an index on

system.time(sqldf("create index ai1 on A(POINTID)"))
system.time(sqldf("create index ai2 on B(POINTID)"))

or

system.time(sqldf("create index ai1 on A(POINTID,alistair)"))
system.time(sqldf("create index ai2 on B(POINTID, alistair_range)"))

I'm now using the following join statement:

system.time(data2 <- sqldf("select * from A natural join B"))

thanks

Chris Howden

-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Thursday, 14 October 2010 9:02 AM
To: Chris Howden
Cc: r-help@r-project.org
Subject: Re: [R] merging and working with BIG data sets. Is sqldf the best way??

On Tue, Oct 12, 2010 at 2:39 AM, Chris Howden wrote:
> I'm working with some very big datasets (each dataset has 11 million rows and 2 columns). My first step is to merge all my individual data sets together (I have about 20)
>
> I'm using the following command from sqldf
>
> data1 <- sqldf("select A.*, B.* from A inner join B using(ID)")
>
> But it's taking A VERY VERY LONG TIME to merge just 2 of the datasets (well over 2 hours, possibly longer since it's still going).

You need to add indexes to your tables. See example 4i on the sqldf home page http://sqldf.googlecode.com This can result in huge speedups for large tables.
-- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merging and working with BIG data sets. Is sqldf the best way??
Hi everyone,

I'm working with some very big datasets (each dataset has 11 million rows and 2 columns). My first step is to merge all my individual data sets together (I have about 20).

I'm using the following command from sqldf:

data1 <- sqldf("select A.*, B.* from A inner join B using(ID)")

But it's taking A VERY VERY LONG TIME to merge just 2 of the datasets (well over 2 hours, possibly longer since it's still going).

I was wondering if anyone could suggest a better way, or maybe some suggestions on how I could tweak my computer setup to speed it up?

I've looked at the following packages, and sqldf is the only way I've found to actually merge large data sets in R. These packages seem great for accessing large data sets without storing them in RAM, but I can't see how they can be used to merge data sets together:

- ff
- filehash
- bigmemory

Does anyone have any ideas?

At the moment my best idea is to hand it over to someone with a dedicated database server and get them to do the merges (and then hope package biglm can do the modelling).

Thanks for any ideas at all!!

Chris Howden
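For two-column tables that share a unique key, one alternative worth trying before reaching for a database is a lookup with base R's match(), which avoids building the intermediate structures that a general merge needs. The sketch below is an assumption-laden illustration, not something proposed in the thread: it uses small simulated data, borrows the column names POINTID, alistair and alistair_range from later posts in this thread, and assumes each key occurs at most once in B.

```r
# Simulated stand-ins for the real tables (sizes and values are made up)
set.seed(1)
n <- 1000
A <- data.frame(POINTID = sample(n), alistair = rnorm(n))
B <- data.frame(POINTID = sample(n), alistair_range = runif(n))

# The approach relies on the key being unique in the lookup table
stopifnot(!anyDuplicated(B$POINTID))

# For every row of A, find the position of its key in B (NA if absent)
idx <- match(A$POINTID, B$POINTID)

# Inner join: keep rows of A with a partner in B and copy the B column across
keep <- !is.na(idx)
joined <- A[keep, , drop = FALSE]
joined$alistair_range <- B$alistair_range[idx[keep]]
```

Repeating the last step for each of the remaining tables adds one column at a time, so only one extra vector of length nrow(A) is created per table. Whether this fits in memory on a given machine still depends on the data.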
Re: [R] can't find and install reshape2??
Just wanted to say that I've gone onto the CRAN website and downloaded it directly from there. So its no longer a problem for me. But it may be one for other people, it is kinda weird I couldn't see it on the list of packages on 4 mirrors!! Thanks for your help though. -Original Message----- From: chris howden [mailto:tall.chr...@yahoo.com.au] Sent: Tuesday, 12 October 2010 3:48 PM To: 'Jeffrey Spies'; 'David Winsemius' Cc: 'r-help@r-project.org' Subject: RE: [R] can't find and install reshape2?? Hi Guys, Thanks for your suggestions and sorry for the delay in replying, I've been having one of those weeks. I feel a little silly not trying the package name input as a character string, I should have know that. However I have tried your suggestions and neither worked. The code and error messages are at the bottom of this email and U can see the reason would appear the "reshape2" package is not available on the repositories I'm trying to access. I then tried closing R, reopening it and looking in the following CRAN mirrors: Australia UK(London) Canada(BC) USA(AZ) Reshape 2 was in none of them, my choices were: ResearchMethods Reshape ResistorArray But no reshape2 Any ideas as to why I can't see reshape2? Is it just me or are other people having this problem? thanks > download.packages('reshape2', destdir="c:\\") Warning in download.packages("reshape2", destdir = "c:\\") : no package 'reshape2' at the repositories [,1] [,2] > install.packages('reshape2') Warning message: In getDependencies(pkgs, dependencies, available, lib) : package reshape2 is not available -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey Spies Sent: Monday, 4 October 2010 10:57 AM To: Chris Howden Cc: r-help@r-project.org Subject: Re: [R] can't find and install reshape2?? The first argument in download.packages should be of type character or a vector of characters. 
This worked for me: install.packages('reshape2') as did: download.packages('reshape2', '~/Downloads/') Cheers, Jeff. On Sun, Oct 3, 2010 at 8:57 PM, Chris Howden wrote: > Hi everyone, > > > > Im trying to install reshape2. > > > > But when I click on install package its not coming up!?!?! Im getting > reshape, but no reshape2? > > > > Ive also tried download.packages(reshape2, destdir="c:\\") & > download.packages(Reshape2, destdir="c:\\") but no luck!!! > > > > Does anyone have any ideas what could be going on? > > > > Chris Howden > > Founding Partner > > Tricky Solutions > > Tricky Solutions 4 Tricky Problems > > Evidence Based Strategic Development, IP development, Data Analysis, > Modelling, and Training > > (mobile) 0410 689 945 > > (fax / office) (+618) 8952 7878 > > ch...@trickysolutions.com.au > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can't find and install reshape2??
Hi Guys, Thanks for your suggestions and sorry for the delay in replying, I've been having one of those weeks. I feel a little silly not trying the package name input as a character string, I should have know that. However I have tried your suggestions and neither worked. The code and error messages are at the bottom of this email and U can see the reason would appear the "reshape2" package is not available on the repositories I'm trying to access. I then tried closing R, reopening it and looking in the following CRAN mirrors: Australia UK(London) Canada(BC) USA(AZ) Reshape 2 was in none of them, my choices were: ResearchMethods Reshape ResistorArray But no reshape2 Any ideas as to why I can't see reshape2? Is it just me or are other people having this problem? thanks > download.packages('reshape2', destdir="c:\\") Warning in download.packages("reshape2", destdir = "c:\\") : no package 'reshape2' at the repositories [,1] [,2] > install.packages('reshape2') Warning message: In getDependencies(pkgs, dependencies, available, lib) : package reshape2 is not available -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey Spies Sent: Monday, 4 October 2010 10:57 AM To: Chris Howden Cc: r-help@r-project.org Subject: Re: [R] can't find and install reshape2?? The first argument in download.packages should be of type character or a vector of characters. This worked for me: install.packages('reshape2') as did: download.packages('reshape2', '~/Downloads/') Cheers, Jeff. On Sun, Oct 3, 2010 at 8:57 PM, Chris Howden wrote: > Hi everyone, > > > > Im trying to install reshape2. > > > > But when I click on install package its not coming up!?!?! Im getting > reshape, but no reshape2? > > > > Ive also tried download.packages(reshape2, destdir="c:\\") & > download.packages(Reshape2, destdir="c:\\") but no luck!!! > > > > Does anyone have any ideas what could be going on? 
> > > > Chris Howden > > Founding Partner > > Tricky Solutions > > Tricky Solutions 4 Tricky Problems > > Evidence Based Strategic Development, IP development, Data Analysis, > Modelling, and Training > > (mobile) 0410 689 945 > > (fax / office) (+618) 8952 7878 > > ch...@trickysolutions.com.au > > [[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Memory limit problem
Hi Daniel,

There are a number of ways to deal with data without forcing them into RAM.

If you're comfortable with SQL, the easiest way might be to use sqldf to join them using a SQL select query. Try googling "Handling large(r) datasets in R" by Soren Hojsgaard.

Or, if you definitely only want to do a cbind and not a merge, you might be able to use one of the following packages. These store the data on disk (rather than in RAM) and might allow you to cbind them:

filehash
ff

Chris Howden

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Nordlund
Sent: Tuesday, 12 October 2010 3:00 PM
To: r-help@r-project.org
Subject: Re: [R] Memory limit problem

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
> Sent: Monday, October 11, 2010 10:07 PM
> To: Tim Clark
> Cc: r help r-help
> Subject: Re: [R] Memory limit problem
>
> On Oct 11, 2010, at 11:49 PM, Tim Clark wrote:
>
> > Dear List,
> >
> > I am trying to plot bathymetry contours around the Hawaiian Islands using the package rgdal and PBSmapping. I have run into a memory limit when trying to combine two fairly small objects using cbind(). I have increased the memory to 4GB, but am being told I can't allocate a vector of size 240 Kb. I am running R 2.11.1 on a Dell Optiplex 760 with Windows XP. I have pasted the error message and summaries of the objects below. Thanks for your help. Tim
> >
> >> xyz<-cbind(hi.to.utm,z=b.depth$z)
> > Error: cannot allocate vector of size 240 Kb
>
> You have too much other "stuff". Try this:
>
> getsizes <- function() {z <- sapply(ls(envir=globalenv()),
>     function(x) object.size(get(x)))
>     (tmp <- as.matrix(rev(sort(z))[1:10]))}
> getsizes()
>
> You will see a list of the largest objects in descending order. Then use rm() to clear out unneeded items.
>
> --
> David,
>
> >> memory.limit()
> > [1] 4000
>
> Seems unlikely that you really have that much space in that 32 bit OS.

Yeah, without performing some special incantations, Windows XP will not allocate more than 2GB of memory to any one process (e.g. R). And even with those special incantations, you will get no more than about 3.2-3.5 GB.

The other thing to remember is that even if you had more than enough free space, R requires the free space for an object to be contiguous. So if memory was fragmented and you didn't have 240KB of contiguous memory, it still couldn't allocate the vector.

Hope this is helpful,

Dan

Daniel Nordlund Bothell, WA USA
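The getsizes() idea quoted above can be written a little more defensively. The following is a sketch rather than the poster's code: the function name largest_objects and its arguments n and envir are introduced here for illustration, and it copes with environments holding fewer than ten objects.

```r
# Report the n largest objects (in bytes) in an environment, biggest first.
largest_objects <- function(n = 10, envir = globalenv()) {
  sizes <- vapply(
    ls(envir = envir),
    function(nm) as.numeric(object.size(get(nm, envir = envir))),
    numeric(1)
  )
  head(sort(sizes, decreasing = TRUE), n)
}

# Small demonstration in a scratch environment
e <- new.env()
assign("big", numeric(1e5), envir = e)
assign("small", 1L, envir = e)
res <- largest_objects(envir = e)
print(res)
```

After inspecting the result, rm() followed by gc() releases objects that are no longer needed, as suggested in the quoted reply.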
[R] can't find and install reshape2??
Hi everyone,

I'm trying to install reshape2. But when I click on install package it's not coming up!?!?! I'm getting reshape, but no reshape2?

I've also tried download.packages(reshape2, destdir="c:\\") & download.packages(Reshape2, destdir="c:\\") but no luck!!!

Does anyone have any ideas what could be going on?

Chris Howden
[R] how to replace NA with a specific score that is dependant on another indicator variable
Hi everyone,

I'm looking for a clever bit of code to replace NAs with a specific score depending on an indicator variable. I can see how to do it using lots of if statements, but I'm sure there must be a neater, better way of doing it.

Any ideas at all will be much appreciated, I'm dreading coding up all those if statements!

My problem is as follows. I have a data set with lots of missing data:

EG Raw Data Set

Category  variable1  variable2  variable3
1         5          NA         NA
1         NA         3          4
2         NA         7          NA
etc

Now I want to replace the NAs with the average for each category, so if these averages were:

EG Averages

Category  variable1  variable2  variable3
1         4.5        3.2        2.5
2         3.5        7.4        5.9

then I'd like my data set to look like the following once I've replaced the NAs with the appropriate category average:

EG Imputed Data Set

Category  variable1  variable2  variable3
1         5          3.2        2.5
1         4.5        3          4
2         3.5        7          5.9
etc

Any ideas would be very much appreciated!

Thank you

Chris Howden
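One way to do this without any if statements is to look up each row's category in the table of averages with match() and fill only the missing cells. The sketch below rebuilds the three example rows and the averages table from the post; the object names dat, avgs and imputed, and the helper impute_group_mean, are made up for illustration.

```r
# Example rows and category averages as given in the post
dat <- data.frame(
  Category  = c(1, 1, 2),
  variable1 = c(5, NA, NA),
  variable2 = c(NA, 3, 7),
  variable3 = c(NA, 4, NA)
)
avgs <- data.frame(
  Category  = c(1, 2),
  variable1 = c(4.5, 3.5),
  variable2 = c(3.2, 7.4),
  variable3 = c(2.5, 5.9)
)

vars <- c("variable1", "variable2", "variable3")

# Row of avgs that corresponds to each row of dat
row_in_avgs <- match(dat$Category, avgs$Category)

imputed <- dat
for (v in vars) {
  missing <- is.na(imputed[[v]])
  imputed[[v]][missing] <- avgs[[v]][row_in_avgs[missing]]
}
print(imputed)

# Alternative when the averages should be computed from the data itself:
# ave() applies the function within each group and returns a full-length vector.
impute_group_mean <- function(x, g) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}
```

With the full data set, something like dat[vars] <- lapply(dat[vars], impute_group_mean, g = dat$Category) would apply the second approach to every variable at once. A category in which a variable is entirely missing would get NaN, so that case needs separate handling.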
Re: [R] data from SpatialGridDataFrame
I'm not that familiar with this type of data. I just had a similar issue, but had a GIS person do it in Arc view. But maybe try some of the following functions? Match %in% Plus I'll forward U the replies I got to my post Good luck :-) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of kfl...@falw.vu.nl Sent: Tuesday, 20 July 2010 9:42 PM To: r-help@r-project.org Subject: [R] data from SpatialGridDataFrame Dear All, I have a raster map of the class 'SpatialPointsDataFrame' and coordinates of the class 'SpatialPoints'. I would like to retrieve the values that are contained in the raster map at the specific locations given by the coordinates. Can anyone help me out? Kind regards, Katrin Fleischer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] do the standard R analysis functions handle spatial "grid" data?
Hi everyone,

I'm doing a resource function analysis with radio collared dingos and GIS info. The ecologist I'm working with wants to send me the data in a 'grid format', straight out of ArcView GIS.

I want to model the data using a GLM and maybe a logistic model as well, and I was planning on using the glm and logistic functions in R.

Now I'm pretty sure that these functions require the data to be in a 2-D spreadsheet format, and that I need to refer to the responses and predictors as columns of a data.frame (or 2-D matrix).

However, I'm being told they can handle the data in a 'grid' format. I'm pretty sure this would mean I would be passing the responses and predictors as 2-D matrices, and I don't think these functions can do that?

Can anyone enlighten me? Am I right in thinking these functions cannot handle data in a 3-D 'grid' format and require data to be entered as a 2-D data.frame or matrix? Are there other special functions out there that can handle this type of data, which I should be using instead?

Thanks for your help

Chris Howden Founding Partner Tricky Solutions Tricky Solutions 4 Tricky Problems Evidence Based Strategic Development, IP development, Data Analysis, Modelling, and Training (mobile) 0410 689 945 (fax / office) (+618) 8952 7878 ch...@trickysolutions.com.au
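For what it is worth, glm() does expect one row per observation, so gridded layers are usually flattened to one row per cell before modelling. Logistic regression in base R is fitted with glm(family = binomial) rather than a separate function. The sketch below uses simulated matrices as stand-ins for the GIS layers; the layer names elevation, veg and presence are invented for illustration, and the example ignores spatial autocorrelation between neighbouring cells.

```r
# Simulated stand-ins for three grid layers with identical dimensions
set.seed(1)
nr <- 20
nc <- 30
elevation <- matrix(rnorm(nr * nc), nr, nc)
veg       <- matrix(runif(nr * nc), nr, nc)
presence  <- matrix(rbinom(nr * nc, 1, plogis(-0.5 + 1.2 * elevation)), nr, nc)

# Flatten: one row per grid cell, one column per layer.
# as.vector() unrolls each matrix column by column, so the layers stay aligned.
cells <- data.frame(
  row       = as.vector(row(presence)),
  col       = as.vector(col(presence)),
  presence  = as.vector(presence),
  elevation = as.vector(elevation),
  veg       = as.vector(veg)
)

# Logistic regression on the flattened cells
fit <- glm(presence ~ elevation + veg, family = binomial, data = cells)
print(coef(fit))
```

Keeping the row and col indices in the data frame makes it possible to map fitted values back onto the grid afterwards, for example with matrix(fitted(fit), nr, nc).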
Re: [R] can't use function vcov with a GAMLSS object??
Hi everyone, I'm trying to use function vcov to extract the covariance matrix from a GAMLSS object. But I'm getting some strange errors and I was hoping someone could help me out? Vcov works with the same model for lm and glm objects, but not gamlss objects. I've searched various help sites to no avail. Its very possible the reason is that vcov failed though, since I got the following error message in the summary of the model "summary: vcov has failed, option qr is used instead" In which case I was wondering if anyone could help me out by explaining how I can find the covariance matrix equivalent without using vcov? The code and error messages I got for vcov are as follows. > vcov(paper_size_type_income) The following object(s) are masked _by_ .GlobalEnv : child id paper Error in gamlssNonLinear(family = RG, data = paper3, y = paper, mu.formula = paper ~ : NAs in y - use na.omit() I then try using na.omit and I get a different error message (even though there are no NA's in the data set, I checked using table(paper3$paper, exclude=NULL)) > temp <- gamlss(na.omit(paper)~size + type + income, family=RG, data=paper3) GAMLSS-RS iteration 1: Global Deviance = 1160.816 GAMLSS-RS iteration 2: Global Deviance = 1159.963 GAMLSS-RS iteration 3: Global Deviance = 1159.951 GAMLSS-RS iteration 4: Global Deviance = 1159.951 > vcov(temp) The following object(s) are masked _by_ .GlobalEnv : child id paper Error in inteprFormula.default(formula, .envir = envir, .start = start.v, : covariates in formulae with unknowns must not be factors check sizecovariates in formulae with unknowns must not be factors check typecovariates in formulae with unknowns must not be factors check income > thanks Chris Howden Marketing Scientist For all your Analysis, Modelling, Experimental Design and Training needs (mobile) 0410 689 945 (fax / office) (+618) 8952 7878 tall.chr...@yahoo.com.au [[alternative HTML version deleted]] __ R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem reading in a csv file
Morning all,

I'm trying to read in a csv file and R is having some problems. For some reason it's not 'seeing' all the columns for each row, and as such is not reading in the file.

I've opened the file in Excel and I can't see any problems with it. All rows have the correct number of columns.

The code I've used and the error messages are below. I've also run count.fields() and have included that output too.

Any help or suggestions would be much appreciated. Thanks

<-read.table("RWM Shopper Tracker - RAW DATA - 22JUN09 - Copy.csv", header=TRUE, sep =",", row.names=NULL)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 13 did not have 2264 elements

> count.fields("RWM Shopper Tracker - RAW DATA - 22JUN09 - Copy.csv", sep=",")
 [1] 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 384 2264 2264 384 2264 2264 2264 2264 2264 2264 2264 2264
[23] 2264 2264 2264 2264 152 2264 384 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264 2264

Chris Howden Marketing Scientist For all your Analysis, Modelling, Experimental Design and Training needs (mobile) 0410 689 945 (fax / office) (+618) 8952 7878 tall.chr...@yahoo.com.au
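One possible cause of rows that appear short in count.fields() but look complete in a spreadsheet is a stray quote or '#' character inside a field: read.table() and count.fields() treat both ' and " as quote characters and # as a comment marker by default. Whether that is what happened with this file is not known; the sketch below only demonstrates the effect on a small made-up file and shows how to switch the behaviour off.

```r
# A tiny made-up csv in which some fields contain apostrophes
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,score",
             "1,O'Brien,5",
             "2,Smith,7",
             "3,D'Arcy,9"), tmp)

# Default settings: the apostrophes are read as opening/closing quotes,
# so the field counts no longer match the header
n_default <- count.fields(tmp, sep = ",")

# Treat quotes and '#' as ordinary characters
n_plain <- count.fields(tmp, sep = ",", quote = "", comment.char = "")

# The same arguments work for read.table()
dat <- read.table(tmp, header = TRUE, sep = ",",
                  quote = "", comment.char = "",
                  stringsAsFactors = FALSE)

print(n_default)
print(n_plain)
print(dat)
```

If fields are genuinely wrapped in double quotes, quote = "\"" (the read.csv() default) is the safer setting; which(count.fields(...) != 2264) then points at the lines that still need a closer look.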
[R] Can anyone suggest some R packages for Experimental Designs, specifically for choice and conjoint? (or is interested in helping me make one)
Afternoon everyone,

I've spent the last week or so looking at all the experimental design packages I can find in R, with AlgDesign, conf.design and BHH2 being the best ones I could find.

Unfortunately none of these do a particularly good job for complex designs, in particular for conjoint or discrete choice (or perhaps they do, and I can't make them work correctly).

Specifically, the problem is that none of them optimise the design for main-effects or 2-way-effects balance. So although the 'd-efficiency' is optimised, some 2-way interactions are not present in the design (thereby preventing the interaction from being modelled). The main-effects balance is also quite bad, with some levels being seen twice as often as others.

So I was wondering if anyone out there has any experience in using R for complex design issues, and if so, whether they could point me in the direction of some good packages? Or maybe help me out with my 'balance' problem?

Thanks for your help

PS: If all else fails I'm thinking about trying to extend AlgDesign to incorporate balance as a criterion when searching for designs. So I'm just wondering if there's anyone out there keen to help me (even if it's just testing out my beta versions).

Chris Howden
Marketing Scientist
For all your Analysis, Modelling, Experimental Design and Training needs
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
tall.chr...@yahoo.com.au
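The two balance problems described above can be checked for any candidate design with base R alone. The following is a minimal sketch, assuming the design is held in a data frame of factors. The 8-run design and the names level_counts, balance_ratio and empty_cells are made up for illustration and do not come from any of the packages mentioned.

```r
# Hypothetical 8-run design with three factors (illustrative only).
design <- data.frame(
  price = factor(c("low", "low", "low", "mid", "mid", "high", "high", "high")),
  brand = factor(c("A", "B", "A", "B", "A", "B", "A", "B")),
  pack  = factor(c("small", "large", "large", "small",
                   "small", "large", "small", "large"))
)

# Main-effects balance: how often each level of each factor appears.
level_counts <- lapply(design, table)

# Ratio of most- to least-frequent level per factor (1 = perfectly balanced).
balance_ratio <- sapply(level_counts, function(tab) max(tab) / min(tab))

# Two-way coverage: cross-tabulate every pair of factors and count the
# level combinations that never occur in the design.
factor_pairs <- combn(names(design), 2, simplify = FALSE)
empty_cells <- sapply(factor_pairs, function(p) {
  sum(table(design[[p[1]]], design[[p[2]]]) == 0)
})
names(empty_cells) <- sapply(factor_pairs, paste, collapse = ":")

print(balance_ratio)
print(empty_cells)
```

A balance ratio well above 1 flags uneven main effects, and any non-zero entry in empty_cells marks a pair of factors with a level combination that is absent from the design, so that interaction cannot be fully estimated. A check like this could also serve as a screening step when comparing several d-efficient candidates.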