Re: [R] Forward stepwise regression using lmStepAIC in Caret
Not sure I really like stepwise variable selection, but the function should work, of course. Compare data(BostonHousing, package = mlbench) lmFit - train(medv ~ ., data = BostonHousing, lmStepAIC, scope = list(lower = ~., upper = ~.^2), direction = forward) with lmFit - train(medv ~ ., data = BostonHousing, lmStepAIC, scope = list(lower = ~., upper = ~.^2), direction = backward) and also the version without the direction argument where the code tries to do the right thing. In summary, my understanding is that your call first fits a y ~ . type model from which you cannot step backwards with constraints and then it tries to do 'what you might have meant'. Hope this helps a little. Allan On 05/03/12 01:14, Dan Putka wrote: I'm looking for guidance on how to implement forward stepwise regression using lmStepAIC in Caret. The stepwise direction appears to default to backward. When I try to use scope to provide a lower and upper model, Caret still seems to default to backward. Any thoughts on how I can make this work? Here is what I tried: itemonly- susbstitute(~i1+i2+i3+i4+i5+i6+i7+i8+i9+i10) #this is my full model #I want my lower model to consist of the intercept only stepLmFit.i- train(xtraindata.i, ytraindata,lmStepAIC, scope=list(upper=itemonly,lower=~1),direction=forward) Any guidance on how I can make this work would be greatly appreciated. Dan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to terminate R program if found any execution error
The original poster may also be interested in options(error) to capture the 'any execution error' requirement. From the examples in help(options): ## Not run: ## on error, terminate the R session with error status 66 options(error = quote(q(no, status=66, runLast=FALSE))) stop(test it) Allan On 27/09/2011 06:22, Duncan Murdoch wrote: On 11-09-27 12:20 AM, arunkumar wrote: Hi I want to terminate R process if there are any execution error. a=a b=10 c=try(a/b) if(class(c)[1]==try-error) { stop(Wrong Input Value) } d=c*c if c fails then it should terminate the process. Please can anyone help Keep the try(a/b), but replace the conditional with if (inherits(c, try-error)) q(no) See ?q if you want to set an error status to be returned, or do want to save the workspace, etc. (And do use inherits() rather than comparing to a particular entry: your code will probably work in this example, but it's not the right way to test the class of something, and some day try-error might not be the first entry.) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Does replacing some values of a zoo object by NA reduce it's size ?
It is not an anomaly. The object is the same size (object.size(f1)==object.size(f2)); its file representation is different. On 27/09/2011 07:00, Ashim Kapoor wrote: Dear R-helpers, Please have a look at the following. f1 is the same as f2 except that it has some values replaced by NA. But it's corresponding file is slightly bigger than the file containing f2. Could someone please tell me if this is an anomaly ? load(file1) ls() [1] f1 load(file2) ls() [1] f1 f2 dput(f1) structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, NA, NA, NA, 2, 2, 2, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(NULL, c(v1, l1)), index = 1:10, class = zoo) dput(f2) structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), .Dim = c(10L, 2L), .Dimnames = list(NULL, c(v1, l1)), index = 1:10, class = zoo) -rw-r--r-- 1 ashimkapoor ashimkapoor 192 2011-09-27 11:08 file1 -rw-r--r-- 1 ashimkapoor ashimkapoor 179 2011-09-27 11:08 file2 Best Regards, Ashim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Disabling Auto-complete
The $ operator does partial matching by default so for example print(b$u) [1] 1 The [[ operator does not print(b[[unit]]) NULL print(b[[unit1]]) [1] 1 See help($) for more details and also the warnPartialMatch* arguments in help(options). Try testfunction - function(x) if (exists(unit, x) x[[unit]]==1) cat(All is well\n) Hope this helps Allan On 26/09/11 10:23, Vikram Bahure wrote: Hi, I am a new user to R. I am having the following problem while using R: The defined function is having following a$unit as input but if I define a$unit1 then still I am getting the output which is not desired. __ *Function:* testfunction-function(a){ stopifnot(a$unit==1) cat(All is well) } b-list(unit1=1 ) testfunction(b) - *Output:* testfunction-function(a){ stopifnot(a$unit==1) cat(All is well) } b-list(unit1=1 ) testfunction(b) testfunction-function(a){ + stopifnot(a$unit==1) + cat(All is well) + } b-list(unit1=1 ) testfunction(b) All is well __ If it is a auto complete problem, it would be nice if you could let me know how to disable it, else you could throw some light on the what the problem is exactly. Regards Vikram [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ignoring loadNamespace errors when loading a file
On a Unix machine I ran caret::rfe using the multicore package, and I saved the resulting object using save(lm2, file = lm2.RData). [Reproducible example below.] When I try to load(lm2.RData) on my Windows laptop, I get Error in loadNamespace(name) : there is no package called 'multicore' I completely understand the error and I would like to ignore it and still load the saved object data. I imagine that I can make myself an empty multicore package for Windows and load the data file successfully, but is there another way? (I am not going to use any multicore functionality; I just want to inspect some of the data stored in the object and the reference in question is just an unfortunately stored link from the original call.) (I did search for this question in the archives, but could only find it discussed in connection with starting R with a .RData file where the consensus seems to be to start R in vanilla mode and install the missing package. This situation is different and installing a Unix-only package on Windows is obviously a non-starter, except as I proposed above.) Obligatory reproducible example: On the Unix machine do library(multicore) a - list(data = 1:10, fun = mclapply) save(a, file = a.RData) and then try to load the a.RData file on Windows. The question is if I can recover the data (1:10) on that platform. Allan sessionInfo() R version 2.13.1 (2011-07-08) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] compiler stats graphics grDevices utils datasets methods [8] base other attached packages: [1] caret_4.98 cluster_1.14.0 reshape_0.8.4 plyr_1.6 [5] lattice_0.19-31 boot_1.3-2 ctv_0.7-3 loaded via a namespace (and not attached): [1] grid_2.13.1 tools_2.13.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ignoring loadNamespace errors when loading a file
On 22/08/11 12:26, Martin Morgan wrote: On 08/21/2011 11:52 PM, Allan Engelhardt wrote: [...] Obligatory reproducible example: On the Unix machine do library(multicore) a - list(data = 1:10, fun = mclapply) save(a, file = a.RData) and then try to load the a.RData file on Windows. The question is if I can recover the data (1:10) on that platform. Is this a more realistic reproducible example (from ?rfe, modified to use computeFunction=mclapply)? data(BloodBrain) x - scale(bbbDescr[,-nearZeroVar(bbbDescr)]) x - x[, -findCorrelation(cor(x), .8)] x - as.data.frame(x) set.seed(1) lmProfile - rfe(x, logBBB, sizes = c(2:25, 30, 35, 40, 45, 50, 55, 60, 65), rfeControl = rfeControl(functions = lmFuncs, number = 5, computeFunction=mclapply)) Maybe provide a computeFunction that only indirectly references mclapply computeFunction=function(...) { if (require(multicore)) mclapply(...) else lapply(...) } [...] Yes, that workaround works for my usage. Thanks! Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reflecting a PCA biplot
Something like opar- par(mfcol = c(1, 2)) z- prcomp(USArrests, scale = TRUE) biplot(z, cex = 0.5) z$x[,1]- -z$x[,1] z$rotation[,1]- -z$rotation[,1] biplot(z, cex = 0.5, xlab = -PC1) par(opar) perhaps? Allan On 09/08/11 13:57, Andrew Halford wrote: Hi Listers, I am trying to reflect a PCA biplot in the x-axis (i.e. PC1) but am not having much success. In theory I believe all I need to do is multiply the site and species scores for the PC1 by -1, which would effectively flip the biplot. I am creating a blank plot using the plot command and accessing the results from a call to rda. I then use the calls to scores to obtain separate site and species coordinates and I have worked out how to multiply the appropriate PC1 scores by -1 to create the site and species scores I want. However I am not sure how to change the call to plot which accesses the results of the call to rda to draw the blank plot. The coordinates it is accessing are for the unreflected ordination and this does not match the new site and species scores that I have. fish.pca- rda(fish.hel) fish.site- scores(fish.pca,display=sites,scaling=3) fish.spp- scores(fish.pca,display=species,scaling=3) fish.site[,PC1]- -1*(fish.site[,PC1]) fish.spp[,PC1]- -1*(fish.spp[,PC1]) graph- plot(fish.pca,display=c(sites,species),type=n,scaling=3) # how do I get the plot to draw up the blank display based on the reversed site and species scores? Any help appreciated. cheers Andy __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cdplot error
On 03/08/11 05:52, wildernessness wrote: Fairly new at this. Trying to create a conditional density plot. cdplot(status~harvd.l,data=phy) Error in cdplot.formula(status~harvd.l,data=phy): dependent variable should be a factor What does this error mean? Status is a binary response of infestation (0/1) Probably status is a numerical variable rather than a factor**. Try print(is.factor(phy$status)) and if that is FALSE then phy$status - factor(phy$status, labels=c(N, Y)) cdplot(status~harvd.l,data=phy) Hope this helps a little. Allan and harvd.l is the log of timber harvest density per catchment. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/cdplot-error-tp3714454p3714454.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Deleting the last value of a vector
This is unlikely to be the kind of operation where speed is essential, but nevertheless on my build of 2.14.0 (with byte compiled base packages): stopifnot(getRversion()= 2.14) library(compiler) f1- function (x, n) head(x, length(x) - n)# suggested by baptiste auguie f2- function (x, n) x[ seq(1, length(x) - n) ]# suggested by baptiste auguie f3- function (x, n) x[ - seq(length(x), by=-1, length=n) ] # suggested by baptiste auguie f4- function (x, n) head(x, -n)# Suggested by Petr PIKAL f5- function (x, n) x[ seq_len(length(x) - n) ]# My variation on f2 c1- cmpfun(f1,options = list(optimize = 3))# Undocumented (?) default is 2 c2- cmpfun(f2,options = list(optimize = 3)) c3- cmpfun(f3,options = list(optimize = 3)) c4- cmpfun(f4,options = list(optimize = 3)) c5- cmpfun(f5,options = list(optimize = 3)) library(rbenchmark) set.seed(42) x- runif(1e6) n- 1 benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n), c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n), columns = c(test, elapsed, relative), order = relative, replications = 1e3) #test elapsed relative # 7 c2(x, n) 17.483 1.00 # 10 c5(x, n) 17.600 1.006692 # 1 f1(x, n) 22.782 1.303094 # 2 f2(x, n) 23.054 1.318652 # 4 f4(x, n) 23.498 1.344049 # 9 c4(x, n) 23.547 1.346851 # 6 c1(x, n) 23.752 1.358577 # 5 f5(x, n) 23.892 1.366585 # 3 f3(x, n) 27.209 1.556312 # 8 c3(x, n) 27.296 1.561288 ## So c{2,5} are fastest, .3 slowest, and the rest much the same n- 1e5 benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n), c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n), columns = c(test, elapsed, relative), order = relative, replications = 1e3) #test elapsed relative # 10 c5(x, n) 14.729 1.00 # 7 c2(x, n) 14.939 1.014258 # 5 f5(x, n) 18.544 1.259013 # 2 f2(x, n) 18.557 1.259895 # 6 c1(x, n) 18.695 1.269265 # 4 f4(x, n) 18.718 1.270826 # 9 c4(x, n) 18.729 1.271573 # 1 f1(x, n) 18.729 1.271573 # 3 f3(x, n) 20.836 1.414624 # 8 c3(x, n) 20.958 1.422907 ## As before n- 1 enableJIT(2)# Will also optimise rbenchmark::benchmark. benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n), c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n), columns = c(test, elapsed, relative), order = relative, replications = 1e3) #test elapsed relative # 7 c2(x, n) 15.035 1.00 # 10 c5(x, n) 15.076 1.002727 # 2 f2(x, n) 15.258 1.014832 # 9 c4(x, n) 15.313 1.018490 # 5 f5(x, n) 15.327 1.019421 # 6 c1(x, n) 15.402 1.024410 # 4 f4(x, n) 15.430 1.026272 # 1 f1(x, n) 15.594 1.037180 # 3 f3(x, n) 18.238 1.213036 # 8 c3(x, n) 18.245 1.213502 ## No difference between the c. and f. functions here So as long as you don't use the .3 functions you should be OK. Allan http://www.cybaea.net/Blogs/Data/ On 18/04/11 07:07, Petr PIKAL wrote: Hi r-help-boun...@r-project.org napsal dne 18.04.2011 04:51:20: Or perhaps even more parsimoniously (by a couple of characters) - r- c(1, 2, 3, 4, 5) r2-r[-length(r)] Maybe even shorter head(x,-1) Regards Petr Min-Han On Sun, Apr 17, 2011 at 10:23 PM, Daisy Englert Duursma daisy.duur...@gmail.com wrote: A easier solution: r- c(1, 2, 3, 4, 5) r2-r[1:length(r)-1] On Mon, Apr 18, 2011 at 10:51 AM, empyreanctr...@ucdavis.edu wrote: Hey guys, I've search a few threads about deleting a value from a vector, but no one has addressed this question so far. I want to delete the last value from a string of values I have: r = [ 1, 2, 3, 4, 5 ], and i want to make r2 = to [ 1, 2, 3, 4] So that r2 is just like r, except that it missing the final value. Thanks, -- View this message in context: http://r.789695.n4.nabble.com/Deleting-the-last-value-of-a-vector- tp3456363p3456363.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Daisy Englert Duursma Department of Biological Sciences Room E8C156 Macquarie University, North Ryde, NSW 210 Australia Tel +61 2 9850 9256 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing
Re: [R] From nested loop to mclapply
Try help(expand.grid, package=base) for one way to create the combinations of (i,j) outside the loop, or perhaps vignette(nested, package=foreach) which does it automatically (rather: naturally). Allan On 18/04/11 16:53, Alaios wrote: Dear all, I am trying to find a decent way to speed up my code. So far I have used mclapply with really good results (parallel version of lapply). I have a nested loop that I would like to help me convert it to lapply for (i in seq(from=-1,to=1-2/ncol(sr),length=ncol(sr))){ for (j in seq(from=-1,to=1-2/nrow(sr),length=nrow(sr))){ estimatedsr[findCoord(c(i,j),sr)[1],findCoord(c(i,j),sr)[2] ]-fxy(c(i,j)) } So far I have converted some one-depth for loops to lapply but I am not sure If I can use lapply to convert a nested loop to something simpler. Best Regards Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to remove the double or single quote from a string (unquote?)?
This should be a FAQ: you are confusing the value with its printed representation. Try print(paste(V, 3:8, sep=), quote=FALSE) to see that there are no quotes, and you may want to read up on help(as.formula, package=stats) which has the examples you are searching for. Allan On 18/04/11 17:38, JingJiang Yan wrote: Is there a function to get a string without a pair of quotes around it? I have several expressions like: glm(V12 ~ V3, family=binomial, data=df1) glm(V12 ~ V4, family=binomial, data=df1) ... glm(V12 ~ V8, family=binomial, data=df1) As you can see, the only differences among them are V3 ... V8. Because sometimes several of these expressions are performed many times, I want to use a variable i to change the V3 ... V8. I did this with: i - 3:8 glm(V12 ~ paste(V, i, sep=), family=binomial, data=df1) However, it seems the paste always returns a variable name with a pair of quotes, which were wrong in such condition. I only find a function sQuote to add quotes to a string, and it looks I am looking for an opposite function of it. Any advice will be appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R licence
The licences are available on the web site and you really should have yor lawyers look at them and give you professional advise. The GPL2+ is probably the relevant one for your purposes and essentially require you to provide the source for the parts of R that you distribute. However, the R licencing is a mess with core packages not on a GPL licence. We looked into it and it is a nightmare and I don't even think the UK CRAN mirrors are strictly speaking legal. So look carefully at what you use and get advice from somewhere other than a mailing list. happy to discuss the UK side of things offline if you want, but I never looked at other jurisdictions. Allan - Original message - Hi, is it possible to use some statistic computing by R in proprietary software? Our software is written in c#, and we intend to use http://rdotnet.codeplex.com/ to get R work there. Especially we want to use loess function. Thanks, Best regards, Stanislav    [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Popularity of R, SAS, SPSS, Stata, Statistica, S-PLUS updated
Not R, but just to get the data (format is month year,week,count) to compare with your students' output: perl -MLWP::UserAgent -e 'my $ua = LWP::UserAgent-new(); my $l = $ua-request(HTTP::Request-new(GET = qq{http://www.listserv.uga.edu/archives/sas-l.html}))-content(); while ( $l =~ m{href=\(/cgi-bin/wa\?A1=.*?)\(.*?), \s week \s (\d)/a}igxms ) { my ($m,$w) = ($2,$3); my $u = qq{http://www.listserv.uga.edu$1}; my $i = $ua-request(HTTP::Request-new(GET = qq{$u}))-content(); my $c = () = $i =~ m{href=/cgi-bin/wa\?A2}igxms; print qq{$m,$w,$c\n}; sleep(1); }' March 2011,4,122 March 2011,3,255 March 2011,2,312 March 2011,1,318 February 2011,4,243 February 2011,3,230 February 2011,2,289 February 2011,1,354 January 2011,5,93 January 2011,4,345 January 2011,3,329 January 2011,2,385 January 2011,1,297 December 2010,5,110 December 2010,4,205 December 2010,3,290 December 2010,2,359 December 2010,1,311 November 2010,5,91 November 2010,4,246 November 2010,3,227 November 2010,2,262 November 2010,1,228 October 2010,5,51 October 2010,4,242 October 2010,3,212 October 2010,2,250 October 2010,1,384 September 2010,5,101 September 2010,4,345 September 2010,3,214 September 2010,2,287 September 2010,1,133 August 2010,5,90 August 2010,4,314 August 2010,3,242 August 2010,2,202 August 2010,1,234 July 2010,5,95 July 2010,4,231 July 2010,3,354 July 2010,2,306 July 2010,1,176 June 2010,5,98 June 2010,4,244 June 2010,3,147 June 2010,2,229 June 2010,1,216 May 2010,5,25 May 2010,4,237 May 2010,3,328 May 2010,2,261 May 2010,1,341 April 2010,5,167 April 2010,4,291 April 2010,3,324 April 2010,2,273 April 2010,1,288 March 2010,5,217 March 2010,4,351 March 2010,3,437 March 2010,2,524 March 2010,1,456 February 2010,4,473 February 2010,3,348 February 2010,2,379 February 2010,1,347 January 2010,5,108 January 2010,4,482 January 2010,3,387 January 2010,2,424 January 2010,1,398 December 2009,5,88 December 2009,4,252 December 2009,3,443 December 2009,2,373 December 2009,1,514 November 2009,5,158 November 2009,4,318 November 2009,3,461 November 2009,2,383 November 2009,1,494 October 2009,5,186 October 2009,4,515 October 2009,3,532 October 2009,2,547 October 2009,1,410 September 2009,5,209 September 2009,4,457 September 2009,3,435 September 2009,2,371 September 2009,1,355 August 2009,5,87 August 2009,4,528 August 2009,3,436 August 2009,2,526 August 2009,1,412 July 2009,5,270 July 2009,4,423 July 2009,3,449 July 2009,2,380 July 2009,1,346 June 2009,5,156 June 2009,4,390 June 2009,3,510 June 2009,2,473 June 2009,1,450 May 2009,5,107 May 2009,4,476 May 2009,3,487 May 2009,2,583 May 2009,1,494 April 2009,5,219 April 2009,4,592 April 2009,3,574 April 2009,2,516 April 2009,1,547 March 2009,5,284 March 2009,4,571 March 2009,3,553 March 2009,2,584 March 2009,1,691 February 2009,4,646 February 2009,3,508 February 2009,2,688 February 2009,1,489 [...] Hope this helps a little. Allan (Who thinks it is very sad that he can remember that $c=()=$a=~$b construct...) On 22/03/11 23:26, Bob Muenchen wrote: On 3/22/2011 5:15 PM, Hadley Wickham wrote: I don't doubt that R may be the most popular in terms of discussion group traffic, but you should be aware that the traffic for SAS comprises two separate lists that used to be mirrored, but are no longer linked Usenet -- news://comp.soft-sys.sas (what you counted) listserve -- SAS-L http://www.listserv.uga.edu/archives/sas-l.html R programming challenge: create a script that parses those html pages to compute the total number of messages per week! (Maybe I'll use this in class) Hadley That would be nice! I'd love to have all the sources, which includes various company forums. Sounds like students could be kept busy for quite a few projects. If any pull it off, please send it my way! Cheers, Bob __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help with error
As it says at the bottom of every post: PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Without an example that fails, it is hard to help. Allan On 18/03/11 16:26, Savitri N Appana wrote: Hi R users, I am getting the following error when using the splsda function in R v2.12.1: Error in switch(classifier, logistic = { : EXPR must be a length 1 vector What does this mean and how do I fix this? Thank you in advance! Best, Savi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do I delete multiple blank variables from a data frame?
On 19/03/11 01:35, Joshua Wiley wrote: Hi Rita, This is far from the most efficient or elegant way, but: ## two column data frame, one all NAs d- data.frame(1:10, NA) ## use apply to create logical vector and subset d d[, apply(d, 2, function(x) !all(is.na(x)))] This works, but apply converts d to a matrix which is not needed, so try d[, sapply(d, function(x) !all(is.na(x)))] if performance is an issue (apply is about 3x slower on your test data frame d, more for larger data frames). For the related problem of removing columns of constant-or-na values, the best I could come up with is zv.1 - function(x) { ## The literal approach y - var(x, na.rm = TRUE) return(is.na(y) || y == 0) } sapply(train, zv.1) See http://www.cybaea.net/Blogs/Data/R-Eliminating-observed-values-with-zero-variance.html for the benchmarks. Allan I am just apply()ing to each column (the 2) of d, the function !all(is.na(x)) which will return FALSE if all of x is missing and TRUE otherwise. The result is a logical vector the same length as the number of columns in d that is used to subset only the d columns with at least some non-missing values. For documentation see: ?apply ?is.na ?all ?[ ?Logic HTH, Josh On Fri, Mar 18, 2011 at 3:35 PM, Rita Carreiraritacarre...@hotmail.com wrote: Dear List Members,I have 55 data frames, each of which with 272 variables and 267 observations. Some of these variables are blanks but the blanks are not the same for every data frame. I would like to write a procedure in which I import a data frame, see which variables are blank, and delete those variables. My data frames have variables named P1 to P136 and Q1 to Q136. I have a couple of questions regarding this issue: 1) Is a loop an efficient way to address this problem? If not, what are my alternatives and how do I implement them?2) I have been playing with a single data frame to try to figure out a way of having R go through the columns and see which ones it should delete. I have figured out how to delete rows with missing data (newdata- na.omit(olddata)) but how do I do it for columns??? Thank you very much for your help and have a great weekend! Rita If you think education is expensive, try ignorance--Derek Bok [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cross-validation in rpart
I assume you mean rpart::xpred.rpart ? The beauty of R means that you can look at the source. For the simple case (where xval is a single number) the code does indeed do simple random sampling xgroups- sample(rep(1:xval, length = nobs), nobs, replace = FALSE) If you want another sampling, then you simply pass a vector as the xval parameter, as the documentation says: “This may also be an explicit list of integers that define the cross-validation groups”. Hope this helps a little. Allan On 19/03/11 09:21, Penny B wrote: I am trying to find out what type of sampling scheme is used to select the 10 subsets in 10-fold cross-validation process used in rpart to choose the best tree. Is it simple random sampling? Is there any documentation available on this? Thanks, Penny. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting graph problem!!
You should probably tell us which part of a- read.csv(http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv;) hist(a) doesn't do what you expect. (Though often when people say histogram they want something else - what's with that anyhow?) Allan On 19/03/11 13:10, andrew456 wrote: http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv Life_Expectancies_2008.csv I am trying to plot a histogram base on the file i uploaded above. I am facing a trouble in sorting out the frequency of the life expectancies. I wanted to plot a graph of life expectancies at birth versus frequency,but i have no idea how to change the locations into frequencies taking the range of life expectancies at birth from 40 to 90. I tried using table(Life.Expectancies.at.Birth) to show the frequencies of each number and I do not know how to continue. -- View this message in context: http://r.789695.n4.nabble.com/Plotting-graph-problem-tp3389613p3389613.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting graph problem!!
On 19/03/11 15:04, Allan Engelhardt wrote: You should probably tell us which part of a- read.csv(http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv;) hist(a) Should be hist(a$Life.Expectancies.at.Birth), of course. Sorry. doesn't do what you expect. (Though often when people say histogram they want something else - what's with that anyhow?) Allan On 19/03/11 13:10, andrew456 wrote: http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv Life_Expectancies_2008.csv I am trying to plot a histogram base on the file i uploaded above. I am facing a trouble in sorting out the frequency of the life expectancies. I wanted to plot a graph of life expectancies at birth versus frequency,but i have no idea how to change the locations into frequencies taking the range of life expectancies at birth from 40 to 90. I tried using table(Life.Expectancies.at.Birth) to show the frequencies of each number and I do not know how to continue. -- View this message in context: http://r.789695.n4.nabble.com/Plotting-graph-problem-tp3389613p3389613.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regex query (Apache logs)
Some comments: 1. [^\s] matches everything up to a literal 's', unless perl=TRUE. 2. The (.*) is greedy, so you'll need (.*?)\s(.*?)\s(.*?)$ or similar at the end of the expression With those changes (and removing a space inserted by the newsgroup posting) the expression works for me. (pat - readLines(/tmp/b.txt)[1]) [1] ^(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\s([^\\s]*)\\s([^\\s]*)\\s\\[([^\\]]+)\\]\\s\([A-Z]*)\\s([^\\s]*)\\s([^\\s]*)\\\s([^\\s]+)\\s(\\d+)\\s\(.*?)\\\s\(.*?)\\\s\(.*?)\$ regexpr(pat, test, perl=TRUE) [1] 1 attr(,match.length) [1] 436 3. Consider a different approach, e.g. scan(textConnection(test), what=character(0)) Hope this helps Allan On 16/03/11 22:18, Saptarshi Guha wrote: Hello R users, I have this regex see [1] for apache log lines. I tried using R to parse some data (only because I wanted to stay in R). A sample line is [2] (a) I saved the line in [1] into ~/tmp/a.txt and [2] into /tmp/a.txt pat- readLines(~/tmp/a.txt) test- readLines(/tmp/a.txt) test grep(pat,test) returns integer(0) The same query works in python via re.match() (i.e does return groups) Using readLines, the regex is escaped for me. Does Python and R use different regex styles? Cheers Saptarshi [1] ^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s([^\s]*)\s([^\s]*)\s\[([^\]]+)\]\s([A-Z]*)\s([^\s]*)\s([^\s]*)\s([^\s]+)\s(\d+)\s(.*)\s(.*)\s(.*)$ [2] 220.213.119.925 addons.mozilla.org - [10/Jan/2001:01:55:07 -0800] GET /blocklist/3/%8ce33983c0-fd0e-11dc-12aa-0800200c9a66%7D/4.0b5/Fennec/20110217140304/Android_arm-eabi-gcc3/chrome:%2F%2Fglobal%2Flocale%2Fintl.properties/beta/Linux% 202.6.32.9/default/default/6/6/1/ HTTP/1.1 200 3243 - Mozilla/5.0 (Android; Linux armv7l; rv:2.0b12pre) Gecko/20110217 Firefox/4.0b12pre Fennec/4.0b5 BLOCKLIST_v3=110.163.217.169.1299218425.9706 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Does R have a const object?
On 16/03/11 13:04, Michael Friendly wrote: On 3/15/2011 2:23 PM, Uwe Ligges wrote: On 15.03.2011 15:53, xiagao1982 wrote: Hi, all, Does R have a const object concept like which is in C++ language? I want to set some data frames as constant to avoid being modified unintentionally. Thanks! Although there is almost never a No in R, the best short answer is: No. This is just the flexibility of R. I've just discovered a new class of geometries based on pi - 2.3 ?Constants Yes, but you can still print(base::pi) and rm(pi) to get back to our flat world, and you can't assign(pi, 4, pos = package:base) Error in assign(pi, 4, pos = package:base) : cannot change value of locked binding for 'pi' Just a feature of the search path. Morale: if you want base::pi, write base::pi. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Does R have a const object?
On 16/03/11 15:04, Gabor Grothendieck wrote: On Wed, Mar 16, 2011 at 10:54 AM, Allan Engelhardtall...@cybaea.com wrote: [...] Yes, but you can still print(base::pi) and rm(pi) to get back to our flat world, and you can't assign(pi, 4, pos = package:base) Error in assign(pi, 4, pos = package:base) : cannot change value of locked binding for 'pi' Just a feature of the search path. Morale: if you want base::pi, write base::pi. Try this: old.pi- pi assignInNamespace(pi, 2.3, ns = base) pi [1] 2.3 base::pi [1] 2.3 Ah! assignInNamespace does an unlockBinding under the covers which obviously is Evil(tm). The answer clearly is to do assignInNamespace(unlockBinding, function (...) {warning(be safe);}, ns=base) in your .Rprofile! Let's Keep R Safe, Folks! Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Singularity problem
Numeric underflow. Try qr.solve(a) Allan On 16/03/11 15:28, Feng Li wrote: Dear R, If I have remembered correctly, a square matrix is singular if and only if its determinant is zero. I am a bit confused by the following code error. Can someone give me a hint? a- matrix(c(1e20,1e2,1e3,1e3),2) det(a) [1] 1e+23 solve(a) Error in solve.default(a) : system is computationally singular: reciprocal condition number = 1e-17 Thanks in advance! Feng __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] table() reading problem
On 16/03/11 09:20, fre wrote: I have the following problem: I have some string with numbers like k. I want to have a table like the function table() gives. However I am not able to call the first row, the 1, 2, 3, 5, 6 or 9. I tried to do that by a data.frame, but that doesn't seem The first row, as you call it, is accessible as the vector dimnames(x)$k (or as.numeric(dimnames(x)$k) if you need numbers, not strings). to work either. The levels keep bothering me. This is an example of the code: k-c(1,1,1,1,1,3,5,6,2,1,3,9,2,3,1,1,1) table(k) k 1 2 3 5 6 9 9 2 3 1 1 1 Maybe something like matrix(c(as.integer(dimnames(x)$k), x), nrow=2, byrow=TRUE) is what you are looking for? Hope this helps Allan x-table(k) dim(x) [1] 6 x[1] #But i only want the one 1 9 x-data.frame(x) x[1,1] #You are not allowed to use this one for example 3*x[1,1] is impossible [1] 1 Levels: 1 2 3 5 6 9 I hope anyone has an idea of using the table function without this inconvenience. I thought about writing a counter myself, but that seems complicated. Because I have to examine very large examples later on, I don't want to slow the calculations down if possible. Thanks for your help in advance. Frederique -- View this message in context: http://r.789695.n4.nabble.com/table-reading-problem-tp3381174p3381174.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do we read netcdf / hdf files in R?
Try the ncdf, ncdf4, and hdf5 packages from CRAN. (I have never used the hdf format, but the ncdf4? packages seem to work well.) Allan On 16/03/11 07:25, Yogesh Tiwari wrote: Dear R Users, How do we read netcdf / hdf format files in R ? Also, can we convert netcdf to hdf format in R? Great Thanks, Best Regards, Yogesh [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do we read netcdf / hdf files in R?
(For hdf5 you might want to try the development version at http://xweb.geos.ed.ac.uk/~hcp/Rhdf5 which improves error handling, cf. this thread: https://stat.ethz.ch/pipermail/r-devel/2011-March/060111.html .) Allan On 16/03/11 08:53, Allan Engelhardt wrote: Try the ncdf, ncdf4, and hdf5 packages from CRAN. (I have never used the hdf format, but the ncdf4? packages seem to work well.) Allan On 16/03/11 07:25, Yogesh Tiwari wrote: Dear R Users, How do we read netcdf / hdf format files in R ? Also, can we convert netcdf to hdf format in R? Great Thanks, Best Regards, Yogesh [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] install.packages barfs on dependencies= argument
I'm sure I used to be able to do my.dependencies- c(Depends, Imports, LinkingTo, Suggests, Enhances) install.packages(animation, lib = Sys.getenv(R_LIBS_USER), dependecies = my.dependencies) but now I get Error in download.file(url, destfile, method, mode = wb, ...) : unused argument(s) (dependecies = c(Depends, Imports, LinkingTo, Suggests, Enhances)) Warning in download.packages(pkgs, destdir = tmpd, available = available, : download of package 'animation' failed Did anything change? Allan sessionInfo() R version 2.12.2 (2011-02-25) Platform: x86_64-redhat-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C [3] LC_TIME=en_GB.utf8LC_COLLATE=en_GB.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8 [7] LC_PAPER=en_GB.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ctv_0.7-0 loaded via a namespace (and not attached): [1] tools_2.12.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] install.packages barfs on dependencies= argument
Groan! Thanks. New baby mean no sleep. I'll make some more coffee. Allan On 15/03/11 08:18, Jim Lemon wrote: On 03/15/2011 06:57 PM, Allan Engelhardt wrote: I'm sure I used to be able to do ... unused argument(s) (dependecies = c(Depends, Imports, LinkingTo, check the spelling. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Serial Date
On 14/03/11 16:00, David Winsemius wrote: On Mar 14, 2011, at 6:36 AM, Allan Engelhardt wrote: On 14/03/11 02:00, Raoni Rosa Rodrigues wrote: [...] I'm working in a project with a software that register date and time data in serial time format. This format is used by excel, for exemple. In this format, 40597.3911423958 is 2011/2/23 09:23:15. Not on my system, Because ... There is a reasonably well understood MS bug in how it handles the non-existent 1900-02-29. (It _still_ accepts that as a valid date and calls it 60. I just re-demonstrated this amazingly persistent bug behavior on the Mac Excel 2011 variant. ) Amazing. Just say 'no' to Excel :-) http://finzi.psych.upenn.edu/R/library/cxxPack/html/serialNumber.html Unfortunately, cxxPack has not compiled for a while because of a RcppDate issue: http://cran.r-project.org/web/checks/check_results_cxxPack.html . But the description is still funny. Thanks. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding the name vector
On 15/03/11 05:17, Joshua Wiley wrote: Hi Laura, ?is.vector Consider methods::is if you might ever use attributes: x- 1:10 is.vector(x) # [1] TRUE comment(x)- Mine! is.vector(x) # [1] FALSE is(x, vector) # [1] TRUE Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] switch and factors
Maybe I am misunderstanding you, but I think it is documented? The help page says switch(EXPR, ...) [...] If the value of ‘EXPR’ is not a character string it is coerced to integer. [...] Since is.character( factor('x', levels=c('y', 'x')) ) [1] FALSE you get as.integer( factor('x', levels=c('y', 'x')) ) [1] 2 from the coerced to integer part? Therefore the second statement (y) is evaluated regardless of the label (y=). Allan On 09/03/11 02:02, baptiste auguie wrote: Dear list, Reading the help page for ?switch didn't give me more than a hint at what's going on here, x = 5 y = 2 foo- function(a=x){ switch(a, x = x, y = y) } foo(factor('x', levels=c('y', 'x'))) # 2 It seems that switch, when given a factor, uses the numeric codes rather than the string levels as I would have naively expected. Is this deliberate, should it be mentioned on the help page? I had an input that had been invisibly converted to a factor by data.frame(); using switch resulted in serious confusion. Thanks, baptiste __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Serial Date
On 14/03/11 02:00, Raoni Rosa Rodrigues wrote: Hello R Help, I'm working in a project with a software that register date and time data in serial time format. This format is used by excel, for exemple. In this format, 40597.3911423958 is 2011/2/23 09:23:15. Not on my system, but this should get you going: ( a- as.POSIXct(2011-02-23 09:23:15.1, tz=GMT) ) # [1] 2011-02-23 09:23:15 GMT ( b- difftime(a, ISOdatetime(1900,1,1,0,0,0), tz=GMT, units=days) ) # Time difference of 40595.39 days as.double(b) # [1] 40595.39 options(digits.secs = 6) ISOdatetime(1900, 1, 1, 0, 0, 0) + 40597.3911423958*60*60*24 # TZ dependent [1] 2011-02-25 09:23:14.702997 GMT Hope this helps a little. Allan First part is number os days since 1900/1/1, and second part is a fraction of a day. I need to make this transformation in R, and use it to make some algebrian operations. I found that function 'serialNumber' of package 'cxxPack' transform a as.data class object in this format, but without time fraction. But I can't find nothing to do the inverse way, transform serial date format in a POSIX class, for exemple. Since now, thanks for your time and atention. All best, Raoni Associate Researcher Fish Passage Center UFMG, Brazil. Fica proibido o uso da palavra liberdade, a qual será suprimida dos dicionários e do pântano enganoso das bocas. A partir deste instante a liberdade será algo vivo e transparente como um fogo ou um rio, e a sua morada será sempre o coração do homem. (Thiago de Mello) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating character vector
You could try scan(what=character(0), sep=,, file=textConnection(first,second,third)) but better to put the strings in a file (say, strings.txt), one per line, and read it using scan(strings.txt, what=character(0), sep=\n) Even better is to understand what you are really trying to do and we can maybe help with that. Hope this helps a little Allan --- http://www.cybaea.net/Blogs/Data/ On 14/03/11 09:29, Maas James Dr (MED) wrote: Is there a way to convince R to create a character vector without using the quotes? This works ex1- c(first,second) but when I try this it doesn't ext- as.character(c(first,second)) it complains. I have many variables to put into character vectors so dispensing with the quotes would be useful. Thanks Jim === Dr. Jim Maas University of East Anglia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] positions and margins differ between X11 and SVG device
Looks the same on my machine, albeit using svg() instead of CairoSVG - what version are you using? One sometimes useful technique is to create the first graphics in a vector format (pdf, svg) and then use an editing program like Inkscape or Illustrator to do any touch-up before publication, as well as resizing and format conversion. Good luck! Allan On 08/03/11 17:02, Frank Schwach wrote: Hi, I'm trying to get a plot ready for publication, which involves getting it to look nice at a rather small size and to fine-tune positioning all the labels and sizes of the margins. I realise that I may not be doing this the right way and I welcome any comments about better approaches to do this. What I have done so far is open an X11 device with the size I want for the final output and I have now managed to arrange all the elements as I wanted in the R script that draws the plot. Saving the plot to a bitmap format preserves all the positions and sizes but the resulting quality is too low. Saving it to svg (also tried CairoSVG) of course gives me excellent quality, but the positions and margins are all completely different from the X11 window. Now I'm wondering how others deal with this (surely common) situation? Figuring out the positions while working blindly (saving to svg file all the time and opening it again to see the result) surely can't be the way to adjust the positions. But what else can I do if X11 output and svg differ so much? In fact, here is a toy example to demonstrate the differences: x1=seq(0,2,by=0.01) y1=2*sin(2*pi*(x1-1/4)) x11(width=1.67, height=2.28) plot(x1,y1) # looks completely different in svg (at least on my machine): CairoSVG(file=test.svg, width=1.67, height=2.28) plot(x1,y1) dev.off() Any help or general advice on formatting plots for publication would be very welcome! Cheers, Frank __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Probabilities outside [0, 1] using Support Vector Machines (SVM) in e1071
predict.svm only returns probabilities for model types (Model$type) less than 2, which I guess are classification models (?). In any case, the probabilities are returned as an attribute which your result clearly lacks. Trivial example: model- svm(Species ~ ., data = iris[-1,], probability=TRUE) ( p- predict(model, newdata=iris[1,], probability=TRUE) ) 1 setosa attr(,probabilities) setosa versicolor virginica 1 0.9796166 0.01147175 0.008911663 Levels: setosa versicolor virginica Hope this helps a little. Allan On 04/03/11 18:54, Adam B. Smith wrote: Hi All, I'm attempting to use eps-regression or nu-regression SVM to compute probabilities but the predict function applied to an svm model object returns values outside [0, 1]: Variable Data looks like: Present X02 X03 X05 X06 X07 X13 X14 X15 X18 1 0 1634 48 2245.469 -1122.0750 3367.544 11105.013 2017.306 40 23227 2 0 1402 40 2611.519 -811.2500 3422.769 10499.425 1800.475 40 13822 3 0 1379 40 2576.150 -842.8500 3419.000 10166.237 2328.756 37 14200 4 0 1869 51 2645.794 -982.2938 3628.088 9610.037 1699.656 43 20762 ... and bgEnv looks similar: X02 X03 X05 X06 X07 X13 X14 X15 X18 1 1001 39 2521.406 -38.0875 2559.494 48507.312 3925.7563 63 20616 2 1587 39 3148.056 -895.0187 4043.075 5937.669 910.9062 55 15156 3 1610 40 4116.918 172.6812 3944.237 2287.431 196.0312 51 2739 4 1495 43 3678.381 236.9250 3441.456 3298.625 23.9875 86 281 5 1564 43 3010.988 -623.6063 3634.594 3416.350 819.6375 34 3848 ... modelFormula- as.formula(Present ~ X02 + X03 + X05 + X06 + X07 + X13 + X14 + X15 + X18) Model- svm( modelFormula, data=Data, gamma=0.25, cost=4, nu=0.10, kernel='radial', scale=TRUE, type='nu-regression', na.action=na.omit, probability=TRUE ) bgPreds- predict( Model, newdata=bgEnv, type='nu-regression', probability=TRUE ) bgPreds looks like: 11 12 13 14 15 16 17 18 0.54675813 0.37587560 0.39526542 0.67043587 -0.03079247 0.16696996 0.04714134 0.06989950 19 20 0.07615735 0.14923408 Notice the negative value. I can also get values1. I had thought argument probability=TRUE would give probabilities. Any help would be greatly appreciated! Adam Adam B. Smith University of California, Berkeley __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transaction list transformation to use rpart.
On 06/03/11 22:34, John Dennison wrote: [...] from data like Customer-ID | Item-ID cust1 | 2 cust1 | 3 cust1 | 5 cust2 | 5 cust2 | 3 cust3 | 2 ... #read in data to a sparse binary transaction matrix txn = read.transactions(file=tranaction_list.txt, rm.duplicates= TRUE, format=single,sep=|,cols =c(1,2)); #tranaction matrix to matrix a-as(txn, matrix) #matrix to data.frame b-as.data.frame(a) I end up with a data.frame like: X X.1 X.2 X.3 X.4 X.5 ... cust1 01 101 cust2 00 101 cust3 01 000 ... However the as.data.frame(a) transforms the matrix into a numeric data.frame so when I implement the rpart algorithm it automatically returns a regression classification tree. I am not sure your approach with rpart is going to give you what you are looking for, but on to your R question: [...] I can't successfully transform the data.frame to a factor. i tried: b_factor-as.factor(b) Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? You need to do each column individually, i.e. b_factor$X.1 - as.factor(b$X.1) or str( as.data.frame(lapply(b, as.factor)) ) 'data.frame':4 obs. of 4 variables: $ X.2 : Factor w/ 2 levels 0,1: 2 1 2 1 $ X.3 : Factor w/ 2 levels 0,1: 2 2 1 1 $ X.5 : Factor w/ 2 levels 0,1: 2 2 1 1 $ X.Item.ID: Factor w/ 2 levels 0,1: 1 1 1 2 Also have a look at as(txn, data.frame) for a different format that may (with some clean up) be easier to use. as(txn, data.frame) transactionID items 1 cust1{ 2, 3, 5} 2 cust2 { 3, 5} 3 cust3{ 2} 4 Customer-ID { Item-ID} Hope this helps a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] openNLP package error
Apologies that I am late on this thread. On 02/12/10 17:39, Sascha Wolfer wrote: I seem to have a problem with the openNLP package, I'm actually stuck in the very beginning. Here's what I did: install.packages(openNLP) install.packages(openNLPmodels.de, repos = http://datacube.wu.ac.at/;, type = source) library(openNLPmodels.de) library(openNLP) So I installed the main package as well as the supplementary german model. Now, I try to use the sentDetect function: s - c(Das hier ist ein Satz. Und hier ist noch einer - sogar mit Gedankenstrich. Ist das nicht toll?) sentDetect(s, language = de, model = openNLPmodels.de) I get the following error message which I can't make any sense of: Fehler in .jnew(opennlp/maxent/io/SuffixSensitiveGISModelReader, .jnew(java.io.File, : java.io.FileNotFoundException: openNLPmodels.de (No such file or directory) The correct syntax seems to be sentDetect(s, model = system.file(models, de-sent.bin, package = openNLPmodels.de)) but unfortunately I get Error in .jcall(.jnew(opennlp/maxent/io/SuffixSensitiveGISModelReader, : java.io.UTFDataFormatException: malformed input around byte 48 YMMV. But you get the idea on the syntax of the model= argument. This works: sentDetect(s, model = system.file(models, sentdetect, EnglishSD.bin.gz, package = openNLPmodels.en)) # [1] Das hier ist ein Satz. # [2] Und hier ist noch einer - sogar mit Gedankenstrich. # [3] Ist das nicht toll? Hope this helps you a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Questions about Probit Analysis
Sorry I am a bit late to this discussion, but I can't see if you ever got an answer. Anyhow, on your first question: On 31/10/10 19:14, Lorenzo Isella wrote: Dear All, I have some questions about probit regressions. I saw a nice introduction at http://bit.ly/bU9xL5 and I mainly have two questions. (1) The first is almost about data manipulation. Consider the following snippet [...] mydata - read.csv(url(http://www.ats.ucla.edu/stat/r/dae/binary.csv;)) names(mydata) - c(outcome,x1,x2,x3) myprobit - glm(mydata$outcome~mydata$x1+mydata$x2+as.factor(mydata$x3), family=binomial(link=probit)) [...] #Now assume I can make a regression only on x1 myprobit2 - glm(mydata$outcome~mydata$x1, family=binomial(link=probit)) [...] Finally, I generate the data frame mydatanew (see some of its entries below) mydatanew x1 successes failures 1 220 01 2 300 12 3 340 13 4 360 04 5 380 08 [...] where for every value of x1 I count the number of 0 and 1 outcomes (namely number of failures and number of successes). [...] How can I actually feed R with mydatanew to perform again a logistic regression on x1 only? myprobit3 - glm(cbind(successes, failures) ~ x1, family=binomial(link=probit), data = mydatanew ) all.equal(coef(myprobit2), coef(myprobit3), check.attributes = FALSE) # [1] TRUE Your second question we could discuss offline: it is not really an R question, but you might want to have a look at something like the MNP and perhaps survey packages. Hope this helps a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [semi-OT] Using fortune() in an email signature
On 01/09/10 22:18, Dirk Eddelbuettel wrote: [...] Doing the same with Rscript is left as an exercise I like exercises: Rscript --default-packages=fortunes -e print(fortune()) Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding pairs
Something like this may get you started library(igraph) df- data.frame(A=0:7, B=c(rep(NA,3), 0, 1, 3, 7, 2)) # Assuming this is your data b- as.matrix(df) print(b) # A B #[1,] 0 NA #[2,] 1 NA #[3,] 2 NA #[4,] 3 0 #[5,] 4 1 #[6,] 5 3 #[7,] 6 7 #[8,] 7 2 z- graph.edgelist(na.omit(b)) neighborhood(z, Inf, mode = in) #[[1]] #[1] 0 3 5 # #[[2]] #[1] 1 4 # #[[3]] #[1] 2 7 6 # #[[4]] #[1] 3 5 # #[[5]] #[1] 4 # #[[6]] #[1] 5 # #[[7]] #[1] 6 # #[[8]] #[1] 7 6 Hope this helps Allan On 24/08/10 19:28, Mike Rhodes wrote: Dear R Helpers, I am a newbie and recently got introduced to R. I have a large database containing the names of bank branch offices along-with other details. I am into Operational Risk as envisaged by BASEL II Accord. I am trying to express my problem and I am using only an indicative data which comes in coded format. A (branch) B (controlled by) 144 145 146 147 144 148 145 149 147 151 146 .. ... .. ... where 144's etc are branch codes in a given city and B is subset of A. If a branch code appearing in A also appears in B (which is paired with some otehr element of A e.g. 144 appearing in A, also appears in B and is paired with 147 of A and likewise), then that means 144 is controlling operations of bank office 147. Again, 147 itself appears again in B and is paired with bank branch coded 149. Thus, 149 is controlled by 147 and 147 is controlled by 144. Likewise there are more than 700 hundred branch name codes available. My objective is to group them as follows - Bank Branch 144 147149 145 146 151 148 . or even the following output will do. 144 147 149 145 146 151 148 151 .. I understand I should be writing some R code to begin with which I had tried also but as of now I am helpless. Please guide me. Mike [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Remove Loading required package message
On 27/08/10 10:19, Barry Rowlingson wrote: On Fri, Aug 27, 2010 at 9:09 AM, Sébastien Moretti sebastien.more...@unil.ch wrote: [...] require it quietly: require(sp,quietly=TRUE) Doesn't work for me: require(Matrix, quietly=TRUE) Loading required package: lattice Attaching package: 'Matrix' The following object(s) are masked from 'package:base': det sessionInfo() R version 2.12.0 Under development (unstable) (2010-08-30 r52845) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Matrix_0.999375-43 lattice_0.18-8 ctv_0.6-0 loaded via a namespace (and not attached): [1] grid_2.12.0 tools_2.12.0 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] any statement equals to 'goto'?
Sounds like you want help(tryCatch) # Catch the error and do something help(next) # Go to the next value of the surrounding loop Hope this helps a little. Allan On 31/08/10 19:34, karena wrote: I have the following code: - result- matrix(NA, nrow=1, ncol=5) for(i in 1:(nsnp-1)) { for(j in (i+1):nsnp){ tempsnp1- data.lme[,i] tempsnp2- data.lme[,j] fm1- lme(trait~sex+age+rmtemp.b+fc+tempsnp1+tempsnp2+tempsnp1*tempsnp2, random=~1|famid, na.action=na.omit) fm2- lme(trait~sex+age+rmtemp.b+fc+tempsnp1+tempsnp2, random=~1|famid, na.action=na.omit) result[1,1]- i result[1,2]- j result[1,3]- colnames(data.lme)[i] result[1,4]- colnames(data.lme)[j] result[1,5]- 2*(summary(fm1)$logLik-summary(fm2)$logLik) if (i==1 j==2) results- result if (i!=1 | j!=2) results- rbind(results, result) } - after submitting this code, I got the error message saying Error in MEEM(object, conLin, control$niterEM) : Singularity in backsolve at level 0, block 1, I know this might be some issue due to the missing values. what I want to do is to skip the 'bad' variable, for example, I got this error message when j=116, so I just wanna ignore variable116, is there a statement that can do this: when getting this error message 'Error in MEEM...', directly goto the next variable? thank you, karena __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Shifting of Principal amount while calculating Present Value
Matrix operations work the other way to what you expect on rows and columns. Try ## Your data: time - c(1, 2, 3) # Please don't use `t' as a variable name cash_flow - c(7, 7, 107) zero_rate - data.frame(rating=factor(c(AAA,AA,A,B)), year1=c(3.60, 3.65, 3.72, 4.10), year2=c(4.17,4.22,4.32,4.67), year3=c(4.73,4.78,4.93,5.25)) zero_rate1 - zero_rate[, -1] ## Make clear we are dealing with matrices (not really needed) zero_rate2 - as.matrix(zero_rate1) ## The calculation colSums(cash_flow/(t(1 + zero_rate2 / 100)^time)) # [1] 106.3549 106.2122 105.7969 104.8871 Here t() is the transpose function. Try to step through each part of the calculation to figure it out. You may also want to look at c(1, 2, 3) / matrix(1, 3, 3) and matrix(sqrt(2), 3, 3)^c(1, 2, 4) for smaller examples. Hope this helps a little. Allan. On 20/08/10 12:24, Sarah Sanchez wrote: Dear R Helpers I have following data - cash_flow = c(7, 7, 107) # 107 = Principle 100 + interest of 7% t = c(1,2,3) and zero rate table as rating year1 year2 year3 AAA3.604.17 4.73 AA 3.654.22 4.78 A 3.72 4.32 4.93 BBB4.104.67 5.25 For each of these ratings I need to calculate the Present Value as (say e.g. for AAA) PV(AAA) = 7/(1+3.60/100) + 7/(1+4.17/100)^2 + 107/(1+4.73/100)^3 which is equal to 106.3549 Similarly when used the respective rates, PV(A) = 106.2122, PV(A) = 105.7969 and PV(BBB) = 104.8871 # ## My problem I have tried the following R code. zero_rate = read.csv('zero_rate_table.csv') zero_rate1 = zero_rate[, -1] cash_flow = c(7, 7, 107) t = c(1,2,3) PV_table = cash_flow / (1+zero_rate1/100)^t ## Then using rowSums, I should get the required result. However, I am getting following output as PV_table year_1year_2 year_3 [1,] 6.756757 6.45078 93.147342 [2,] 6.515675 94.521493 6.680664 [3,] 95.8950646.710123 6.357680 [4,] 6.724304 6.389305 91.773536 rowSums(PV_table) gives [1] 106.35489 107.71783 108.96287 104.88714. Thus, the result is correct only in first case. In second case onwards, there is a shift of final cashflow e.g. in 2nd case, pertaining to year 2, value is 94.521493 which means it includes principal of 100. Likewise in third case, the principal is included in the first cash-flow itself. Again the fourth result is correct. So there is pre-shift of capital. I am not sure whether I am making any mistake in reading the zero_rate csv file or my formula is incorrect. Please guide. Thanking you all in advance Sarah [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] OT: Re: Fwd: R SOFTWARE
On 18/08/10 10:46, Nokuzola Simakuhle wrote: NB: This email and its contents are subject to the Eskom Holdings Limited EMAIL LEGAL NOTICE which can be viewed at http://www.eskom.co.za/email_legalnotice Since your email policy allows for no copying or distribution of your message, you should probably set the X-No-Archive message header. Google it, if you don't know how. I don't actually know if the mailing list honours that header, but at least you've tried. Can't really expect the computer to read and understand your PDF with three pages of legalese. Allan (who collects these kinds of notices) | | [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear regression on several groups
Please read the posting guide and include a standalone example. Maybe you want something like the results from lm(weight ~ Time, data = ChickWeight, subset = Diet==1) lm(weight ~ Time, data = ChickWeight, subset = Diet==2) ## ... etc ... Then you could do (m - lm(weight ~ Time*Diet, data = ChickWeight)) To get the Diet==2 coefficients from above you could use something like sum(coef(m)[c((Intercept), Diet2)]) # Intercept sum(coef(m)[c(Time, Time:Diet2)])# Slope Hope this helps a little. Allan On 12/08/2010 17:11, JesperHybel wrote: I have a simple dataset of a numerical dependent Y, a numerical independent X and a categorial variable Z with three levels. I want to do linear regression Y~X for each level of Z. How can I do this in a single command that is without using lm() applied three isolated times? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [OT] R on Atlas library
I don't know about the specific function (a simple, stand-alone reproducible example is always helpful, see the posting guide for details), but ATLAS certainly uses all my cores on a test like this: s - 7500 # Adjust for your hardware a - matrix(rnorm(s*s), ncol = s, nrow = s) b - crossprod(a) # Uses all cores here. My configuration of R with ATLAS on Linux (Fedora) is described at http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html Maybe your distribution has single-threaded ATLAS and you forgot to rebuild it with enable_native_atlas 1 or the equivalent for your platform? Allan On 06/08/10 15:12, Matthias Gondan wrote: Dear List, I am aware this is slightly off-topic, but I am sure there are people who already had the problem and who perhaps solved it. I am running long-lasting model fits using constrOptim command. At work there is a linux computer (Quad Core, debian) on which I already have compiled R and Atlas, in the hope that things will go faster on that machine. Atlas offers the possibility to be compiled for multiple cores, I used that option, but without success. Performance meter (top) for Linux shows 25% CPU usage, and there is no loss of performance if I run 4 instances of R doing heavy matrix multiplications. Performance drops when a 5th instance of R is doing the same job, so I assume my attempt was not successful. I am sure I did something wrong. Is there anybody who sucessfully runs R with Atlas and all processors? A short description of the necessary steps would be helpful. Searching around the internet was not very encourageing. Some people wrote that it is not so simple to have Atlas fully exploit a multicore computer. I hope this is not too unspecific. Best wishes, Matthias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About R base
Yes, there are differences between R on Windows and R on Linux. You probably want this document: http://cran.r-project.org/bin/windows/base/rw-FAQ.html Hope this helps a little. Allan On 06/08/10 16:47, Stephen Liu wrote: Hi folks, I have R x64 2.11.1 installed on Win 7 64 bit which is running as VM (guest) on Oracle VirtualBox. Is there any difference between it and R on Linux? I can install another R on Debian 5.04, also running as VM. TIA B.R. Stephen L __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory use without running gc()
How about asking the operating system, e.g. ### Method 1 ## Setup cmd - paste(ps -o vsz, Sys.getpid()) ## In your logging routine z - system(cmd, intern = TRUE) cat(Virtual size: , z[2], \n, sep = ) ### Method 2 ## Setup file - paste(/proc, Sys.getpid(), stat, sep = /) what - vector(list, 44); what[[23]] - integer(0) ## In your logging routine vsz - scan(file, what = what, quiet = TRUE)[[23]]/1024 cat(Virtual size: , vsz, \n, sep = ) ### Time the methods library(rbenchmark) benchmark(scan = scan(file, what = what, quiet = TRUE)[[23]]/1024, system = system(cmd, intern = TRUE), columns = c(test, replications, elapsed, relative), order = relative) # test replications elapsed relative # 1 scan 100 0.015 1. # 2 system 100 2.408 160.5333 The scan method should be plenty fast. Change as appropriate if your definition of memory is different from virtual size. Hope this helps a little. Allan On 06/08/10 19:11, Johann Hibschman wrote: jim holtmanjholt...@gmail.com writes: ?memory.size Only works on Windows. I guess I should have specified; this is on Linux. Thanks, Johann __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] REmove level with zero observations
On 03/08/10 21:50, GL wrote: If I have a column with 2 levels, but one level has no remaining observations. Can I remove the level? Like this? d - data.frame(a = factor(rep(A, 3), levels = c(A, B))) levels(d$a) # [1] A B d$a - d$a[,drop=TRUE] levels(d$a) # [1] A Hope this helps Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is there paid support for R?
On 02/08/10 13:17, Chris Murphy wrote: Hi, I am trying to get R installed at work and I was asked if there were any companies that offered support. There are. What country are you looking to support? What do you mean by support? (It can be anything from installation through SLAs for help on specific packages to someone we can sue) Allan. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using apply for logical conditions
`|` is a binary operator which is why the apply will not work. See help(Reduce) For example, set.seed(1) data - data.frame(A = runif(10) 0.5, B = runif(10) 0.5, C = runif(10) 0.5) Reduce(`|`, data) # [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE Hope this helps. Allan On 02/08/10 21:35, Alastair wrote: Hi, I've got some boolean data in a data.frame in the form: XYZA B C [1] T TFT F F [2] F TTF F F . . . What I want to do is create a new column which is the logical disjunction of several of the columns. Just like: new.column- data$X | data$Y | data$Z However I don't want to hard code the particular columns into the expression like that. I've tried using apply row wise with `|` as the function: columns- c(X,Y,Z) apply(data[,columns], 1,`|`) This doesn't seem to do what I would have expected, does anyone have any advice how to use the the apply or similar function to perform a boolean operation on each row (and a specific subset of the columns) in a data frame? Thanks, Alastair __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading timestamp column from MySQL
On 29/07/10 00:20, harsh yadav wrote: Hi, I am reading a SQL (MySQL) table in R data frame. When I read in the table that has a timestamp data-type field, R gives it the following format:- 1.236887e+12 So when I want to manipulate a column with timestamp = 1236887146615 It returns me multiple rows, as many timestamps gets converted to same value: 1.236887e+12 Any ideas of how this could be dealt with, so that I can get the entire timestamp field. It should just be a printing issue, not a problem with the data, in which case you can just run you analysis as usual. If it *is* an issue with the data, and not the printing, then we probably need more information, see the posting guide. If you are worried about the formatting, look at options(scipen); try a value of 5 or so to get you started. [*Off topic*: why does options(scipen = 2^31-13); print(1236887146615); produce different output [1] from options(scipen = 2^31-12); print(1236887146615); and why does the rather intuitive approach suggested by one of my colleagues, options(scipen = +Inf); print(1236887146615); produce warnings (and why does warnings() then produce even more warnings, ad infinitum)? ] Hope this helps, Allan [1] at least under trunk and also 2.11.1 sessionInfo() R version 2.12.0 Under development (unstable) (2010-07-28 r52631) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ctv_0.6-0 loaded via a namespace (and not attached): [1] tools_2.12.0 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Function to return variable name
On 29/07/10 04:21, Jeremy Miles wrote: I'd like a function that returns the variable name. As in: MyData$Var1 Would return: Var1 Not quite sure what you mean, but does this get you started? nn - function(x) deparse(substitute(x)) str( z - nn(airquality$Month) ) # chr airquality$Month ## Not sure what now - maybe: strsplit(z, $, fixed = TRUE)[[1]][-1] # [1] Month You'll have to figure out with a variable called foo$bar$baz and with airquality[Month] etc. Hope this helps a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Checking package licences including dependencies?
I only recently discovered options(available_packages_filters = list(add = TRUE, license/FOSS)) [cf. help(available.packages, package=utils) in R 2.10.0 or later] which goes nicely with my options(checkPackageLicense = TRUE) [new in R 2.11]. But now I want to purge my library of packages I would not have installed had I known about this option earlier (I'm looking at you, gam!). Short of erasing the whole directory of libraries and re-installing it, is there an easy way of achieving this? I could probably roll something based on tools:::analyze_license() but I think the erase-and-reinstall option might be easier in this case :-) Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Checking package licences including dependencies?
On 27/07/10 13:20, Ben Bolker wrote: Allan Engelhardtallaneat cybaea.com writes: I only recently discovered options(available_packages_filters = list(add = TRUE, license/FOSS)) [cf. help(available.packages, package=utils) in R 2.10.0 or later] which goes nicely with my options(checkPackageLicense = TRUE) [new in R 2.11]. But now I want to purge my library of packages I would not have installed had I known about this option earlier (I'm looking at you, gam!). What's the problem with gam? The ACM license on akima (on which it depends) that requires permission/licensing for commercial use? In a word: yes. It depends on a non-FOSS package. This is what the new options protect us against. For this particular project, I want to avoid all non-FOSS dependencies. [OT: Specifically for gam, I am not sure it is really OK to release it as GPL-2 when it depends on a non-free library that, as far as I can see, does not meet the definition of a system library; see http://www.gnu.org/licenses/gpl-faq.html#GPLIncompatibleLibs for a discussion. But IANAL and for this project is is a moot point: I want to be clean.] Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Checking package licences including dependencies?
Great, thanks. I couldn't quite get your syntax to work, but z - packageStatus(.libPaths()[1])[[1]] unname( z$Package[z$Status == unavailable] ) seems to do the trick for me. Thanks again. Allan On 27/07/10 16:31, Prof Brian Ripley wrote: If I understand you correctly, set the filter and use packageStatus(). Its summary() method tells you which packages you have installed which are 'unavailable'. E.g. my Mac (with pkgType = source) shows in R-devel summary(packageStatus(.libPaths()[1]))$Libs[[1]]$unavailable [1] BayesXEVER TSA akima degreenet difR [7] ergm ffgam isa2 latentnet locfit [13] mapproj rtiff statnet tripack On Tue, 27 Jul 2010, Allan Engelhardt wrote: I only recently discovered options(available_packages_filters = list(add = TRUE, license/FOSS)) [cf. help(available.packages, package=utils) in R 2.10.0 or later] which goes nicely with my options(checkPackageLicense = TRUE) [new in R 2.11]. But now I want to purge my library of packages I would not have installed had I known about this option earlier (I'm looking at you, gam!). Short of erasing the whole directory of libraries and re-installing it, is there an easy way of achieving this? I could probably roll something based on tools:::analyze_license() but I think the erase-and-reinstall option might be easier in this case :-) Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the real dimnames
On 26/07/10 19:01, Michael Lachmann wrote: Hi, R seems to have a feature that isn't used much, which I don't really now how to call. But, the dimnames function, can in addition to giving names to rows/columns/dim 3 rows/dim 4 rows... can also give labels to the dimensions themselves. Thus, I can do: A = matrix(1:9,3,3) dimnames(A) = list(from=c(), to=c() ) If you just want to set the names of the dimnames, just do names(dimnames(A)) - c(from, to) Remember that dimnames is a list and all lists have names. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] start and end times to yes/no in certain intervall
I like loops for this kind of thing so here is one: df- structure(list(start = c(15:00, 15:00, 15:00, 11:00, 14:00, 14:00, 15:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00), end = c(16:00, 16:00, 16:00, 12:00, 16:00, 15:00, 16:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00)), .Names = c(start, end), row.names = c(NA, 20L), class = data.frame) duration- with(df, floor( as.numeric( difftime(strptime(end, format=%H:%M), strptime(start, format=%H:%M), units = hours start- as.integer(substr(df$start, 1, 2)) a- matrix(FALSE, nrow = NROW(df), ncol = 24, dimnames = list(1:NROW(df), paste(t, 0:23, sep = ))) for (i in 1:NROW(df)) { names- paste(t, seq(start[i], by = 1L, length.out = duration[i]), sep = ) a[i, names]- TRUE } r- range(which(apply(a, 2, any))) a- a[, r[1]:r[2]] # Drop columns we do not need head(a) # t11 t12 t13 t14 t15 # 1 FALSE FALSE FALSE FALSE TRUE # 2 FALSE FALSE FALSE FALSE TRUE # 3 FALSE FALSE FALSE FALSE TRUE # 4 TRUE FALSE FALSE FALSE FALSE # 5 FALSE FALSE FALSE TRUE TRUE # 6 FALSE FALSE FALSE TRUE FALSE Hope this helps a little. Allan On 23/07/10 11:02, Stefan Uhmann wrote: Hi List, I have start and end times of events structure(list(start = c(15:00, 15:00, 15:00, 11:00, 14:00, 14:00, 15:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00), end = c(16:00, 16:00, 16:00, 12:00, 16:00, 15:00, 16:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00)), .Names = c(start, end), row.names = c(NA, 20L), class = data.frame) and I would like the data to look like this: t9 t10 t11 t12 t13 t14 t15 t16 t17 1 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE 2 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE 3 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE 4 FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE 5 FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE 6 FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE 7 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE 8 FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE 9 FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE 10 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE Which means, that I just get a TRUE for every hour the event was taking place. A finishing time of 16:00 means that t16 is FALSE, because the event was finished until 16:00; 16:15 as end time would result in t16 being TRUE. It would be nice if the function would add the variables needed (t9 ..) as well and depending on the times put in (no t9 if there is no event starting before 10:00). Thanks for any suggestion, Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regex in an IF statement?
help(difftime) is probably what you want. If not, please post a standalone example Allan On 23/07/10 15:18, Jim Hargreaves wrote: Dear List I have the POSIX timestamps of two events, A and B respectively. I want an expression that tests if A occurs within 4 weeks of B. Something like: if (ts_A is between ts_B + 4(weeks) and ts_B - 4) ... Can probably done by regex but the regex docu for R says nothing about if statements. Any help greatly appreciated, Regards, Jim Hargreaves __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting a time to nearest half-hour
The arithmetic that David describes should work fine (POSIXct is internally in UTC) unless you are in the Chatham Islands (which has a UTC+12:45 time zone [1]) or Nepal (UTC+05:45 [2]) or some such place with a funny idea about when the 1/2 hour mark is. The formatting of the output may be tricky, if the original poster really want the (past/future) time zone as opposed to the current one, which is what his text could be read to suggest. Good test cases would be 2010-03-28 01:46:00 GMT which should (presumably) round to 2010-03-28 03:00:00 BST and 2010-10-31 02:46 BST which probably rounds to 2010-10-31 02:00 GMT. I can't quite get it to work in R, but it should be possible Allan [1] https://secure.wikimedia.org/wikipedia/en/wiki/UTC%2B12:45 [2] https://secure.wikimedia.org/wikipedia/en/wiki/UTC%2B05:45 On 23/07/10 16:35, David Winsemius wrote: On Jul 23, 2010, at 11:20 AM, murali.me...@avivainvestors.com murali.me...@avivainvestors.com wrote: Hi folks, I've got a POSIXct datum as follows: Sys.time() [1] 2010-07-23 11:29:59 BST I want to convert this to the nearest half-hour, i.e., to 2010-07-23 11:30:00 BST (If the time were 11:59:ss, I want to convert to 12:00:00). How to achieve this? Couldn't you just coerce to numeric, divide by 60(sec)*30(half-hour minutes), round to integer, multiply by 60*30, coerce to POSIXct? _ David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to deal with more than 6GB dataset using R?
read.table is not very inefficient IF you specify the colClasses= parameter. scan (with the what= parameter) is probably a little more efficient. In either case, save the data using save() once you have it in the right structure and it will be much more efficient to read it next time. (In fact I often exit R at this stage and re-start it with the .RData file before I start the analysis to clear out the memory.) I did a lot of testing on the types of (large) data structures I normally work with and found that options(save.defaults = list(compress=bzip2, compression_level=6, ascii=FALSE)) gave me the best trade-off between size and speed. Your mileage will undoubtedly vary, but if you do this a lot it may be worth getting hard data for your setup. Hope this helps a little. Allan On 23/07/10 17:10, babyfoxlo...@sina.com wrote: nbsp;Hi there, Sorry to bother those who are not interested in this problem. I'm dealing with a large data set, more than 6 GB file, and doing regression test with those data. I was wondering are there any efficient ways to read those data? Instead of just using read.table()? BTW, I'm using a 64bit version desktop and a 64bit version R, and the memory for the desktop is enough for me to use. Thanks. --Gin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to deal with more than 6GB dataset using R?
On 23/07/10 17:36, Duncan Murdoch wrote: On 23/07/2010 12:10 PM, babyfoxlo...@sina.com wrote: [...] You probably won't get much faster than read.table with all of the colClasses specified. It will be a lot slower if you leave that at the default NA setting, because then R needs to figure out the types by reading them as character and examining all the values. If the file is very consistently structured (e.g. the same number of characters in every value in every row) you might be able to write a C function to read it faster, but I'd guess the time spent writing that would be a lot more than the time saved. And try the utils::read.fwf() function before you roll your own C code for this use case. If you do write C code, consider writing a converter to .RData format which R seems to read quite efficiently. Hope this helps. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sparsity
The functions {summary,bandwidth}.rq() in the package quantreg looks promising, but it is not really my area... http://lmgtfy.com/?q=Hall-Sheater+sparsity+%22r-project%22 Hope this helps a little. Allan On 23/07/10 21:02, Nathalie Gimenes wrote: Hi All, I would like to estimate the sparsity function using Hall Sheater bandwidth. Is there any command for this? Thanks, NGS [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] please help me for this Error: cannot listen to TCP port 15259
The short-term work-around is probably to use the help_type= argument to help, e.g. help(help, help_type=text) Hope this helps a little. Allan On 22/07/10 07:30, Na.Ebrahimi wrote: hi, I get an Error when I try to run function help in R, it says starting httpd help server ...Error: cannot listen to TCP port 15259 I would be so glad if you help me __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Drop firms in unbalanced panel if not more than 5 observations in consecutive years for all variables
One option: /## Your data:/ data - structure(list(id = structure(c(1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L), .Label = c(a, b, c, d), class = factor), year = c(2000L, 2000L, 2001L, 1999L, 2000L, 2001L, 2002L, 1998L, 1999L, 2000L, 2001L, 2002L), y = c(1L, NA, 3L, 1L, 2L, 4L, 5L, 6L, 5L, 6L, 7L, 3L), z = c(1L, 2L, 3L, 1L, 2L, NA, 4L, 5L, NA, 6L, 7L, 6L)), .Names = c(id, year, y, z), class = data.frame, row.names = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)) /## Drop missing data/ Data - na.omit(data) /## Find three years in a row with data/ n - names(which(apply(table(Data$id, Data$year), 1, function(x) {z-rle(x); z-z[z$values0]; any(z$lengths=3)}))) /## Now get the original data for these ids/ data[data$id %in% n, ] /#id year y z # 8 d 1998 6 5 # 9 d 1999 5 NA # 10 d 2000 6 6 # 11 d 2001 7 7 # 12 d 2002 3 6/ This is probably not optimal, but I hope it helps a little. Allan On 22/07/10 10:18, Christian Schoder wrote: Dear R-user, a few weeks ago I consulted the list-serve with a similar question. However, my task changed a little but sufficiently to get lost again. So I would appreciate any help on the following issue. I use the plm package and work with firm-level data in a panel. I would like to eliminate all firms that do not fulfill the requirement of having an observation in every variable used for at least x consecutive years. For illustration of the problem assume the following data set data id year y z 1 a 2000 1 1 2 b 2000 NA 2 3 b 2001 3 3 4 c 1999 1 1 5 c 2000 2 2 6 c 2001 4 NA 7 c 2002 5 4 8 d 1998 6 5 9 d 1999 5 NA 10 d 2000 6 6 11 d 2001 7 7 12 d 2002 3 6 where id is the index of the firm, year the index for the year, and y and z are variables. Now, I would like to get rid of all firms with, let's say, less than 3 consecutive years in which there are observations for every variable. Hence, the procedure should yield data.reduced id year y z 1 d 1998 6 5 2 d 1999 5 NA 3 d 2000 6 6 4 d 2001 7 7 5 d 2002 3 6 Thank you very much for any help! Cheers, Christian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL/R
There are so many ways Here is one: aggregate(v ~ u, data=X, function(...) length(unique(...))) #u v # 1 T1 2 # 2 T2 1 Hope this helps Allan. On 22/07/10 12:52, Gildas Mazo wrote: Dear R users, I want to aggregate data in the following way: ### X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a)) X library(sqldf) sqlOut- sqldf(select count(distinct(v)) from X group by u) sqlOut ### Now I want to get the same result without using SQL. How can I achieve that ? Thanks for your help, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave special characters problem
A standalone example is always helpful. The following works for me, so I am probably not understanding your problem: ---[BEGIN: umlaut.Rnw]--- \documentclass[a4paper]{article} \usepackage{ucs} \usepackage[utf8x]{inputenc} \begin{document} test,echo=TRUE= x - data.frame(Geschäftslage=1:10) summary(x) @ \end{document} ---[END: umlaut.Rnw]--- $ R CMD Sweave umlaut.Rnw $ R CMD pdflatex umlaut.tex $ gnome-open umlaut.pdf Allan On 22/07/10 13:19, Bunny, lautloscrew.com wrote: Dear all, I use Sweave to create my reports. Unfortunately my script crashes whenever I my R code contains special characters like umlauts. Is there a way to to escape special characters in Sweave... This is the line that crashes Sweave: gl_bybranch = ddply(new_wans,.(period,Branchen), function(X) data.frame(Geschäftslage=mean(X$sentiment))) Unfortunately I can't just rename it, because I it´s displayed in the legend of graphics later on. Thx for any suggestions! best matt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Recommended way of requiring packages of a certain version?
What is the recommended way of requiring a certain version when loading a package (or, indeed, from R itself)? Perl has the require module version use module version require version use version constructs which is kind of what I am looking for (especially 'use' which is evaluated at compile time), but R seems to have lost the version= argument to require(). The best I have been able to come up with are constructs of the form if ( utils::compareVersion(utils::packageDescription(data.table, fields=Version), 1.5) 0 ) stop(Need data.table version 1.5 or later) if ( utils::compareVersion(as.character(getRversion()), 2.11) = 0 ) stop(Does not work yet with latest R.) But this is tedious and error prone. I have created my own require() function to automate this (it takes a vector of versions and a vector of comparisons and only load the package if they are all met), but it is non-standard and just that much harder for my colleagues to maintain. Somebody must already have done this? Allan PS: Shouldn't utils::compareVersion(2.11.0, 2.11) return zero instead of one? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: coeficient of determination
Try RSiteSearch(coefficient of determination) which seems to list some promising candidates. Allan On 15/07/10 16:47, Alireza Ehsani wrote: Hello there how can i calculate coefficient of determination with R? Kind regards Ali Reza Ehsani Ph.d. student [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Recommended way of requiring packages of a certain version?
On 16/07/10 09:32, Paul Hiemstra wrote: Hi Allan, When you create an R package you can specify in the DESCRIPTION file that your package depends on a certain R version and versions of packages. For example: [...] Depends: R (= 2.7.0), methods, sp (= 0.9-4), gstat (= 0.9-58) Thanks Paul, this is helpful. It doesn't quite do what I want because support for operators other than `=` is erratic (the documentation at [1] seems to suggest that R CMD INSTALL supports any operator while install.packages() only supports `=`). Also I am not sure if it is checked on every package load? [...] So distributing code to other people is preferably done using R packages, which gives you this option. Agreed in principle (and that is kind of what I am developing), but sometimes you just want to send a piece of analysis you are working on. But thanks for the pointer. Allan [1] http://cran.r-project.org/doc/manuals/R-exts.html#fn-2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-pkgs] New package list for analyzing list surveyexperiments
On 13/07/10 19:16, Erik Iverson wrote: Raubertas, Richard wrote: I agree that 'list' is a terrible package name, but only secondarily because it is a data type. The primary problem is that it is so generic as to be almost totally uninformative about what the package does. For some reason package writers seem to prefer maximally uninformative names for their packages. To take some examples of recently announced packages, can anyone guess what packages 'FDTH', 'rtv', or 'lavaan' do? Why the aversion to informative names along the lines of 'Freq_dist_and_histogram', 'RandomTimeVariables', and 'Latent_Variable_Analysis', respectively? I'm sure it's part tradition... ls cat rm cp mv su No need to leave R, which has some 9 one-letter symbols, 35 two-letter symbols, and 121 three letter symbols in the base and core packages. It is just unreal. (The equivalent numbers from my Linux system are 3, 56, and 148, and one of those one-letter commands is R!!) Quiz yourself on these: c C D F I q s t T ar as bs by cm de df dt el gc gl if Im is lh lm ls lu ns pf pi pt qf qq qr qt Re rf rm rt sd te ts VA vi abs acf ACF AIC all aml any aov Arg ave bam bcv bdf BIC bmp BOD box bxp cat cav ccf cch cd4 cgd co2 CO2 col cor cos cov cut DDT det dim Dim dir end exp fft fgl fir fix for gam get glm gls Gun hat hcl hsv IGF IQR Kfn knn lag lcm lda lme log lqs mad Map max mca min mle Mod new nlm nls npk nsl OME one Ops pam par pbc PBG pdf pie png ppr qda raw rep rev rfs rgb rig rle rlm row rug seq sin SOM SSD SSI stl str sub sum svd svg tan tar tau tcl tmd try tsp two ucv unz url var x11 X11 xor Generated from R --vanilla with: for (p in c(base, boot, class, cluster, codetools, datasets, foreign, graphics, grDevices, grid, KernSmooth, lattice, MASS, Matrix, methods, mgcv, nlme, nnet, rpart, spatial, splines, stats, stats4, survival, tcltk, tools, utils)) library(p, character.only=TRUE) rm(p) one - unique(grep(^[[:alnum:]]+$, apropos(^.$), value=TRUE)) two - unique(grep(^[[:alnum:]]+$, apropos(^..$), value=TRUE)) three - unique(grep(^[[:alnum:]]+$, apropos(^...$), value=TRUE)) and from the bash shell with ls -1 {/usr,}/bin/? 2/dev/null ls -1 {/usr,}/bin/?? | perl -ne 'print substr $_,-3' | sort -u | wc -l ls -1 {/usr,}/bin/??? | perl -ne 'print substr $_,-4' | sort -u | wc -l Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SARIMA model
Try RSiteSearch(SARIMA) Allan On 13/07/10 14:15, FMH wrote: Dear All, Could someone please advice me the appropriate package for fitting the SARIMA model? Thanks Fir __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question regarding varImpPlot results vs. model$importance data on package RandomForest
Use the source, Luke. varImpPlot calls randomForest:::importance.randomForest (yeah, that is three colons) and reading about the scale= parameter in help(importance, package=randomForest) should enlighten you. For the impatient, try varImpPlot(mtcars.rf, scale=FALSE) Hope this helps a little. Allan On 14/07/10 00:46, Mike Williamson wrote: Hi everyone, I have another Random Forest package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the % increase in MSE and IncNodePurity exactly as can be found from the importance section of the model results. - However, the plot does not, in fact, match the importance section of the random forest model. E.g., if you use the example given in the ?randomForest, you will see the plot showing the highest few %IncMSE values around 17 or 18%. But if you look at the $importance, it is 9.7, 9.4, 7.7, and 7.3. Perhaps more importantly, for the plot, it will show wt is highest %MSE, then disp, then cyl, then hp; whereas the $importance will show wt, then disp, then hp, then cyl. And the ratios look somewhat different, too. Here is the code for that example: set.seed(4543) data(mtcars) mtcars.rf- randomForest(mpg ~ ., data=mtcars, ntree=1000, keep.forest=FALSE, importance=TRUE) varImpPlot(mtcars.rf) I am using version 2.11.1 of 'R' and version 4.5-35 of Random Forest. I don't really care or need for the varImpPlot to work just right. But I am not sure which is accurate: the varImpPlot or the $importance section. Which should I trust more, especially when they disagree appreciably? Thanks! Mike Telescopes and bathyscaphes and sonar probes of Scottish lakes, Tacoma Narrows bridge collapse explained with abstract phase-space maps, Some x-ray slides, a music score, Minard's Napoleanic war: The most exciting frontier is charting what's already here. -- xkcd -- Help protect Wikipedia. Donate now: http://wikimediafoundation.org/wiki/Support_Wikipedia/en [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convergent series
Not 100% if this is what you are looking for, but maybe Reduce(+, x) will do it? E.g. Reduce(+, 1/2^(1:10)) # [1] 0.9990234 Hope this helps. Allan On 14/07/10 11:57, David Bickel wrote: What are some reliable R functions that can compute the value of a convergent series? David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qplot in ggplot2 not working any longer - (what did I do?)
The problem is that install.packages() and friends forks R and there is no way to specify general options to that process (you can do the configure.{args,vars} but nothing else). One work-around could be to download the package and install it with R --vanilla CMD INSTALL from the command line. But it is probably better to fix your .Rprofile so it doesn't break the install; see interactive() and commandArgs() for functions that may be useful. Hope this helps a little. Allan On 14/07/10 22:03, stephen sefick wrote: This is the first time that I have tried to update packages with a tinkered around with .Rprofile. I start R with R --vanilla and it does not load my .Rprofile, but when I issue the command update.packages() R downloads the packages as expected, but then seems to load .Rprofile before compiling the packages sources. What am I doing wrong? kindest regards, Stephen Sefick see- Session info output and .Rprofile Ubuntu 10.04 R 2.11.1 Session info: R version 2.11.1 (2010-05-31) x86_64-pc-linux-gnu locale: [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C [3] LC_TIME=en_US.utf8LC_COLLATE=en_US.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8 [7] LC_PAPER=en_US.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C attached base packages: [1] graphics grDevices utils datasets stats grid methods [8] base other attached packages: [1] gpclib_1.5-1StreamMetabolism_0.03-3 maptools_0.7-34 [4] lattice_0.18-8 sp_0.9-64 foreign_0.8-40 [7] chron_2.3-35zoo_1.6-3 vegan_1.17-3 [10] ggplot2_0.8.8 proto_0.3-8 reshape_0.8.3 [13] plyr_0.1.9 .Rprofile: #source USGS graphing function for base data source(~/R_scripts/USGS.R) source(~/R_scripts/publication_ggplot2_theme.R) source(~/R_scripts/llScript.R) #set help_type options(help_type=html) #exit to get around anoying q behavior exit- function(save=no){q(save=save)} #most used libraries library(ggplot2) library(vegan) library(StreamMetabolism) #allow gpclib package to be used gpclibPermit() __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] taking daily means from hourly data
This is one way: df- data.frame(Time=as.POSIXct(2009-01-01, format=%Y-%m-%d) + seq(0, 60*60*24*365-1, 60*60), lev.morgan=3+runif(24*365), lev.lock2=3+runif(24*365), flow=1000+rnorm(24*365, 200), direction=runif(24*365, 0, 360), velocity=runif(24*365, 0, 10)) (df2- aggregate(df[ , -1], list(date=as.Date(df$Time)), FUN=mean, na.rm=TRUE)) Hope this helps Allan On 15/07/10 05:52, Meissner, Tony (DFW) wrote: I have a data frame (morgan) of hourly river flow, river levels and wind direction and speed thus: Time hour lev.morgan lev.lock2 lev.lock1 flow direction velocity 1 2009-07-06 15:00:00 15 3.266 3.274 3.240 1710.6 180.282 4.352 2 2009-07-06 16:00:00 16 3.268 3.272 3.240 1441.8 192.338 5.496 3 2009-07-06 17:00:00 17 3.268 3.271 3.240 1300.1 202.294 2.695 4 2009-07-06 18:00:00 18 3.267 3.274 3.241 1099.1 237.161 2.035 5 2009-07-06 19:00:00 19 3.265 3.277 3.243 986.6 237.576 0.896 6 2009-07-06 20:00:00 20 3.266 3.281 3.242 1237.6 205.686 1.257 7 2009-07-06 21:00:00 21 3.267 3.280 3.242 1513.326.080 0.664 8 2009-07-06 22:00:00 22 3.267 3.281 3.242 1819.5 264.280 0.646 9 2009-07-06 23:00:00 23 3.267 3.281 3.242 1954.4 337.137 0.952 10 2009-07-07 00:00:000 3.267 3.281 3.242 1518.9 260.006 0.562 11 2009-07-07 01:00:001 3.267 3.281 3.242 1082.6 252.172 0.673 12 2009-07-07 02:00:002 3.267 3.280 3.243 1215.9 190.007 1.286 13 2009-07-07 03:00:003 3.267 3.279 3.244 1093.5 260.415 1.206 : : : : : : :: : : : : : : : :: : Time is of class POSIXct I wish to take daily means of the flow, levels, and wind parameters and put them into a new dataframe. I envisage doing this with the following example code: morgan$fTime- factor(substr(as.character(morgan$Time),1,10)) dflow- tapply(morgan[,flow], morgan$fTime, mean) day- tapply(morgan[,Time], morgan$fTime, mean) : : daily- as.data.frame(cbind(day,dflow, dlev.morg,dlev.lock2, ...)) daily$day- with(daily, as.POSIXct(1970-01-01, %Y-%m-%d, tz=Australia/Adelaide) + day) rownames(daily)- NULL Is there a more efficient way of doing this? I am running R-2.11.0 under Windows XP Tschüà Tony Meissner Principal Scientist (Monitoring) Resources Monitoring Group Science, Monitoring and Information Division Department for Water Imagine © *(ph) (08) 8595 2209 *(mob) 0401 124 971 *(fax) (08) 8595 2232 * 28 Vaughan Terrace, Berri SA 5343 PO Box 240, Berri SA 5343 DX 51103 ***The information in this e-mail may be confidential and/or legally privileged. Use or disclosure of the information by anyone other than the intended recipient is prohibited and may be unlawful. If you have received this e-mail in error, please advise by return e-mail or by telephoning +61 8 8595 2209 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply is slower than for loop?
On 12/07/10 08:16, Liviu Andronic wrote: On Fri, Jul 9, 2010 at 9:11 PM, Gene Leynesgleyne...@gmail.com wrote: [...] Check Rnews for an article discussing proper usage of apply and for. Liviu I am guessing you are referencing @article{Rnews:Ligges+Fox:2008 http://cran.r-project.org/doc/Rnews/bib/Rnews.html#Rnews:Ligges+Fox:2008, author = {Uwe Ligges and John Fox}, title = {{{R} {H}elp {D}esk}: {H}ow Can {I} Avoid This Loop or Make It Faster?}, journal = {R News}, year = 2008, volume = 8, number = 1, pages = {46--50}, month = {May}, url = {http://CRAN.R-project.org/doc/Rnews/}, pdf = {http://CRAN.R-project.org/doc/Rnews/Rnews_2008-1.pdf} } Allan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply is slower than for loop?
On 09/07/10 21:19, Duncan Murdoch wrote: On 09/07/2010 4:11 PM, Gene Leynes wrote: I thought the apply functions are faster than for loops, but my most recent test shows that apply actually takes a significantly longer than a for loop. Am I missing something? Probably not. apply() needs to figure out the shape of the results it gets from each row in order to put them into the final result matrix. You know that in advance, and set up the result to hold them, so your calculation would be more efficient. Plus, in the way it is set up, apply has to make a much larger temporary matrix. If you do ds1- 1+ds library(rbenchmark) benchmark(apply=apply(1+ds, 1, cumprod), forloop=for (i in 1:nrow(ds)){ y2[i,]-cumprod(1+ds[i,]) }, apply2=apply(ds1, 1, cumprod), replications=1, columns=c(test,elapsed,relative,user.self,sys.self), order=elapsed) # test elapsed relative user.self sys.self # 2 forloop 1.863 1.00 1.8610.000 # 3 apply2 2.175 1.167472 1.9340.239 # 1 apply 2.443 1.311326 2.1080.334 you can sense some of the impact of that. But if it is speed you are after, sapply may be even faster (you'll need to t() the result again): benchmark(forloop=for (i in 1:nrow(ds)){ y2[i,]-cumprod(1+ds[i,]) }, sapply={dsone-1+ds;sapply(1:NROW(dsone), function(i) cumprod(dsone[i,]))}, replications=1, columns=c(test,elapsed,relative,user.self,sys.self), order=elapsed) # test elapsed relative user.self sys.self # 2 sapply 1.539 1.00 1.3000.239 # 1 forloop 1.878 1.220273 1.8780.000 zz- sapply(1:NROW(ds1), function(i) cumprod(dsone[i,])) identical(t(zz), y2) # [1] TRUE Hope this helps a little. Allan The *apply functions are designed to be convenient and clear to read, not necessarily fast. Duncan Murdoch It doesn't matter much if I do column wise calculations rather than row wise ## Example of how apply is SLOWER than for loop: #rm(list=ls()) ## DEFINE VARIABLES mu=0.05 ; sigma=0.20 ; dt=.25 ; T=50 ; sims=1e5 timesteps = T/dt ## MAKE PHI AND DS phi = matrix(rnorm(timesteps*sims), nrow=sims, ncol=timesteps) ds = mu*dt + sigma * sqrt(dt) * phi ## USE APPLY TO CALCULATE ROWWISE CUMULATIVE PRODUCT system.time(y1 - apply(1+ds, 1, cumprod)) ## UNTRANSFORM Y1, BECAUSE ROW APPLY FLIPS THE MATRIX y1=t(y1) ## USE FOR LOOP TO CALCULATE ROWWISE CUMULATIVE PRODUCT y2=matrix(NA,nrow(ds),ncol(ds)) system.time( for (i in 1:nrow(ds)){ y2[i,]-cumprod(1+ds[i,]) } ) ## COMPARE RESULTS TO MAKE SURE THEY DID THE SAME THING str(y1) str(y2) all(y1==y2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Current script name from R
I'm assuming you are using Rscript (please provide self-contained examples when posting) in which case you could look for the element in (base|R.utils)::commandArgs() that begin with the string --file= - the rest is the file name. See the asValues= parameter in help(commandArgs, package=R.utils) for a nice way to get the parameter. For an invocation of the form R foo.R you'd need to inspect your system's process table (so don't do that). Hope this helps. Allan On 09/07/10 10:48, Ralf B wrote: Is there a way for a script to find out about its own name ? Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Current script name from R
On 09/07/10 12:18, Ralf B wrote: I am using RGUI, the command line or the StatET Eclipse environment. Should this not all be the same? No, there is no particular reason why they should. Allan Ralf On Fri, Jul 9, 2010 at 7:11 AM, Allan Engelhardtall...@cybaea.com wrote: I'm assuming you are using Rscript (please provide self-contained examples when posting) in which case you could look for the element in (base|R.utils)::commandArgs() that begin with the string --file= - the rest is the file name. See the asValues= parameter in help(commandArgs, package=R.utils) for a nice way to get the parameter. For an invocation of the form R foo.R you'd need to inspect your system's process table (so don't do that). Hope this helps. Allan On 09/07/10 10:48, Ralf B wrote: Is there a way for a script to find out about its own name ? Ralf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how can achive step by step execution of the script
Sounds like you want devAskNewPage(TRUE) or the related options(device.ask.default). See help(devAskNewPage, package=grDevices). Hope this helps. Allan On 09/07/10 13:54, vijaysheegi wrote: Hi R Experts, I have certain code ,i want to achive interactive execution . For ex: 1. as part of input ,it should ask file name or table name as input. 2.in script so many graphs i need to draw,it should wait till certain key is pressed . 3:i am using windows R,rscriptscriptname is not working. Please some one help me. Thanks Experts in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Why Rscript behaves strangely with file name starting with 'size'?
It looks like a bug in R, not Rscript: what Rscript does is effectively R --file=size.R and you can easily test that echo 'print(Hello world)' size.R R --file=size.R give you the error while cp size.R asize.R R --file=asize.R works as expected. None of which helps you directly, but perhaps one of the developeRs can help fix what seems to be broken options parsing in the R code. Allan On 08/07/10 04:43, thmsfuller...@gmail.com wrote: Hello All, It seems weird to me that Rscript does the following thing and enters the R prompt mode. If I change the file name to something that doesn't start with 'size', then Rscript runs normally. Does anybody know what is going on? $ Rscript size.R WARNING: '--file=size.R' value is invalid: ignored $ Rscript --version R scripting front-end version 51072 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Why Rscript behaves strangely with file name starting with 'size'?
Having a quick look at CommandLineArgs.c (yuck! :-P) and some experimentation shows that a workaround could be Rscript ./size.R or the equivalent R --file=./size.R Hope this helps a little. Allan For developers: I'm /guessing/ the problem is in CommandLineArgs.c around the line else if(strncmp(*av+7, size, 4) == 0) { which looks like a nasty hack that just happens to match --file=size.R as well as --min-vsize=N et al that it tries to detect. But seriously: there are libraries for options parsing! On 08/07/10 07:49, Allan Engelhardt wrote: It looks like a bug in R, not Rscript: what Rscript does is effectively R --file=size.R and you can easily test that echo 'print(Hello world)' size.R R --file=size.R give you the error while cp size.R asize.R R --file=asize.R works as expected. None of which helps you directly, but perhaps one of the developeRs can help fix what seems to be broken options parsing in the R code. Allan On 08/07/10 04:43, thmsfuller...@gmail.com wrote: Hello All, It seems weird to me that Rscript does the following thing and enters the R prompt mode. If I change the file name to something that doesn't start with 'size', then Rscript runs normally. Does anybody know what is going on? $ Rscript size.R WARNING: '--file=size.R' value is invalid: ignored $ Rscript --version R scripting front-end version 51072 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath vector problem; full program enclosed
On 07/07/10 06:03, Paul Johnson wrote: [...] Hi, Duncan and David Thanks for looking. I suspect from the comment you did not run the code. The expression examples I give do work fine already. But I have to explicitly put in values like 1.96 to make them work. I'm trying to avid that with substitute, which does work for b2, b3, b4, b5, all but b1. Why just one? I'm uploading a picture of it so you can see for yourself: http://pj.freefaculty.org/R/plotmathwrong.pdf please look in the middle axis. Why does only b1 not work, but the rest do? Because you only had one (as.)expression in your original code, perhaps? This version works for me: ### Filename: plotMathProblem.R ### Paul Johnson July 5, 2010 ### email mepaulj...@ku.edu ### 2010-07-06 AE : Changes. sigma- 10.0 mu- 4.0 myx- seq( mu - 3.5*sigma, mu+ 3.5*sigma, length.out=500) myDensity- dnorm(myx,mean=mu,sd=sigma) ### xpd needed to allow writing outside strict box of graph ### Need big bottom margin to add several x axes par(xpd=TRUE, ps=10, mar=c(18,2,2,2)) plot(myx, myDensity, type=l, xlab=, ylab=Probability Density , main=myTitle1, axes=FALSE) axis(2, pos= mu - 3.6*sigma) axis(1, pos=0) lines(c(myx[1],myx[length(myx)]),c(0,0)) ### closes off axes addInteriorLine- function(x, m, sd){ for (i in 1:(length(x))){ lines( c(x[i],x[i]), c(0, dnorm(x[i],m=m,sd=sd)), lty= 14, lwd=.2) } } dividers- c(qnorm(0.025), -1, 0, 1, qnorm(0.975)) addInteriorLine(mu+sigma*dividers, mu,sigma) # bquote creates an expression that text plotters can use t1- bquote( mu== .(mu)) mtext(bquote( mu == .(mu)), 1, at=mu, line=-1) addInteriorLabel- function(pos1, pos2, m, s){ area- abs(100*( pnorm(m+pos1*s,m,s)-pnorm(m+pos2*s, m,s))) mid- m+0.5*(pos1+pos2)*s text(mid, 0.5*dnorm(mid,m,s),label=paste(round(area,2),%)) } addInteriorLabel(dividers[1],dividers[2], mu, sigma) addInteriorLabel(dividers[2],dividers[3], mu, sigma) addInteriorLabel(dividers[3],dividers[4], mu, sigma) addInteriorLabel(dividers[4],dividers[5], mu, sigma) ### Following is problem point: axis will ### end up with correct labels, except for first point, ### where we end up with b1 instead of mu - 1.96*sigma. b1- substitute( mu - d*sigma, list(d=*-round(dividers[1],2))* ) b2- substitute( mu - sigma ) b3- substitute( mu ) b4- substitute( mu + sigma ) b5- substitute( mu + d*sigma, list(d=round(dividers[5],2)) ) ## plot(-20:50,-20:50,type=n,axes=F) axis(1, line=4,at=mu+dividers*sigma, labels=*as.expression(c(b1,b2,b3,b4,b5))*, padj=-1) ### This gets right result but have to hard code the dividers b1- expression( mu - 1.96*sigma ) b2- expression( mu - sigma ) b3- expression( mu ) b4- expression( mu + sigma ) b5- expression( mu + 1.96*sigma ) axis(1, line=8,at=mu+dividers*sigma, labels=c(b1,b2,b3,b4,b5), padj=-1) Allan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] grayscale wireframe??
A standalone example is always appreciated (cf. the posting guide) but try and see if help(gray.colors, package=grDevices) is the sort of thing you are looking for. Hope this helps Allan On 06/07/10 23:30, Marlin Keith Cox wrote: I need grayscale formatting for a wireframe. The only col.regions that I can find are color palettes are all colored: rainbow(n, s = 1, v = 1, start = 0, end = max(1,n - 1)/n, gamma = 1, alpha = 1) heat.colors(n, alpha = 1) terrain.colors(n, alpha = 1) topo.colors(n, alpha = 1) cm.colors(n, alpha = 1) The code follows: X11() library(lattice) par(family=serif, cex=1.2) wireframe(z ~ y*x, mat.df, drape = TRUE,col.regions=rainbow(100), zlab = list(Water mass error (%),rot=90), zlim=c(-50,180), xlab = list(Resistance error (%),rot=-9), ylab = list(Length error (%),rot=38), scales = list(arrows = FALSE), screen = list(z = -35, x = -77, y = 10)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath vector problem; full program enclosed
Ooops, I didn't convert this one to text right for the list. b1- substitute( mu - d*sigma, list(d=*-round(dividers[1],2))* ) should be b1- substitute( mu - d*sigma, list(d=-round(dividers[1],2)) ) and similarly for labels=*as.expression(c(b1,b2,b3,b4,b5))*, padj=-1) read labels=as.expression(c(b1,b2,b3,b4,b5)), padj=-1) Apologies Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Visualization of coefficients
Thanks Tal. Nice summary on the web page. I think the last example would be even better if it was a stand-alone piece of code (i.e.: with data) that we could run. For example library(arm) data(Mroz, package = car) M1- glm(lfp ~ ., data = Mroz, family = binomial) M2- bayesglm(lfp ~ ., data = Mroz, family = binomial) M3- glm(lfp ~ ., data = Mroz, family = binomial(probit)) coefplot(M2, xlim=c(-2, 6),intercept=TRUE) coefplot(M1, add=TRUE, col.pts=red, intercept=TRUE) coefplot(M3, add=TRUE, col.pts=blue, intercept=TRUE, offset=0.2) Allan On 07/07/10 09:16, Tal Galili wrote: Hello David, Thanks to your posting I started looking at the function in the arm package. It appears this function is quite mature, and offers (for example) the ability to easily overlap coefficients from several models. I updated the post I published on the subject, so at the end of it I give an example of comparing the coef of several models: http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/ Thanks again for the pointer. Best, Tal [...] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wavelet
Try RSiteSearch(MORLET) before you post. Allan On 07/07/10 09:38, nuncio m wrote: Hi useRs, Is it possible to get MORLET wavelet in R Thanks nuncio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath vector problem; full program enclosed
On 07/07/10 18:52, Paul Johnson wrote: [...] 1: axis(1, line=6, at=mu+dividers*sigma, labels=as.expression(c(b1,b2,b3,b4,b5), padj=-1)) 2: axis(1, line=9, at=mu+dividers*sigma, labels=c(as.expression(b1),b2,b3,b4,b5), padj=-1) This second one shouldn't work, I think. It has as.expression on only the first element, and yet they all come out right. Is there a spill over effect? You can call it spill-over if you want: c() makes a *vector*, not a list, so all elements have to be of the same type and they are all coerced to the more general type (the first). It is similar to c(1, 2, 3) [1] 1 2 3 which is different from c(1, 2, 3). Hope this helps Allan PS: BTW, I think the () are wrong in example 1? :-) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Selection with changing number of columns
I'm not sure your question completely makes sense, but perhaps this will help: set.seed(1) d- data.frame(matrix(floor(runif(100, max=10)), 10)) # Example data d[apply(d == 9, 1, any), ] # Select rows with 9 in any column ## Or more generally: d[ , c(1, 2, 3)] == c(2, 2, 9) # Or maybe d == 0:9 to select on all columns apply(d[ , c(1, 2, 3)] == c(2, 2, 9), 1, any) # any() being the general `|` function here d[apply(d[ ,c(1, 2, 3)] == c(2, 2, 9), 1, any), ] # Finally: the rows we were looking for A bit over-engineered, perhaps, but gets you to use some generally useful functions. Hope this helps. Allan On 06/07/10 09:33, Kunzler, Andreas wrote: Dear list, I'm looking for a way to select rows of a data.frame with changing number of columns (constraint) involved. Assume a data (d) structure like Var.1 Var.2 Var.3 9 2 1 2 9 5 1 2 1 I know the number of involved columns. Is there a way to generate the following selection automatically (maybe for loop), so that it makes no difference if there are two or ten columns involved. Selection: d[d$Var.1==9 | d$Var.1==9 | d$Var.1==9 ,] Does anybody know a way? Thanks Mit freundlichen Grüßen Andreas Kunzler Bundeszahnärztekammer (BZÄK) Chausseestraße 13 10115 Berlin Tel.: 030 40005-113 Fax: 030 40005-119 E-Mail: a.kunz...@bzaek.de __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pseudo F statistics with index.G1
Always use set.seed in your examples. Running your code after set.seed(1) works fine but gives the error after set.seed(2). The problem in the latter case being that there is only one value 7 in C and you need two or more for the index.G1 code to make sense. Hope this helps a little Allan Example: set.seed(1) mat- diag(1,27,27) f- runif(sum(26:1),0,1) mat[lower.tri(mat)]- f mat- t(mat) mat[lower.tri(mat)]- f # Cluster with Agnes A- agnes(mat,diss=T,method=average) C- cutree(A,k=7) # Value of k = the number of clusters F- index.G1(mat,C) table(C) C 1 2 3 4 5 6 7 6 3 5 2 3 4 4 set.seed(2) mat- diag(1,27,27) f- runif(sum(26:1),0,1) mat[lower.tri(mat)]- f mat- t(mat) mat[lower.tri(mat)]- f # Cluster with Agnes A- agnes(mat,diss=T,method=average) C- cutree(A,k=7) # Value of k = the number of clusters F- index.G1(mat,C) Error in apply(x[cl == i, ], 2, mean) : dim(X) must have a positive length table(C) C 1 2 3 4 5 6 7 8 4 5 3 4 2 1 On 06/07/10 09:32, Kennedy wrote: Hello, I have done some clustering with Agnes and want to calculate the pseudo F statistics using index.G1. It works for a low number of clusters but when i increase the number of clusters i eventually get the following message: Error in apply(x[cl == i, ], 2, mean) : dim(X) must have a positive length The following code produces an example comparable to the data i am clustering: library(cluster) library(ade4) library(R2HTML) library(e1071) library(class) library(rgl) library(MASS) library(clusterSim) # Create a symmetric matrix with ones on the diagonal mat- diag(1,27,27) f- runif(sum(26:1),0,1) mat[lower.tri(mat)]- f mat- t(mat) mat[lower.tri(mat)]- f # Cluster with Agnes A- agnes(mat,diss=T,method=average) C- cutree(A,k=7) # Value of k = the number of clusters F- index.G1(mat,C) The code above works for k=2:6 but then the error message appears. Sincerely Henrik __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pseudo F statistics with index.G1
I guess that in R you have to be explicit about what you want to do. You can't just drop them, so you'll have to assign them some (other) value. Try which(table(C)==1) to give you the values you need to change and then decide what to change them to. The SAS documentation may tell you what it does; otherwise, I'd suggest you look at the source, but I guess that is not an option for you. Welcome to R. Allan On 06/07/10 15:48, Henrik Aldberg wrote: Thank you Allan, As i understand it the index.G1 function does not work if one of the clusters in the partition only contains one object. Is there a way to get around this in R? In SAS the PSF function seems to ignore the presence of singleton clusters. Sincerely Henrik __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmath vector problem; full program enclosed
On 06/07/10 18:51, David Winsemius wrote: Easily addressed in this case with ~ instead of -. The value of d provides the minus: b1 - substitute( mu ~ d*sigma, list(d=round(dividers[1],2)) ) Neat trick! But it gives a slightly different minus sign in the display, so perhaps simply b1- substitute( mu - d*sigma, list(d=-round(dividers[1],2)) ) instead? Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] conditional dataframe search and find
ct.df[ct.df$conc 25,time] [1] 10 11 12 13 14 15 ct.df[ct.df$conc 25,time][1] df[df$conc 25,time][1] [1] 10 See also help(order) if conc is not ordered. On 02/07/10 22:50, oscar linares wrote: time conc 1 0 164.495456 2 1 133.671185 3 2 108.622975 4 3 88.268468 5 4 71.728126 6 5 58.287225 7 6 47.364971 8 7 38.489403 9 8 31.276998 109 25.416103 11 10 20.653462 12 11 16.783276 13 12 13.638313 14 13 11.082674 15 14 9.005927 16 15 7.318336 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems loading the twitteR package
If you are on a Unix machine try your package manager first to see if it has the packages you need. Otherwise, try install.package(twitteR, dependencies = TRUE) (or RCurl) and see if that helps. Allan On 01/07/10 22:51, lyolya wrote: Dear all, I cannot load the twitteR package. When I have it installed and try to load it, I get the following response from the system: library(twitteR) Loading required package: RCurl Error: package 'RCurl' could not be loaded Además: Mensajes de aviso perdidos In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) : there is no package called 'RCurl' and when I further manually install and load the RCurl package, I get this response: ERROR: dependencies ‘bitops’ are not available for package ‘RCurl’ * removing ‘/Users/lyolya/RCurl’ Would be happy with any piece of advice, Thank you. Lyolya. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Visualization of coefficients
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=114 should get you started. On 02/07/10 07:10, Wincent wrote: Dear all, I try to show a subset of coefficients in my presentation. It seems that a standard table is not a good way to go. I found figure 9 (page 9) in this file ( http://www.destatis.de/jetspeed/portal/cms/Sites/destatis/Internet/DE/Content/Wissenschaftsforum/Kolloquien/VisualisierungModellierung__Beitrag,property=file.pdf ) looks pretty good. I wonder if there is any function for such plot? Or any suggestion on how to present statistical models in a presentation? Thank you. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with aggregating data across time points
On 02/07/10 16:21, Chris Beeley wrote: Hello- I have a dataset which basically looks like this: Location Sex Date Time VerbalSelf harm Violence_objects Violence A 1 1-4-2007 1800 3 0 1 3 A 1 1-4-2007 1230 21 2 4 D 2 2-4-2007 1100 04 0 0 ... I've put a dput of the first section of the data at the end of this email. [...] What I want to do is: A) sum each of the dependent variables for each of the dates (so e.g. in the example above for 1-4-2007 it would be 3+2=5, 0+1=1, 1+2=3, and 3+4=7 for each of the variables) If 'data' is the data at the end of your email, then aggregate(cbind(verbal,self.harm,violence_objects,violence) ~ Date, data = data, FUN = sum) Date verbal self.harm violence_objects violence 1 01/04/07 251539 2 02/04/07 24 68 13 3 03/04/07 17130 10 is one approach. Read help(aggregate) and don't forget the na.action= argument. B) do this sum, but only in each location this time (location is the first variable)- so the sum for 1-4-2007 in location A, sum for 1-4-2007 in location B, and so on and so on. Because this is divided The basic approach could be aggregate(cbind(verbal,self.harm,violence_objects,violence) ~ Date + Location, data = data, FUN = sum) Date Location verbal self.harm violence_objects violence 1 01/04/07A 7 103 2 02/04/07A 8 201 3 03/04/07A 0 002 4 01/04/07B 3 201 5 02/04/07B 4 200 6 03/04/07B 4 003 7 01/04/07C 4 232 8 02/04/07C 0 042 9 03/04/07C 1 105 10 01/04/07D 7 603 11 02/04/07D 0 009 12 03/04/07D 41100 13 01/04/07E 4 300 14 02/04/07E 4 040 15 03/04/07E 8 100 16 01/04/07F 0 100 17 02/04/07F 8 201 across locations, some dates will have no data going into them and will return 0 sums. Crucially I still want these dates to appear- so e.g. 21-5-2008 would appear as 0 0 0 0, then 22-5-2008 might have 1 2 0 0, then 23-5-2008 0 0 0 0 again, and etc. Why? But variations on data2- data[!(as.numeric(data$Date)==3 data$Location==B),] # For example z- with(data2, tapply(verbal, list(Date,Location), FUN=sum)) z[is.na(z)]- 0 print(z) A B C D E F 0 0 0 0 0 0 0 01/04/07 0 7 3 4 7 4 0 02/04/07 0 8 0 0 0 4 8 03/04/07 0 0 4 1 4 8 0 will perhaps work for you. Hope this helps Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot map Brazil and documentation
On 01/07/10 05:43, Pablo Cerdeira wrote: [...] *My doubt is: what it means with see the package index? It is cryptic statistician speak for help(package = maps) Should I find a Brazilian map on the index?* No. * *If not, can someone help me to find some brazilian map (with states)?* The basic map is map('world', 'brazil') For the states you'll have to find a map to download or buy. Try http://www.inde.gov.br/ or Google. *Also, can someone suggest me some good site with documentation about plotting graphics in R? I'd like to read more about this amazing tool.* The site http://addictedtor.free.fr/graphiques/ is often mentioned for graphis. For more, try the links from www.r-project.org http://cran.r-project.org/manuals.html http://www.r-project.org/doc/bib/R-books.html http://www.r-project.org/other-docs.html Hope this helps Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave function
In general, use cat (see help(cat)) for printing your output, e.g. echo=FALSE= cat(levels(mydata$Firms)[mydata$Firms], \n) cat(mydata$Year, \n) # etc. @ For your specific case you may also be interested in help(xtable, package=xtable) along with the ...,results=tex= Sweave construct. For an R version of the output you want, a starting point may be for (f in levels(mydata$Firms)) { cat(f, \n) print(t(mydata[mydata$Firms == f, 2:5])) } which gives electrolux 123 Year 1995 1996 1997 IIA 100 340 35 IIB45 67 99 IIC65 97 31 fiat 456 Year 1995 1996 1997 IIA76 567 453 IIB34 35 89 IIC45 66 37 Hope this helps a little. Allan On 01/07/10 09:08, n.via...@libero.it wrote: Dear list, I have a question about the interaction between R code and Latex language trough the Sweave function in the package utils. What I'm trying to do is to write a report. Contrary to the examples shown in the Sweave Manual in which table already constructed by R are exported on Latex files, what I would like to do is to build a table in which I combine text and specific columns of my data frame. I will give you the following example to be much more clear. Suppose I have a data frame like this: FirmsYear IIA IIB IIC electrolux1995100 45 65 electrolux1996340 67 97 electrolux199735 99 31 fiat1995 76 34 45 fiat1996567 35 66 fiat1997453 89 37 Where IIA is the turnover of the firm, IIB is the production and IIC is the cost of labour. I would like to get a table like this in the latex format this: Firms 1 electrolux --- variables 1995 19971997 -- turnover 100 34035 production 4567 99 cost of labour 6597 31 I use the following code: \documentclass[a4paper]{article} \title{example} \begin{document} \maketitle echo=F= mydata$firms @ variables echo=F= mydata$year @ and so on. I have two problem, first I'm not able to put on the same line text and output of R. So on my Latex document I get for example variable 1995 1996 1997 and I don't want this. The secondo problem is that at the beginnig of the R output I get the index, namely variable [1] 1995 1996 1997 and I don't want to see it. Anyone Knows how to do it or if there is another package in R that give me the possibility to create a Report by constructing table without any problems??? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Export data frame of high dimension to txt
On 01/07/10 08:50, Loukia Spin. wrote: I managed with success to export a data frame of 367 columns and 37 rows to a txt file, but unfortunately I couldn't manage the same with the transposed data frame. Specifically, it seems like the notebook cannot read correctly 367 columns, so it reads the first 145 columns and it folds down the rest of them and consequently it folds down the respective data. f=format.data.frame(rev,trim=F,digits=NULL,nsmall=3, ## rev is the data frame of 37 rows and 367 columns ## justify=centre,zero.print=F,drop0trailing=T);f I am not sure why you want to call format.data.frame first? A simple write.table(rev, ...) should work fine. As it stands, write.table will convert f back to a data.frame before writing write.table(f,file=C:/Program Files/R/R-2.11.0/genes as rows.txt, row.names=T,sep=\t,quote=F,dec=.,col.names=NA,append=T) Careful with that append=TRUE. Try it with FALSE (the default) to make sure the file really is empty. It might simply be text at the beginning that trips up the subsequent read.table (which I am guessing from your description is where the problem occurs). Hope this helps a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Stacked Restricted Boltzmann Machine
I don't know of a native implementation, but you could probably use MDP along with RPy or R/S Plus as suggested here: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628591/ Allan. On 01/07/10 09:01, Matthew OKane wrote: Hi, Are there any implementations of stacked RBMs either complete or planned in R? Thanks, Matthew [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.