Re: [R] as.matrix does not turn data frame into character matrix
At least in R 1.8.0 it is a list matrix:

    > class(jank)
    [1] "matrix"
    > typeof(jank)
    [1] "list"

The help page is not quite correct, as it does not mention what happens when you create a data frame with a *list* as a column.

How did you create this data frame? Columns cortrange and logcortrange are surely intended to be numeric columns, and it is a pretty moot point whether it is a valid data frame (it should not be possible to create it with data.frame, for example, and do.call(data.frame, unclass(junk)) fails).

Please note that R 1.8.0 is current: we are nowhere near version `7.0'. A lot of error-checking/correction in the data.frame area was added in 1.8.0.

On Thu, 16 Oct 2003, Jacob Wegelin wrote:

> The as.matrix function behaves in a puzzling manner. The help file says:
>
>     `as.matrix' is a generic function. The method for data frames will
>     convert any non-numeric column into a character vector using
>     `format' and so return a character matrix.
>
> But this does not appear to be the case in the following example.
> Instead, as.matrix turns a data.frame into a list, not a character
> matrix, which wreaks havoc with my old code.
> junk <- structure(list(
>     SUBNUM = structure(c(3, 4, 5, 7, 6), class = "factor",
>         .Label = c("01", "02", "03", "04", "07", "08", "09", "10", "11",
>                    "12", "13", "16", "17", "18", "21", "22", "23", "24",
>                    "25", "26", "27", "28")),
>     AGE = c(7, 7, 10, 8, 5),
>     DIAGNOSI = c(1, 1, 1, 1, 1),
>     cortrange = structure(list("03" = 19.0674, "04" = 40.3009,
>         "07" = 37.0205, "09" = 8.84131, "08" = 10.9855),
>         .Names = c("03", "04", "07", "09", "08")),
>     logcortrange = structure(list("03" = 1.90097866386896,
>         "04" = 2.75040785570225, "07" = 3.15633470025647,
>         "09" = 2.56744094585387, "08" = 2.84160608522206),
>         .Names = c("03", "04", "07", "09", "08"))),
>     .Names = c("SUBNUM", "AGE", "DIAGNOSI", "cortrange", "logcortrange"),
>     row.names = c("03", "04", "07", "09", "08"), class = "data.frame")
>
> junk
>    SUBNUM AGE DIAGNOSI cortrange logcortrange
> 03     03   7        1   19.0674     1.900979
> 04     04   7        1   40.3009     2.750408
> 07     07  10        1   37.0205     3.156335
> 09     09   8        1   8.84131     2.567441
> 08     08   5        1   10.9855     2.841606
>
> jank <- as.matrix(junk)
> jank
>    SUBNUM AGE DIAGNOSI cortrange logcortrange
> 03 03     7   1        19.0674   1.900979
> 04 04     7   1        40.3009   2.750408
> 07 07     10  1        37.0205   3.156335
> 09 09     8   1        8.84131   2.567441
> 08 08     5   1        10.9855   2.841606
>
> Notice that the first column is character, whereas the other columns
> are plain numeric! This is *not* a matrix of character.
>
> dput(jank)
> structure(list("03", "04", "07", "09", "08", 7, 7, 10, 8, 5,
>     1, 1, 1, 1, 1, 19.0674, 40.3009, 37.0205, 8.84131, 10.9855,
>     1.90097866386896, 2.75040785570225, 3.15633470025647,
>     2.56744094585387, 2.84160608522206), .Dim = c(5, 5),
>     .Dimnames = list(c("03", "04", "07", "09", "08"),
>         c("SUBNUM", "AGE", "DIAGNOSI", "cortrange", "logcortrange")))
>
> Is this a bug? One result of this:
>
> cbind(dimnames(jank)[[1]], jank)
> Error in cbind(...) : cannot create a matrix from these types
>
> (I'm using version 7.0, because the links for downloading version 7.1
> are dead today:)
>
> version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    1
> minor    7.0
> year     2003
> month    04
> day      16
> language R
>
> Thanks for any information.
>
> Jake

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

__ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] don't display rulers in image() command and script file input
On Sat, 18 Oct 2003, Ernie Adorio wrote:

> Dear R experts,
>
> 1. How can I turn off the display of rulers in image() command?

Set axes=FALSE in image(). In this case - as in many others - running example() on the function (here example(image)) does point in the right direction. The function help files, and especially their examples, are places where much wisdom is to be found!

> 2. Rather than typing my commands at the command line, how can I input
> a file contents aside from doing a copy and paste operation?

source() the file. On Windows, there is a menu item "Source R code" under File, but sourcing from the command line lets you change arguments. On Windows you may find

    source(file.choose(), echo=TRUE)

useful - but perhaps turn off buffered output in the Misc menu first - otherwise the console stays blank until the whole sourced file is completed. On other systems console output is not usually buffered.

Roger Bivand

> Thanks in advance,
>
> Ernesto Adorio
> Math Department
> University of the Philippines Diliman

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]
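The two answers above can be tried together in a short session. This is only a minimal sketch: the built-in volcano matrix and the temporary script (with its variable name total) are stand-ins chosen for illustration, not anything from the original question.

```r
# Draw an image without the axis "rulers"; volcano is a built-in matrix.
image(volcano, axes = FALSE)
box()  # optional: draw a frame around the plot region

# Write a tiny script to a temporary file, then run it with source().
script <- tempfile(fileext = ".R")
writeLines(c("total <- sum(1:10)", "print(total)"), script)
source(script, echo = TRUE)  # echo = TRUE shows each command as it is run
```

With echo=TRUE each command is printed before its result, which is close to what you would see when pasting the commands at the prompt.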
Re: [R] help with legend()
>>>>> "PaulSch" == Schwarz, Paul [EMAIL PROTECTED]
>>>>>     on Wed, 15 Oct 2003 12:09:11 -0700 writes:

    PaulSch> I am converting some S-PLUS scripts that I use for
    PaulSch> creating manuscript figures to R so that I can take
    PaulSch> advantage of the plotmath capabilities. In my
    PaulSch> S-PLUS scripts I like to use the key() function for
    PaulSch> adding legends to plots,

AFAIK key() in S+ is from the trellis library section. The corresponding R package, lattice, has a draw.key() function that may work similarly to S-PLUS' key() {Deepayan ?}.

    PaulSch> and I have a couple of questions regarding using the
    PaulSch> legend() function in R.

    PaulSch> 1) is there a way to specify different colors for
    PaulSch> the legend vector of text values?

not yet in legend() -- but see below

    PaulSch> 2) is there a way to reverse the order of the
    PaulSch> legend items so that the text values precede the
    PaulSch> symbols?

not yet in legend() --- but it's an open source project living from community support ...

Can S+ key() do these two things? If yes, how do you specify it there? {this sounds as if I was willing to consider adding these wished features to legend}

    PaulSch> Thanks for your time and patience.

You're welcome,
Martin
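For readers on a current R rather than the 2003 release discussed above: legend() now has a text.col argument, so the first wish can be sketched as follows. The plotted lines and labels are made up for illustration, and this does not address the second wish (text before symbols).

```r
# Two illustrative lines on an empty plot.
plot(1:10, type = "n", xlab = "", ylab = "")
lines(1:10, col = "blue")
lines(10:1, col = "red", lty = 2)

# text.col is recycled over the legend entries, one colour per label.
lg <- legend("top", legend = c("rising", "falling"),
             col = c("blue", "red"), lty = 1:2,
             text.col = c("blue", "red"))
```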
Re: [R] indexing a particular element in a list of vectors
Richard A. O'Keefe [EMAIL PROTECTED] writes:

> Scott Norton [EMAIL PROTECTED] wrote:
>     I have a list of character vectors. I'm trying to see if there is
>     a way (in a single line, without a loop) to pull out the first
>     element of all the vectors contained in the list.
>
> You have a list. You want to do something to each element.
> See ?lapply
>
>     u <- c("Fee","fie","foe","fum")
>     v <- c("Ping","pong","diplomacy")
>     w <- c("Hi","fi")
>     x <- list(a=u, b=v, c=w)
>     lapply(x, function (cv) cv[1])
>     ...
>
> If you want the result as a character vector, see ?sapply
>
>     sapply(x, function (cv) cv[1])
>          a      b      c
>      "Fee" "Ping"   "Hi"

Or even

    sapply(x, "[", 1)
         a      b      c
     "Fee" "Ping"   "Hi"

(same thing with lapply)

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
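The same list, run end to end, with a vapply() variant added as a type-checked alternative. vapply() is not mentioned in the thread and is only available in later R versions; the names first and first_checked are illustrative.

```r
x <- list(a = c("Fee", "fie", "foe", "fum"),
          b = c("Ping", "pong", "diplomacy"),
          c = c("Hi", "fi"))

# "[" is an ordinary function, so it can be applied to every element.
first <- sapply(x, "[", 1)

# vapply() does the same but insists that each result is one string.
first_checked <- vapply(x, function(cv) cv[1], character(1))

first  # named character vector: a = "Fee", b = "Ping", c = "Hi"
```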
[R] plot with dates on x axis, how to fix the number of days between tick marks ?
Hi,

The following plot displays fine (starting around 10 September), except that the xaxp parameter has no effect. I'd like to have a tick mark every 7 days...

    plot(timeline, subset(myd, TYPE=="A")$list1,
         ylim=c(100*floor(min(subset(myd, TYPE=="A")$list1)/100-1),
                100*ceiling(max(subset(myd, TYPE=="A")$list1)/100+1)),
         xlim=c(106350, Sys.time()),
         xaxp=c(106350, Sys.time(), 7*24*3600),
         type="o", col="blue", ylab="")

Thanks for your hints.

Marc Mamin
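One possible approach, sketched here with made-up data: suppress the default x axis and draw the ticks yourself at weekly positions. Note that the third element of xaxp is documented as the number of intervals, not a step width, which may be part of why the call above did not behave as hoped. The data, the UTC time zone and the label format are all assumptions for illustration.

```r
# Hypothetical data: one observation per day for six weeks.
set.seed(1)
start    <- as.POSIXct("2003-09-10", tz = "UTC")
timeline <- seq(start, by = "day", length.out = 42)
values   <- cumsum(rnorm(42))

# Tick positions exactly 7 days apart, covering the data range.
ticks <- seq(min(timeline), max(timeline), by = "week")

# xaxt = "n" suppresses the default axis; axis.POSIXct() draws ours.
plot(timeline, values, type = "o", col = "blue", ylab = "", xaxt = "n")
axis.POSIXct(1, at = ticks, format = "%d %b")
```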
[R] Problems with crossprod
Dear R-users,

I found a strange problem working with products of two matrices, say:

    a <- A[i, ] ; crossprod(a)

where i is a set of integers selecting rows. When i is empty the result is in a sense random. After some trials the right answer (a matrix of zeros) appears.

--- Illustration

R : Copyright 2003, The R Development Core Team
Version 1.8.0 (2003-10-08)

> A <- matrix(0, 5, 5)
> i <- c()
> a <- A[i, ] ; crossprod(a)
              [,1]          [,2]          [,3]          [,4] [,5]
[1,] 6.578187e-313           NaN           NaN           NaN  NaN
[2,]           NaN 1.273197e-313           NaN 1.485397e-313  NaN
[3,]           NaN 4.243992e-313 2.121996e-314           NaN  NaN
[4,]           NaN 1.697597e-313           NaN 4.880590e-313  NaN
[5,] 5.941588e-313           NaN           NaN 1.697597e-313  NaN
> a <- A[i, ] ; crossprod(a)
              [,1]          [,2]          [,3]          [,4]          [,5]
[1,] 2.121996e-314 5.729389e-313           NaN           NaN           NaN
[2,]           NaN           NaN           NaN           NaN 1.909796e-313
[3,] 2.970794e-313           NaN           NaN           NaN           NaN
[4,]           NaN           NaN           NaN 8.487983e-314           NaN
[5,]           NaN 6.365987e-313 2.546395e-313           NaN           NaN
> a <- A[i, ] ; crossprod(a)
              [,1]          [,2] [,3]          [,4]          [,5]
[1,]           NaN 1.485397e-313  NaN           NaN 2.970794e-313
[2,] 3.182994e-313           NaN  NaN 1.060998e-313           NaN
[3,]           NaN           NaN  NaN 1.697597e-313 2.737375e-312
[4,]           NaN           NaN  NaN           NaN  2.048394e+10
[5,]           NaN           NaN  NaN           NaN 2.970794e-313
> a <- A[i, ] ; crossprod(a)
              [,1]        [,2] [,3] [,4] [,5]
[1,] 1.591383e-266 20489834629    0    0    0
[2,] 5.031994e-266           0    0    0    0
[3,] 1.591205e-266           0    0    0    0
[4,] 1.264128e-267           0    0    0    0
[5,] 1.037656e-311           0    0    0    0
> a <- A[i, ] ; crossprod(a)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

--- End of illustration

The same problem does not appear using the matrix product:

> a <- A[i, ] ; t(a) %*% a
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

Note that Splus 6 returns an error message:

> a <- A[i, ] ; crossprod(a)
Problem in .Fortran.ok.Internal(if(cmplx) zcrossp1..: subroutine dcrossp1: Argument 1 has zero length

Thank you,

Giovanni
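Until the underlying behaviour is fixed, a defensive wrapper is one way to get a predictable answer for an empty row selection. This is only a sketch of a workaround, not a statement about how crossprod() is implemented; the helper name safe_crossprod is hypothetical.

```r
A <- matrix(0, 5, 5)
i <- integer(0)              # empty row selection
a <- A[i, , drop = FALSE]    # a 0 x 5 matrix

# Hypothetical helper: return an explicit matrix of zeros when there
# are no rows, otherwise defer to crossprod().
safe_crossprod <- function(m) {
  if (nrow(m) == 0L) {
    matrix(0, ncol(m), ncol(m))
  } else {
    crossprod(m)
  }
}

safe_crossprod(a)            # 5 x 5 matrix of zeros
```

For non-empty input the helper simply returns whatever crossprod() returns, so it can be dropped in without changing other results.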
Re: [R] R memory and CPU requirements
I agree completely. In fact, I have about 5000 observations, which should be enough. I was using 200 samples because of RAM limitations and I'm afraid to think about what amount of RAM I'll need to fit an aov() for such data.

--- John Fox [EMAIL PROTECTED] wrote:
> Dear Alexander,
>
> If I understand you correctly, you have a sample of 200 observations.
> Even if you had only two factors with 40 levels each, the main effects
> and interactions of these factors would require about 1600 degrees of
> freedom -- that is, more than the number of observations. This doesn't
> make a whole lot of sense.
>
> I hope that this helps,
> John
>
> At 05:03 PM 10/16/2003 -0700, Alexander Sirotkin [at Yahoo] wrote:
> > --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
> > > On Thursday 16 October 2003 17:59, Alexander Sirotkin [at Yahoo] wrote:
> > > > Thanks for all the help on my previous questions. One more
> > > > (hopefully last one): I've been very surprised when I tried to
> > > > fit a model (using aov()) for a sample of size 200 and 10
> > > > variables and their interactions.
> > >
> > > That doesn't really say much. How many of these variables are
> > > factors? How many levels do they have? And what is the order of
> > > the interaction? (Note that for 10 numeric variables, if you allow
> > > all interactions, then there will be a 100 terms in your model.
> > > This increases for factors.) In other words, how big is your model
> > > matrix? (See ?model.matrix)
> > >
> > > Deepayan
> >
> > I see... Unfortunately, model.matrix() ran out of memory :) I have 10
> > variables, 6 of which are factor, 2 of which have quite a lot of
> > levels (about 40). And I would like to allow all interactions. I
> > understand your point about categorical variables, but still - this
> > does not seem like too much data to me. I remember fitting all kinds
> > of models (mostly decision trees) for much, much larger data sets.
>
> -----------------------------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada L8S 4M4
> email: [EMAIL PROTECTED]
> phone: 905-525-9140x23604
> web: www.socsci.mcmaster.ca/jfox
> -----------------------------------------------------
Re: [R] R memory and CPU requirements
--- Deepayan Sarkar [EMAIL PROTECTED] wrote:
> On Thursday 16 October 2003 19:03, Alexander Sirotkin [at Yahoo] wrote:
> > > > Thanks for all the help on my previous questions. One more
> > > > (hopefully last one): I've been very surprised when I tried to
> > > > fit a model (using aov()) for a sample of size 200 and 10
> > > > variables and their interactions.
> > >
> > > That doesn't really say much. How many of these variables are
> > > factors? How many levels do they have? And what is the order of
> > > the interaction? (Note that for 10 numeric variables, if you allow
> > > all interactions, then there will be a 100 terms in your model.
> > > This increases for factors.) In other words, how big is your model
> > > matrix? (See ?model.matrix)
> > >
> > > Deepayan
> >
> > I see... Unfortunately, model.matrix() ran out of memory :) I have 10
> > variables, 6 of which are factor, 2 of which have quite a lot of
> > levels (about 40). And I would like to allow all interactions. I
> > understand your point about categorical variables, but still - this
> > does not seem like too much data to me.
>
> That's one way to look at it. You don't have enough data for the model
> you are trying to fit. The usual approach under these circumstances is
> to try 'simpler' models.
>
> Please try to understand what you are trying to do (in this case by
> reading an introductory linear model text) before blindly applying a
> methodology.
>
> Deepayan

I did study ANOVA and I do have enough observations. 200 was only a random sample of more than 5000, which I think should be enough. However, I'm afraid to even think about the amount of RAM I will need with R to fit a model for this data.
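To make the "how big is your model matrix?" question concrete, here is a small sketch on made-up inputs. It uses two hypothetical 40-level factors (the case John Fox describes) and, separately, counts the terms when every order of interaction among 10 numeric predictors is allowed. The variable names are illustrative only.

```r
# Two hypothetical factors with 40 levels each, fully crossed.
grid <- expand.grid(f1 = factor(1:40), f2 = factor(1:40))
X <- model.matrix(~ f1 * f2, data = grid)
dim(X)   # 1600 columns: 1 intercept + 39 + 39 main effects + 39 * 39

# Number of terms with all orders of interaction among 10 predictors.
rhs <- paste(paste0("x", 1:10), collapse = " + ")
f <- as.formula(paste("y ~ (", rhs, ")^10"))
n_terms <- length(attr(terms(f), "term.labels"))
n_terms  # 2^10 - 1 = 1023
```

Each additional factor in an interaction multiplies the column count by roughly its number of levels, which is why model.matrix() can exhaust memory long before the raw data look large.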
[R] sort characters in W2K and NT
Hello.

I have a problem using sort() in Windows 2000 and Windows NT 4.0, running R 1.8.0 on both. I want to sort a vector of character names, where I have used Scandinavian letters, like 'Æ', 'Ø', and 'Å' (for those who cannot display these letters this question seems rather meaningless, I guess). Windows 2000 sorts the vector like I am used to from other software, with 'Å' as the last letter in the alphabet, while Windows NT has Å just after A, and Ø following O. Is there a way to solve this problem (other than replacing the Scandinavian letters)?

A short example:

> sort(c('a','p','å'))
# on Windows 2000:
[1] "a" "p" "å"
# on Windows NT:
[1] "a" "å" "p"

Thanks in advance

Ivar Herfindal

On Windows 2000:
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    8.0
year     2003
month    10
day      08
language R

On Windows NT:
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    8.0
year     2003
month    10
day      08
language R
[R] Query: colouring graph
Hi!

How can I fill a portion of a graph with colours (e.g. I want to fill in red the area within two confidence intervals)?

Thank you very much

Cristian

--
Cristian Pattaro
Unit of Epidemiology Medical Statistics
University of Verona
Tel +39 045 8027668
fax +39 045 505357
[EMAIL PROTECTED]
Biometria - biometria.univr.it
Re: [R] Problems with crossprod
I still have R 1.7.1, and the problem appears there as well:

> a <- A[i, ] ; crossprod(a)
              [,1]          [,2]          [,3]          [,4]          [,5]
[1,] 1.195616e-301 7.042305e-302 9.563047e-302 2.281448e-302 2.198017e-302
[2,] 6.905419e-302 1.204915e-301 3.382433e-302 2.398701e-302 2.358828e-302
[3,] 1.194968e-301 7.039991e-302 1.384628e-302 2.446584e-302 4.507199e-302
[4,] 7.046400e-302 1.204416e-301 2.444003e-302 2.357136e-302 2.446228e-302
[5,] 1.205363e-301 1.204367e-301 2.445605e-302 3.979963e-302 7.861951e-302
> a <- A[i, ] ; crossprod(a)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0
> a <- A[i, ] ; crossprod(a)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0
> a <- A[i, ] ; crossprod(a)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0
> a <- A[i, ] ; crossprod(a)
     [,1] [,2]          [,3]          [,4] [,5]
[1,]    0    0 5.092790e-313 4.372522e-111    0
[2,]    0    0 5.708624e-307  0.000000e+00    0
[3,]  NaN    0 4.516565e-300  0.000000e+00    0
[4,]    0    0 8.997262e-312  0.000000e+00    0
[5,]    0    0 1.086462e-311  0.000000e+00    0

hope this helps.
spencer graves

Giovanni Marchetti wrote:

> Dear R-users,
>
> I found a strange problem working with products of two matrices, say:
>
>     a <- A[i, ] ; crossprod(a)
>
> where i is a set of integers selecting rows. When i is empty the
> result is in a sense random. After some trials the right answer (a
> matrix of zeros) appears.
[R] heatmap function
Hi all,

By default, the heatmap function gives an image with a dendrogram added to the left side and to the top. Is it possible to add the dendrogram only to the left side and leave the order of the columns unchanged?

I tried

    heatmap(mat, col=rbg, Rowv=res.hclust$order, Colv=1:dim(mat)[[2]])

In this case, the order of the columns is unchanged but a dendrogram is still added to the top. How can I avoid it?

Thanks,
Olivier

--
Martin Olivier
INRA - Unité protéomique        LIRMM - IFA/MAB
2, Place Viala                  161, rue Ada
34060 Montpellier Cédex 1       34392 Montpellier Cédex 5
Tel : 04 99 61 27 01            Tel : 04 67 41 86 71
[EMAIL PROTECTED]               [EMAIL PROTECTED]
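In current versions of R, heatmap() documents Colv = NA as the way to suppress the column dendrogram and any column reordering; whether the 2003 release already supported this is not something the thread confirms. A minimal sketch on a made-up matrix, with the row dendrogram supplied explicitly:

```r
# Hypothetical data: 8 rows, 5 columns.
set.seed(1)
mat <- matrix(rnorm(40), nrow = 8,
              dimnames = list(paste0("r", 1:8), paste0("c", 1:5)))

# Cluster the rows only.
res.hclust <- hclust(dist(mat))

# Rowv takes the row dendrogram; Colv = NA leaves the columns alone
# and draws no dendrogram on top.
hm <- heatmap(mat, Rowv = as.dendrogram(res.hclust), Colv = NA,
              scale = "none")
hm$colInd   # column order as plotted
```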
Re: [R] Query: colouring graph
[EMAIL PROTECTED] wrote:

> Hi!
> How can I fill with colors a portion of a graph (e.g.: I want fill in
> red the area within two confidence intervals)?

You can construct the coordinates of the polygon that fills this region, then use 'polygon' to fill it. Here:

    # first set up the plot - we want the density over the polygon,
    # so make a blank plot of the right size:
    x <- seq(-3, 3, len=100)
    plot(x, dnorm(x), type='n')

    # a little function that draws a filled polygon between limits under
    # dnorm(x). The polygon has to go from the axis, up, along the curve,
    # then back down again.
    fillDnorm <- function(low, high, col="red", n=100){
      x <- seq(low, high, len=n)
      y <- dnorm(x)
      x <- c(x[1], x, x[length(x)])
      y <- c(0, y, 0)
      polygon(x, y, col=col, border=NA)
    }

    # fill between -2 and -1
    fillDnorm(-2, -1)

    # now add the density
    lines(x, dnorm(x))

Tweak as required.

Baz
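The same polygon idea carries over to a band between two curves, which is closer to the original question about confidence limits. The sketch below uses a simulated regression purely for illustration; the data, model and colours are assumptions, not part of the thread.

```r
# Simulated data and a simple linear fit.
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50)
fit <- lm(y ~ x)

# Pointwise confidence limits: columns "fit", "lwr" and "upr".
ci <- predict(fit, interval = "confidence")

# Walk along the lower limit, then back along the upper limit,
# so the polygon encloses the band.
plot(x, y, type = "n")
polygon(c(x, rev(x)), c(ci[, "lwr"], rev(ci[, "upr"])),
        col = "red", border = NA)
lines(x, ci[, "fit"])
points(x, y)
```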
Re: [R] sort characters in W2K and NT
Ivar Herfindal wrote:

> Hello.
>
> I have a problem using sort() in windows 2000 and windows NT 4.0,
> running R 1.8.0 on both. I want to sort a vector of characters names,
> where I have used Scandinavian letters, like 'Æ', 'Ø', and 'Å' (for
> those who cannot display these letters this question seems rather
> meaningless, i guess). Windows 2000 sorts the vector like I am used to
> from other software, with 'Å' as the last letter in the alphabet,
> while windows NT has Å just after A, and Ø following O. Is there a way
> to solve this problem (other than replace the Scandinavian letters)?
>
> A short example:
> sort(c('a','p','å'))
> # on windows 2000:
> [1] "a" "p" "å"
> # on windows NT
> [1] "a" "å" "p"
>
> Thanks in advance

?sort tells us:

    "The sort order for character vectors will depend on the collating
    sequence of the locale in use: see Comparison."

and ?Comparison points you to ?locales which gives an example:

    Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting

Uwe Ligges

> Ivar Herfindal
>
> On windows 2000:
> version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    1
> minor    8.0
> year     2003
> month    10
> day      08
> language R
>
> On windows NT:
> version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    1
> minor    8.0
> year     2003
> month    10
> day      08
> language R
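A short sketch of the suggestion, with the previous collation setting saved and restored. The last line uses sort(method = "radix"), which exists only in much later R versions than the one in this thread and is offered here as an assumption-laden alternative that does not depend on the locale at all.

```r
x <- c("a", "p", "\u00e5")            # "\u00e5" is the letter å

sort(x)                                # order depends on LC_COLLATE

old <- Sys.getlocale("LC_COLLATE")     # remember the current setting
Sys.setlocale("LC_COLLATE", "C")       # turn off locale-specific sorting
sort(x)
Sys.setlocale("LC_COLLATE", old)       # restore it afterwards

# Later R versions only: radix sorting ignores the locale entirely.
by_bytes <- sort(x, method = "radix")  # "a" "p" "å"
```

Note that the "C" order puts å last only because of its byte value; it is not a Norwegian alphabet order, so Æ, Ø and Å may still not come out in the sequence a Scandinavian reader expects.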
RE: [R] indexing a particular element in a list of vectors
Or

    do.call(cbind, x)[1, ]

which of course makes a whole new copy of x and gives you a nasty warning as well, but does not use a conceptual `for` loop. Which I think was the original question, to which AFAIK the answer is no: there is no easy subscripting construct such as x[[1:3]][1] that will do what was asked.

-----Original Message-----
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
Sent: 17 October 2003 08:48
To: Richard A. O'Keefe
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] indexing a particular element in a list of vectors

Richard A. O'Keefe [EMAIL PROTECTED] writes:

> Scott Norton [EMAIL PROTECTED] wrote:
>     I have a list of character vectors. I'm trying to see if there is
>     a way (in a single line, without a loop) to pull out the first
>     element of all the vectors contained in the list.
>
> You have a list. You want to do something to each element.
> See ?lapply
>
>     u <- c("Fee","fie","foe","fum")
>     v <- c("Ping","pong","diplomacy")
>     w <- c("Hi","fi")
>     x <- list(a=u, b=v, c=w)
>     lapply(x, function (cv) cv[1])
>     ...
>
> If you want the result as a character vector, see ?sapply
>
>     sapply(x, function (cv) cv[1])
>          a      b      c
>      "Fee" "Ping"   "Hi"

Or even

    sapply(x, "[", 1)
         a      b      c
     "Fee" "Ping"   "Hi"

(same thing with lapply)

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
[R] sub data frame by expression
Hi All,

I have the following data frame with 54 rows and 4 columns:

> x
                  Ratio  Dose Time Batch
R.010mM.04h.NEW    0.02 010mM  04h   NEW
R.010mM.04h.NEW.1  0.07 010mM  04h   NEW
...
R.010mM.24h.NEW.2  0.06 010mM  24h   NEW
R.010mM.04h.OLD    0.19 010mM  04h   OLD
...
R.010mM.04h.OLD.1  0.49 010mM  04h   OLD
R.100mM.24h.OLD    0.40 100mM  24h   OLD

I'd like to create a sub data frame containing all rows where Batch == "OLD", keeping the 4 columns. Assume that I don't know the order of the rows (otherwise I could just do something like x[1:20,]).

I've tried x[x$Batch == 'OLD'] or x[x[,4] == 'OLD'] but both generate errors. So I assume I've still not really understood the philosophy of indexing ... :-(

What's the easiest way to do this, any suggestions?

thanks a lot for your help,
Arne
RE: [R] sub data frame by expression
Sorry, I just figured it out: x[x$Batch == 'OLD',] instead of x[x$Batch == 'OLD']. I didn't know this has to be in the same format as x[1:20,], where I already used the comma.

sorry for posting the previous message ...
Arne

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: 17 October 2003 12:12
To: [EMAIL PROTECTED]
Subject: [R] sub data frame by expression

> Hi All,
>
> I've the following data frame with 54 rows and 4 columns:
>
> x
>                   Ratio  Dose Time Batch
> R.010mM.04h.NEW    0.02 010mM  04h   NEW
> R.010mM.04h.NEW.1  0.07 010mM  04h   NEW
> ...
> R.010mM.24h.NEW.2  0.06 010mM  24h   NEW
> R.010mM.04h.OLD    0.19 010mM  04h   OLD
> ...
> R.010mM.04h.OLD.1  0.49 010mM  04h   OLD
> R.100mM.24h.OLD    0.40 100mM  24h   OLD
>
> I'd like to create a sub data frame containing all rows where
> Batch == "OLD" and keeping the 4 columns. Assume that I don't know the
> order of the rows (otherwise I could just do something like x[1:20,]).
>
> I've tried x[x$Batch == 'OLD'] or x[x[,4] == 'OLD'] but it generates
> errors. So I assume I've still not really understood the philosophy of
> indexing ... :-(
>
> What's the easiest way to do this, any suggestions?
>
> thanks a lot for your help,
> Arne
Re: [R] sub data frame by expression
On Fri, 17 Oct 2003 [EMAIL PROTECTED] wrote:

> I've the following data frame with 54 rows and 4 columns:
>
> x
>                   Ratio  Dose Time Batch
> R.010mM.04h.NEW    0.02 010mM  04h   NEW
> R.010mM.04h.NEW.1  0.07 010mM  04h   NEW
> ...
> R.010mM.24h.NEW.2  0.06 010mM  24h   NEW
> R.010mM.04h.OLD    0.19 010mM  04h   OLD
> ...
> R.010mM.04h.OLD.1  0.49 010mM  04h   OLD
> R.100mM.24h.OLD    0.40 100mM  24h   OLD
>
> I'd like to create a sub data frame containing all rows where
> Batch == "OLD" and keeping the 4 columns. Assume that I don't know the
> order of the rows (otherwise I could just do something like x[1:20,]).
>
> I've tried x[x$Batch == 'OLD'] or x[x[,4] == 'OLD'] but it generates
> errors.

That subsets columns, not rows. Try

    x[x$Batch == "OLD", ]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
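A cut-down version of the data frame makes the difference between the two index forms easy to see. The four rows below are copied from the listing above; subset() is shown as an equivalent alternative that the thread itself does not mention.

```r
x <- data.frame(
  Ratio = c(0.02, 0.07, 0.19, 0.49),
  Dose  = c("010mM", "010mM", "010mM", "010mM"),
  Time  = c("04h", "04h", "04h", "04h"),
  Batch = c("NEW", "NEW", "OLD", "OLD"),
  row.names = c("R.010mM.04h.NEW", "R.010mM.04h.NEW.1",
                "R.010mM.04h.OLD", "R.010mM.04h.OLD.1")
)

# The logical vector goes before the comma, so it selects rows;
# leaving the column slot empty keeps all four columns.
old <- x[x$Batch == "OLD", ]

# subset() evaluates the condition inside the data frame.
old2 <- subset(x, Batch == "OLD")
```

Without the comma, x[cond] treats the data frame as a list of columns, which is why a length-54 logical index on a 4-column frame fails.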
Re: [R] Design and Hmisc
On Wed, 15 Oct 2003 15:47:59 +0200 Uwe Ligges [EMAIL PROTECTED] wrote:

> Shawn Way wrote:
>
> > I'm looking for design and hmisc version 2.0 for R 1.8 for windows.
> > I've found design 2.0 in the downloads for R1.7 but not hmisc. I've
> > also checked Dr. Harrell's site and it only goes to 1.6 for windows.
> >
> > Any thoughts?
>
> Yes. The ReadMe at CRAN/bin/windows/contrib/1.8 (for R-1.8.x) tells us:
>
> 'Packages that do not compile out of the box or do not pass Rcmd check
> with OK or at least a WARNING will *not* be published. This Status,
> i.e. result of Rcmd check, is listed in file Status. Possible values
> are OK, WARN, and ERROR. Corresponding check.log files can be found in
> subdirectory ./check.'
>
> Uwe Ligges
>
> > Shawn Way

Thanks to Uwe, there is now a new version of Hmisc for Windows on CRAN. The error in the help file for sas.get which prevented the Windows version from passing Rcmd check has been fixed.

---
Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University
RE: [R] sub data frame by expression
Hi,

thanks for your replies regarding the problem of selecting a sub data frame by expression. I am starting to get an understanding of how indexing works in R.

Arne

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: 17 October 2003 12:38
To: Muller, Arne PH/FR
Cc: [EMAIL PROTECTED]
Subject: Re: [R] sub data frame by expression

On Fri, 17 Oct 2003 [EMAIL PROTECTED] wrote:

> I've the following data frame with 54 rows and 4 columns:
>
> x
>                   Ratio  Dose Time Batch
> R.010mM.04h.NEW    0.02 010mM  04h   NEW
> R.010mM.04h.NEW.1  0.07 010mM  04h   NEW
> ...
> R.010mM.24h.NEW.2  0.06 010mM  24h   NEW
> R.010mM.04h.OLD    0.19 010mM  04h   OLD
> ...
> R.010mM.04h.OLD.1  0.49 010mM  04h   OLD
> R.100mM.24h.OLD    0.40 100mM  24h   OLD
>
> I'd like to create a sub data frame containing all rows where
> Batch == "OLD" and keeping the 4 columns. Assume that I don't know the
> order of the rows (otherwise I could just do something like x[1:20,]).
>
> I've tried x[x$Batch == 'OLD'] or x[x[,4] == 'OLD'] but it generates
> errors.

That subsets columns, not rows. Try

    x[x$Batch == "OLD", ]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] R memory and CPU requirements
A couple of comments:

o Methods such as decision trees do not need to expand factors into columns of 1-df contrasts, so the memory requirement is vastly different. The models produced are also very, very different.

o Why would you want all possible interactions of 10 variables, 6 of which are factors? How do you intend to interpret, e.g., the 6-factor interaction? What can you conclude about a significant 10-variable interaction? What is your ultimate goal for this exercise? The answer to that should help you decide on more reasonable models to fit.

o One thing to try is to fit the ANOVA model by hand by computing cell means and examining them. This avoids creating the huge design matrix that's mostly 0s.

HTH,
Andy

-----Original Message-----
From: Alexander Sirotkin [at Yahoo] [mailto:[EMAIL PROTECTED]
Sent: Friday, October 17, 2003 4:30 AM
To: John Fox
Cc: [EMAIL PROTECTED]
Subject: Re: [R] R memory and CPU requirements

I agree completely. In fact, I have about 5000 observations, which should be enough. I was using 200 samples because of RAM limitations and I'm afraid to think about what amount of RAM I'll need to fit an aov() for such data.

--- John Fox [EMAIL PROTECTED] wrote: Dear Alexander, If I understand you correctly, you have a sample of 200 observations. Even if you had only two factors with 40 levels each, the main effects and interactions of these factors would require about 1600 degrees of freedom -- that is, more than the number of observations. This doesn't make a whole lot of sense. I hope that this helps, John

At 05:03 PM 10/16/2003 -0700, Alexander Sirotkin [at Yahoo] wrote: --- Deepayan Sarkar [EMAIL PROTECTED] wrote: On Thursday 16 October 2003 17:59, Alexander Sirotkin [at Yahoo] wrote: Thanks for all the help on my previous questions. One more (hopefully last one) : I've been very surprised when I tried to fit a model (using aov()) for a sample of size 200 and 10 variables and their interactions. That doesn't really say much.
How many of these variables are factors ? How many levels do they have ? And what is the order of the interaction ? (Note that for 10 numeric variables, if you allow all interactions, then there will be a 100 terms in your model. This increases for factors.) In other words, how big is your model matrix ? (See ?model.matrix) Deepayan I see... Unfortunately, model.matrix() ran out of memory :) I have 10 variables, 6 of which are factor, 2 of which have quite a lot of levels (about 40). And I would like to allow all interactions. I understand your point about categorical variables, but still - this does not seem like too much data to me. I remmeber fitting all kinds of models (mostly decision trees) for much, much larger data sets. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help - John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada L8S 4M4 email: [EMAIL PROTECTED] phone: 905-525-9140x23604 web: www.socsci.mcmaster.ca/jfox - __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo /r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
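The "cell means by hand" suggestion can be sketched without building any design matrix. The data below are simulated, and the two factors (f1, f2) and the response y are placeholders rather than the poster's variables.

```r
# Simulated data: two factors and a numeric response, 200 observations.
set.seed(42)
d <- data.frame(
  f1 = factor(sample(letters[1:4], 200, replace = TRUE),
              levels = letters[1:4]),
  f2 = factor(sample(LETTERS[1:3], 200, replace = TRUE),
              levels = LETTERS[1:3]),
  y  = rnorm(200)
)

# One mean and one count per combination of factor levels.
cell_means <- tapply(d$y, list(d$f1, d$f2), mean)
cell_n     <- table(d$f1, d$f2)

round(cell_means, 2)   # inspect the 4 x 3 table of cell means
```

The storage needed is one number per cell rather than one column per cell for every observation, so the same idea scales to the 5000-observation data set, although empty cells (which show up as NA) become common once several many-level factors are crossed.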
Re: [R] R memory and CPU requirements
Dear Alexander, At 01:29 AM 10/17/2003 -0700, Alexander Sirotkin \[at Yahoo\] wrote: I agree completely. In fact, I have about 5000 observations, which should be enough. I was using 200 samples because of RAM limitations and I'm afraid to think about what amount of RAM I'll need to fit an aov() for such data. OK -- I didn't realize that you have 5000 observations. Perhaps I didn't read some of the earlier messages carefully enough. At the risk of getting you to repeat information that you've already provided, how many degrees of freedom are there in the model that you're trying to fit? I can create a 5000 by 5000 model matrix on my relatively anemic Windows machine, and surely (unless there's some specification error) your model should have many fewer df than that if it includes just the main effects and two-way interactions (or by all interactions, do you mean higher-order interactions as well?). Perhaps providing the following information would help: What is the model formula? Which variables are factors? How many levels does each factor have? Regards, John --- John Fox [EMAIL PROTECTED] wrote: Dear Alexander, If I understand you correctly, you have a sample of 200 observations. Even if you had only two factors with 40 levels each, the main effects and interactions of these factors would require about 1600 degrees of freedom -- that is, more than the number of observations. This doesn't make a whole lot of sense. I hope that this helps, John At 05:03 PM 10/16/2003 -0700, Alexander Sirotkin \[at Yahoo\] wrote: --- Deepayan Sarkar [EMAIL PROTECTED] wrote: On Thursday 16 October 2003 17:59, Alexander Sirotkin \[at Yahoo\] wrote: Thanks for all the help on my previous questions. One more (hopefully last one) : I've been very surprised when I tried to fit a model (using aov()) for a sample of size 200 and 10 variables and their interactions. That doesn't really say much. How many of these variables are factors ? How many levels do they have ? 
And what is the order of the interaction? (Note that for 10 numeric variables, if you allow all interactions, then there will be a 100 terms in your model. This increases for factors.) In other words, how big is your model matrix? (See ?model.matrix) Deepayan I see... Unfortunately, model.matrix() ran out of memory :) I have 10 variables, 6 of which are factors, 2 of which have quite a lot of levels (about 40). And I would like to allow all interactions. I understand your point about categorical variables, but still - this does not seem like too much data to me. I remember fitting all kinds of models (mostly decision trees) for much, much larger data sets. - John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada L8S 4M4 email: [EMAIL PROTECTED] phone: 905-525-9140x23604 web: www.socsci.mcmaster.ca/jfox
[R] RE: [S] Dynamic Memory Allocation in R
From: Gamal Abdel-Azim [mailto:[EMAIL PROTECTED]

> While trying to expand the memory/object size in R, I noticed that R might be using only heap memory. Is this true? Are all objects in R created in the heap not allocated? It's not logical that this is the case!! Otherwise the whole R project would be a total waste of time and resources. If I am wrong please inform me. How to increase memory.size in R? Is there a way similar to options(object.size=size) in S-Plus. Notice that the command R --max-vsize=N targets the heap memory! I have recently installed R on a Linux machine (3GB RAM and sufficiently large HD). Sorry for posting to Splus not R. But if R works only in the heap, I may not need to subscribe to R-news at all.

But only the R folks could give you the definitive answer to this question!

Andy

> Thank You
Re: [R] R memory and CPU requirements
On 17 Oct 2003 at 1:33, Alexander Sirotkin [at Yahoo] wrote: You mentioned in an earlier post that at least one of your factors has 40 levels. If you use the default contrast, contr.treatment, the design matrix for this factor will be dominated by zeros. Maybe you should look at the CRAN package SparseM, which has a function slm for linear models with sparse matrices? (I didn't try this, but it could be worthwhile.) Still, I don't think it makes much sense to start with a model with all the interactions in! Kjetil Halvorsen --- Deepayan Sarkar [EMAIL PROTECTED] wrote: On Thursday 16 October 2003 19:03, Alexander Sirotkin [at Yahoo] wrote: Thanks for all the help on my previous questions. One more (hopefully last one): I've been very surprised when I tried to fit a model (using aov()) for a sample of size 200 and 10 variables and their interactions. That doesn't really say much. How many of these variables are factors? How many levels do they have? And what is the order of the interaction? (Note that for 10 numeric variables, if you allow all interactions, then there will be a 100 terms in your model. This increases for factors.) In other words, how big is your model matrix? (See ?model.matrix) Deepayan I see... Unfortunately, model.matrix() ran out of memory :) I have 10 variables, 6 of which are factors, 2 of which have quite a lot of levels (about 40). And I would like to allow all interactions. I understand your point about categorical variables, but still - this does not seem like too much data to me. That's one way to look at it. You don't have enough data for the model you are trying to fit. The usual approach under these circumstances is to try 'simpler' models. Please try to understand what you are trying to do (in this case by reading an introductory linear model text) before blindly applying a methodology. Deepayan I did study ANOVA and I do have enough observations. 200 was only a random sample of more than 5000 which I think should be enough.
However, I'm afraid to even think about amount of RAM I will need with R to fit a model for this data.
[R] datetime data and plotting
If I take the following simple data:

YEAR MONTH DAY WEIGHT.KG
2003 10  6 1.2
2003 10 12 1.2
2003 10 16 1.3

and format the date data and plot it:

dates <- strptime(paste(DAY, MONTH, YEAR), "%d%m%Y")
plot(c(min(dates), max(dates)), c(0, max(WEIGHT.KG)), xlab = "Date", ylab = "Weight (kg)", type = "n")
lines(dates, WEIGHT.KG)
points(dates, WEIGHT.KG)

I find that the data points are all plotted at (x-1,y), where x is in days. Have I requested this behaviour accidentally? I'm using R-1.8 on OS X. Printing the dates object looks correct, and simple manipulations such as max(dates)-min(dates) behave normally.

Jacob Etches
Doctoral candidate
Dept of Public Health Sciences
University of Toronto
RE: [R] Design and Hmisc
Thank you very much...

Shawn Way

-----Original Message-----
From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]
Sent: Friday, October 17, 2003 6:38 AM
To: Uwe Ligges
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] Design and Hmisc

On Wed, 15 Oct 2003 15:47:59 +0200 Uwe Ligges [EMAIL PROTECTED] wrote:

> Shawn Way wrote:
> > I'm looking for design and hmisc version 2.0 for R 1.8 for windows. I've found design 2.0 in the downloads for R1.7 but not hmisc. I've also checked Dr. Harrell's site and it only goes to 1.6 for windows. Any thoughts?
>
> Yes. The ReadMe at CRAN/bin/windows/contrib/1.8 (for R-1.8.x) tells us: 'Packages that do not compile out of the box or do not pass Rcmd check with OK or at least a WARNING will *not* be published. This Status, i.e. result of Rcmd check, is listed in file Status. Possible values are OK, WARN, and ERROR. Corresponding check.log files can be found in subdirectory ./check.'
>
> Uwe Ligges

Thanks to Uwe, there is now a new version of Hmisc for Windows on CRAN. The error in the help file for sas.get which prevented the Windows version from passing Rcmd check has been fixed.

---
Frank E Harrell Jr, Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
RE: [R] Problems with crossprod
Somehow R creates `a' as a matrix with 0 rows and 5 columns. I don't know how crossprod() or other linear algebra functions deals with such a degenerate matrix. I'd suggest R Core to add checks for strictly positive dimensions in such functions. (Also, I find it strange that A[1,] is a vector, but A[numeric(0),] is a 0x5 matrix...) Andy From: Giovanni Marchetti [mailto:[EMAIL PROTECTED] Dear R-users, I found a strange problem working with products of two matrices, say: a - A[i, ] ; crossprod(a) where i is a set of integers selecting rows. When i is empty the result is in a sense random. After some trials the right answer (a matrix of zeros) appears. --- Illustration R : Copyright 2003, The R Development Core Team Version 1.8.0 (2003-10-08) A -matrix(0, 5, 5) i - c() a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] 6.578187e-313 NaN NaN NaN NaN [2,] NaN 1.273197e-313 NaN 1.485397e-313 NaN [3,] NaN 4.243992e-313 2.121996e-314 NaN NaN [4,] NaN 1.697597e-313 NaN 4.880590e-313 NaN [5,] 5.941588e-313 NaN NaN 1.697597e-313 NaN a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] 2.121996e-314 5.729389e-313 NaN NaN NaN [2,] NaN NaN NaN NaN 1.909796e-313 [3,] 2.970794e-313 NaN NaN NaN NaN [4,] NaN NaN NaN 8.487983e-314 NaN [5,] NaN 6.365987e-313 2.546395e-313 NaN NaN a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] NaN 1.485397e-313 NaN NaN 2.970794e-313 [2,] 3.182994e-313 NaN NaN 1.060998e-313 NaN [3,] NaN NaN NaN 1.697597e-313 2.737375e-312 [4,] NaN NaN NaN NaN 2.048394e+10 [5,] NaN NaN NaN NaN 2.970794e-313 a - A[i, ] ; crossprod(a) [,1][,2] [,3] [,4] [,5] [1,] 1.591383e-266 20489834629000 [2,] 5.031994e-266 0000 [3,] 1.591205e-266 0000 [4,] 1.264128e-267 0000 [5,] 1.037656e-311 0000 a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,]00000 [2,]00000 [3,]00000 [4,]00000 [5,]00000 --- End of illustration The same problem does not appear using the matrix product: a - A[i, ] ; t(a) %*% a [,1] [,2] [,3] [,4] [,5] [1,]00000 [2,]00000 [3,]00000 [4,]00000 
[5,]00000 Note that Splus 6 returns an error message: a - A[i, ] ; crossprod(a) Problem in .Fortran.ok.Internal(if(cmplx) zcrossp1..: subroutine dcrossp1: Argument 1 has zero length Thank you, Giovanni __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo /r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] sort charcters in W2K and NT
On Fri, 17 Oct 2003 11:58:49 +0200, Uwe Ligges [EMAIL PROTECTED] dortmund.de wrote:

> Ivar Herfindal wrote:
> > Hello. I have a problem using sort() in windows 2000 and windows NT 4.0, running R 1.8.0 on both. I want to sort a vector of character names, where I have used Scandinavian letters, like 'Æ', 'Ø', and 'Å' (for those who cannot display these letters this question seems rather meaningless, I guess). Windows 2000 sorts the vector like I am used to from other software, with 'Å' as the last letter in the alphabet, while windows NT has Å just after A, and Ø following O. Is there a way to solve this problem (other than replacing the Scandinavian letters)? A short example:
> >
> > sort(c('a','p','å'))
> > # on windows 2000:
> > [1] "a" "p" "å"
> > # on windows NT
> > [1] "a" "å" "p"
> >
> > Thanks in advance
>
> ?sort tells us: "The sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison." and ?Comparison points you to ?locales which gives an example:
>
> Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting
>
> Uwe Ligges

Thanks for the help, it worked great. However, it appears that using Sys.setlocale("LC_COLLATE", "C") makes R sort the vector in a new way, different from the two mentioned above. But since R sorts character vectors in the same manner on both W2K and Windows NT after writing Sys.setlocale("LC_COLLATE", "C"), it is sufficient for me.

Ivar Herfindal

> > Ivar Herfindal
> >
> > On windows 2000: version
> > _
> > platform i386-pc-mingw32
> > arch     i386
> > os       mingw32
> > system   i386, mingw32
> > status
> > major    1
> > minor    8.0
> > year     2003
> > month    10
> > day      08
> > language R
> >
> > On windows NT: version
> > _
> > platform i386-pc-mingw32
> > arch     i386
> > os       mingw32
> > system   i386, mingw32
> > status
> > major    1
> > minor    8.0
> > year     2003
> > month    10
> > day      08
> > language R
Re: [R] R memory and CPU requirements
On Fri, 17 Oct 2003, Alexander Sirotkin [at Yahoo] wrote:

> I did study ANOVA and I do have enough observations. 200 was only a random sample of more than 5000 which I think should be enough. However, I'm afraid to even think about the amount of RAM I will need with R to fit a model for this data.

The memory requirements depend on the size of the design matrix. If the number of columns in the design matrix doesn't increase then 5000 observations won't be much worse. If it does increase then you have the same problem of sparseness even with 5000 observations. There must be *something* strange about your model. I routinely fit regression models with 6000 observations and a dozen or so variables in R, in much less than 2Gb of RAM.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
Re: [R] RE: [S] Dynamic Memory Allocation in R
On Fri, 17 Oct 2003, Liaw, Andy wrote:

> From: Gamal Abdel-Azim [mailto:[EMAIL PROTECTED]
> While trying to expand the memory/object size in R, I noticed that R might be using only heap memory. Is this true? Are all objects in R created in the heap not allocated?

To the extent that this means anything it is true. All R objects are stored in memory controlled by the R process, in two heaps. The g.data package allows objects to be stored on disk and only loaded as necessary.

> It's not logical that this is the case!! Otherwise the whole R project would be a total waste of time and resources.

There appear to be some missing steps in this reasoning.

> If I am wrong please inform me. How to increase memory.size in R? Is there a way similar to options(object.size=size) in S-Plus. Notice that the command R --max-vsize=N targets the heap memory!

The --max-vsize option exists to *reduce* the maximum memory use, not to increase it (at least under Unix). You have access to all the memory your operating system will give you.

> I have recently installed R on a Linux machine (3GB RAM and sufficiently large HD).

Then you should be able to access nearly 4Gb in R, which is enough for quite a lot of purposes.

> Sorry for posting to Splus not R. But if R works only in the heap, I may not need to subscribe to R-news at all.

!

-thomas
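To see how much of that memory a given object actually occupies, something along these lines may help. It is only a sketch using base R's object.size() and gc(); the vector `x` is a made-up example, and the exact figures depend on platform and R version.

```r
# A numeric vector of one million doubles: 8 bytes each plus a small header.
x <- numeric(1e6)
size_bytes <- as.numeric(object.size(x))
size_bytes / 2^20          # size in megabytes

# gc() reports the memory currently used by R's cell and vector heaps.
mem <- gc()
mem[, "used"]
```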
RE: [R] Problems with crossprod
On Fri, 17 Oct 2003, Liaw, Andy wrote:

> Somehow R creates `a' as a matrix with 0 rows and 5 columns. I don't know how crossprod() or other linear algebra functions deal with such a degenerate matrix. I'd suggest R Core add checks for strictly positive dimensions in such functions.

Yes. There usually are, which is why the matrix multiplication version works.

> (Also, I find it strange that A[1,] is a vector, but A[numeric(0),] is a 0x5 matrix...)

Why? A[1,] is a 1x5 matrix, i.e., a row vector, so it makes sense for it to decay to a vector. A[numeric(0),] is not a vector, so it stays a matrix.

-thomas
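The indexing behaviour discussed here can be checked directly. A small sketch, reusing the 5 x 5 zero matrix from the original report (the names `r1`, `r1_kept` and `e0` are made up for the example):

```r
A <- matrix(0, 5, 5)

r1      <- A[1, ]                 # extent of length 1 is dropped: plain vector
r1_kept <- A[1, , drop = FALSE]   # drop = FALSE keeps it as a 1 x 5 matrix
e0      <- A[integer(0), ]        # extent of length 0 is never dropped

dim(r1)        # NULL
dim(r1_kept)   # 1 5
dim(e0)        # 0 5
```

Using drop = FALSE whenever the index may have length 0 or 1 keeps the result a matrix in every case.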
Re: [R] sort charcters in W2K and NT
You can set any locale you like, and I suspect your machines are in different locales (I believe older versions of Windows, including NT4, had limited support for locales).

On Fri, 17 Oct 2003, Ivar Herfindal wrote:

> On Fri, 17 Oct 2003 11:58:49 +0200, Uwe Ligges [EMAIL PROTECTED] dortmund.de wrote:
> > Ivar Herfindal wrote:
> > > Hello. I have a problem using sort() in windows 2000 and windows NT 4.0, running R 1.8.0 on both. I want to sort a vector of character names, where I have used Scandinavian letters, like 'Æ', 'Ø', and 'Å' (for those who cannot display these letters this question seems rather meaningless, I guess). Windows 2000 sorts the vector like I am used to from other software, with 'Å' as the last letter in the alphabet, while windows NT has Å just after A, and Ø following O. Is there a way to solve this problem (other than replacing the Scandinavian letters)? A short example:
> > >
> > > sort(c('a','p','å'))
> > > # on windows 2000:
> > > [1] "a" "p" "å"
> > > # on windows NT
> > > [1] "a" "å" "p"
> > >
> > > Thanks in advance
> >
> > ?sort tells us: "The sort order for character vectors will depend on the collating sequence of the locale in use: see Comparison." and ?Comparison points you to ?locales which gives an example:
> >
> > Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting
> >
> > Uwe Ligges
>
> Thanks for the help, it worked great. However, it appears that using Sys.setlocale("LC_COLLATE", "C") makes R sort the vector in a new way, different from the two mentioned above. But since R sorts character vectors in the same manner on both W2K and Windows NT after writing Sys.setlocale("LC_COLLATE", "C"), it is sufficient for me.
Ivar Herfindal

Ivar Herfindal

On windows 2000: version
_
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    8.0
year     2003
month    10
day      08
language R

On windows NT: version
_
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    8.0
year     2003
month    10
day      08
language R

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
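The effect of switching the collation locale can be seen with plain ASCII letters, which avoids any encoding questions. This is a sketch only; the vector `x` is made up, and the locale-specific order before the switch will differ between machines, which is exactly the point of the thread.

```r
# Remember the current collation locale so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# In the C locale strings are compared by character code,
# so all upper-case ASCII letters sort before all lower-case ones.
Sys.setlocale("LC_COLLATE", "C")
x <- c("b", "A", "a", "B")
sorted_c <- sort(x)
sorted_c   # "A" "B" "a" "b"

# Restore the previous setting.
Sys.setlocale("LC_COLLATE", old_collate)
```

Because the C locale orders by character code, it gives the same result on every machine, but not necessarily the alphabet order a Scandinavian reader expects.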
RE: [R] Problems with crossprod
On Fri, 17 Oct 2003, Liaw, Andy wrote:

> Somehow R creates `a' as a matrix with 0 rows and 5 columns. I don't know how crossprod() or other linear algebra functions deals with such a degenerate matrix. I'd suggest R Core to add checks for strictly positive dimensions in such functions.

Rather, we do try to ensure they give the right answers for 0 extents. The code has lines like

    if (nrx > 0 && ncx > 0 && nry > 0 && ncy > 0) {
        F77_CALL(dgemm)(transa, transb, &ncx, &ncy, &nrx, &one, x, &nrx, y, &nry, &zero, z, &ncx);
    }

and so does nothing for 0 extents! The corresponding matprod code does, so this seems a simple oversight that will be fixed shortly.

> (Also, I find it strange that A[1,] is a vector, but A[numeric(0),] is a 0x5 matrix...)

That's the point of drop = TRUE (the default): it drops extents of length 1 (and not 0).

> Andy
>
> From: Giovanni Marchetti [mailto:[EMAIL PROTECTED]
> Dear R-users, I found a strange problem working with products of two matrices, say: a <- A[i, ] ; crossprod(a) where i is a set of integers selecting rows. When i is empty the result is in a sense random. After some trials the right answer (a matrix of zeros) appears.
--- Illustration R : Copyright 2003, The R Development Core Team Version 1.8.0 (2003-10-08) A -matrix(0, 5, 5) i - c() a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] 6.578187e-313 NaN NaN NaN NaN [2,] NaN 1.273197e-313 NaN 1.485397e-313 NaN [3,] NaN 4.243992e-313 2.121996e-314 NaN NaN [4,] NaN 1.697597e-313 NaN 4.880590e-313 NaN [5,] 5.941588e-313 NaN NaN 1.697597e-313 NaN a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] 2.121996e-314 5.729389e-313 NaN NaN NaN [2,] NaN NaN NaN NaN 1.909796e-313 [3,] 2.970794e-313 NaN NaN NaN NaN [4,] NaN NaN NaN 8.487983e-314 NaN [5,] NaN 6.365987e-313 2.546395e-313 NaN NaN a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,] NaN 1.485397e-313 NaN NaN 2.970794e-313 [2,] 3.182994e-313 NaN NaN 1.060998e-313 NaN [3,] NaN NaN NaN 1.697597e-313 2.737375e-312 [4,] NaN NaN NaN NaN 2.048394e+10 [5,] NaN NaN NaN NaN 2.970794e-313 a - A[i, ] ; crossprod(a) [,1][,2] [,3] [,4] [,5] [1,] 1.591383e-266 20489834629000 [2,] 5.031994e-266 0000 [3,] 1.591205e-266 0000 [4,] 1.264128e-267 0000 [5,] 1.037656e-311 0000 a - A[i, ] ; crossprod(a) [,1] [,2] [,3] [,4] [,5] [1,]00000 [2,]00000 [3,]00000 [4,]00000 [5,]00000 --- End of illustration The same problem does not appear using the matrix product: a - A[i, ] ; t(a) %*% a [,1] [,2] [,3] [,4] [,5] [1,]00000 [2,]00000 [3,]00000 [4,]00000 [5,]00000 Note that Splus 6 returns an error message: a - A[i, ] ; crossprod(a) Problem in .Fortran.ok.Internal(if(cmplx) zcrossp1..: subroutine dcrossp1: Argument 1 has zero length Thank you, Giovanni __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo /r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Brian D. 
Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
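For readers on an R version released after the fix mentioned in this thread, the zero-extent case can be checked as below. This is a sketch under that assumption: in R 1.8.0 itself, as reported above, crossprod() returned uninitialised values here.

```r
A <- matrix(0, 5, 5)
i <- integer(0)                 # empty row selection, as in the report
a <- A[i, , drop = FALSE]       # 0 x 5 matrix

cp <- crossprod(a)              # t(a) %*% a computed in one step
mp <- t(a) %*% a                # explicit matrix product

dim(cp)                         # 5 5
all(cp == 0)                    # an empty sum of products is zero
all(mp == 0)
```

Mathematically each entry is a sum over zero terms, so a 5 x 5 matrix of zeros is the right answer for both forms.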
Re: [R] help with legend()
On Friday 17 October 2003 02:20, Martin Maechler wrote: PaulSch == Schwarz, Paul [EMAIL PROTECTED] on Wed, 15 Oct 2003 12:09:11 -0700 writes: PaulSch I am converting some S-PLUS scripts that I use for PaulSch creating manuscript figures to R so that I can take PaulSch advantage of the plotmath capabilities. In my PaulSch S-PLUS scripts I like to use the key() function for PaulSch adding legends to plots, AFAIK key() in S+ is from the trellis library section. The corresponding R package, trellis, has ^^^ lattice, actually :-) a draw.key() function that may work similarly to S-plus' key() {Deepayan ?}. That's correct. Of course, the S-PLUS key() works wih non-trellis graphs as well, whereas draw.key() will produce a grid object and hence work with grid graphics only. (I haven't checked Paul's new gridBase package, that may enable using this for base graphics as well.) PaulSch and I have a couple of PaulSch questions regarding using the legend() function in PaulSch R. PaulSch 1) is there a way to specify different colors for PaulSch the legend vector of text values? not yet in legend() -- but see below PaulSch 2) is there a way to reverse the order of the PaulSch legend items so that the text values precede the PaulSch symbols? not yet in legend() --- but it's an open source project living from community support ... Can S+ key() do these two things? If yes, how do you specify it there {this sounds as if I was willing to consider adding these wished features to legend } key() is a bit weird, in that it allows multiple arguments of the same name (as long as the names are text, points, lines and rectangles). The order of the arguments control the order of column types. For example, key(text = list(letters[1:5], col = 1:5), points = list(col = 1:5), text = list(letters[6:10])) will produce a column of text followed by points and then text again (with the first two columns in different color). 
Deepayan
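On the first of Paul's questions, later versions of R than the one discussed in this thread give legend() a text.col argument, so per-item text colours can be set directly; check ?legend in your own version before relying on it. A minimal sketch with made-up data, drawn on a null PDF device so that no file is written:

```r
pdf(NULL)                        # device that produces no output file
plot(1:5, col = 1:5, pch = 19)

# One legend entry per point, with the label drawn in the same colour
# as its symbol.
lg <- legend("topleft", legend = letters[1:5],
             col = 1:5, pch = 19, text.col = 1:5)
dev.off()

names(lg)                        # legend() invisibly returns layout details
```

This does not address the second question (placing the text before the symbols), which the sketch leaves aside.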
[R] correlation matrix in Hmisc
Dear all,

I am trying to compute a matrix of Pearson's `r' or Spearman's `rho' rank correlation coefficients using rcorr (Hmisc) the following way:

mx <- rcorr(x, type = "spearman")[1]

but then ...

is.matrix(mx)
[1] FALSE

Even if I use as.matrix the result is not better. What can I do?

Thank you all
Luca
Re: [R] correlation matrix in Hmisc
On Fri, 17 Oct 2003 16:36:47 +0200 Luca De Benedictis [EMAIL PROTECTED] wrote:

> Dear all, I am trying to compute a matrix of Pearson's `r' or Spearman's `rho' rank correlation coefficients using rcorr (Hmisc) the following way:
>
> mx <- rcorr(x, type = "spearman")[1]

Instead of [1] use $r. Or use the new cor() function builtin to R 1.8.

Frank

> but then ...
>
> is.matrix(mx)
> [1] FALSE
>
> Even if I use as.matrix the result is not better. What can I do? Thank you all Luca

---
Frank E Harrell Jr, Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
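The difference between [1] and $r is ordinary list indexing, and can be shown without Hmisc. In the sketch below the list `res` is only a stand-in for the value rcorr() returns (assumed, following Frank's advice, to be a list whose first component is named r); the data in `x` are simulated.

```r
set.seed(42)
x <- matrix(rnorm(50), nrow = 10, ncol = 5)

# Stand-in for the rcorr() result: a list with the correlations in $r.
res <- list(r = cor(x, method = "spearman"), n = nrow(x))

is.matrix(res[1])     # FALSE: single brackets return a list of length one
is.matrix(res[[1]])   # TRUE: double brackets extract the component
is.matrix(res$r)      # TRUE: same component, selected by name
```

The middle line also shows Frank's second suggestion: cor() with method = "spearman" returns the correlation matrix directly.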
Re: [R] R memory and CPU requirements
On Friday 17 October 2003 03:33, Alexander Sirotkin [at Yahoo] wrote:

> > > One more (hopefully last one): I've been very surprised when I tried to fit a model (using aov()) for a sample of size 200 and 10 variables and their interactions.
> >
> > That doesn't really say much. How many of these variables are factors? How many levels do they have? And what is the order of the interaction? (Note that for 10 numeric variables, if you allow all interactions, then there will be a 100 terms in your model. This increases for factors.) In other words, how big is your model matrix?
>
> > > I see... Unfortunately, model.matrix() ran out of memory :) I have 10 variables, 6 of which are factor, 2 of which have quite a lot of levels (about 40). And I would like to allow all interactions. I understand your point about categorical variables, but still - this does not seem like too much data to me.
> >
> > That's one way to look at it. You don't have enough data for the model you are trying to fit. The usual approach under these circumstances is to try 'simpler' models. Please try to understand what you are trying to do (in this case by reading an introductory linear model text) before blindly applying a methodology. Deepayan
>
> I did study ANOVA and I do have enough observations. 200 was only a random sample of more than 5000 which I think should be enough. However, I'm afraid to even think about the amount of RAM I will need with R to fit a model for this data.

Let's see. You have 10 variables, 6 of which are factors, 2 of which have at least 40 levels, and you want all interactions. Let's conservatively estimate that all the other four factors have only 2 levels.

    x1 = gl(40, 1, 1)
    x2 = gl(40, 1, 1)
    x3 = gl(2, 1, 1)
    x4 = gl(2, 1, 1)
    x5 = gl(2, 1, 1)
    x6 = gl(2, 1, 1)
    dim(model.matrix(~ x1 * x2 * x3 * x4 * x5 * x6))
    [1] 1 25600

This was for one data point; increasing that would only increase the number of rows, the columns would be the same.
And of course, this is just for 6-way interactions, and the least possible given the information you have given us about your model. In actual fact, your model matrix will have many, many more columns. I hope you realize that the number of columns in the model matrix is the number of parameters you are trying to estimate. If your sample size is less than this number (and 5000 is way less), then there will be infinitely many solutions to this problem, each of which will fit your data perfectly. Do you really want such an answer? Assuming that you find one, what are you going to do with it? I have no idea what made you choose such a high-order model, but as Andy has said, you really should try to figure out what exactly your goals are before proceeding. If you believe that your data can really not be modeled reasonably by anything simpler, you probably should not use a linear model at all.

Hope that helps,
Deepayan
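The column count can be worked out without building the large model matrix at all: with the default treatment contrasts, a full-interaction model in factors has as many columns as the product of the numbers of levels. A small sketch, where the level counts are those assumed in the message above and the factors `a`, `b`, `c` are made up for a check that is cheap to run:

```r
# Levels assumed in the thread: two 40-level factors and four 2-level factors.
n_levels <- c(40, 40, 2, 2, 2, 2)
n_cols   <- prod(n_levels)
n_cols                      # 25600, matching the dim() output quoted above

# The same rule checked on a small full factorial (3 x 4 x 2 levels).
small <- expand.grid(a = gl(3, 1), b = gl(4, 1), c = gl(2, 1))
small_cols <- ncol(model.matrix(~ a * b * c, data = small))
small_cols                  # 24 = 3 * 4 * 2
```

Each numeric variable added to the full interaction doubles the count again, which is why the real model matrix would be larger still.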
RE: [R] Problems Building RMySQL in Windows
Regarding the very few tests I did (RMySQL versus RODBC using a MySQL ODBC driver, but I do not remember details here), RMySQL is faster. It should be great, if you need to access a MySQL database from R, to try both and decide by yourself. If you do that, I am very interested by the results. Best, Philippe Grosjean -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Hector Villafuerte D. Sent: Thursday, 16 October, 2003 20:19 To: [EMAIL PROTECTED] Subject: Re: [R] Problems Building RMySQL in Windows David Whiting wrote: Can you use RODBC instead? I use it all the time and find it works very well. I installed it some time ago and have forgotten the exact details but I remember that it was easy to do following the instructions. I also can't remember why I choose RODBC instead of RMySQL and whether there are advantages of RMySQL over RODBC. Great! Using RODBC is really easy! Would someone please comment on the pros and cons of RODBC compared with RMySQL? Thanks in advance. Hector __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] environments
Hi,

I have a string representing an environment:

"bob"

And an environment

> bob
<environment: 0x3901234ac>

How do I write a function that takes the string and returns the environment?

Crispin
Re: [R] Problems Building RMySQL in Windows
Philippe Grosjean wrote:

> Regarding the very few tests I did (RMySQL versus RODBC using a MySQL ODBC driver, but I do not remember details here), RMySQL is faster. It should be great, if you need to access a MySQL database from R, to try both and decide by yourself. If you do that, I am very interested by the results.

I would gladly do such a comparison but I was unable to install RMySQL successfully (it keeps crashing R). I think I'll have to stick with RODBC. Thank you all, anyway.
Re: [R] datetime data and plotting
I am not seeing this on Linux. The x axis marks are at midnight GMT, hence 1am BST on my system.

On Fri, 17 Oct 2003, Jacob Etches wrote:

> If I take the following simple data:
>
> YEAR MONTH DAY WEIGHT.KG
> 2003 10  6 1.2
> 2003 10 12 1.2
> 2003 10 16 1.3
>
> and format the date data and plot it:
>
> dates <- strptime(paste(DAY, MONTH, YEAR), "%d%m%Y")
> plot(c(min(dates), max(dates)), c(0, max(WEIGHT.KG)), xlab = "Date", ylab = "Weight (kg)", type = "n")
> lines(dates, WEIGHT.KG)
> points(dates, WEIGHT.KG)
>
> I find that the data points are all plotted at (x-1,y), where x is in days. Have I requested this behaviour accidentally? I'm using R-1.8 on OS X. Printing the dates object looks correct, and simple manipulations such as max(dates)-min(dates) behave normally.
>
> Jacob Etches
> Doctoral candidate
> Dept of Public Health Sciences
> University of Toronto

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
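Since the tick marks fall at midnight GMT while the points sit at local midnight, one way to sidestep the time-zone question is to plot calendar dates rather than date-times. The sketch below is an assumption on my part, not something proposed in the thread: it relies on the Date class, which is only available in R versions later than the 1.8 used here, and draws on a null PDF device so that no file is written.

```r
# The data from the original question.
YEAR      <- c(2003, 2003, 2003)
MONTH     <- c(10, 10, 10)
DAY       <- c(6, 12, 16)
WEIGHT.KG <- c(1.2, 1.2, 1.3)

# ISOdate() builds noon-GMT date-times; as.Date() then keeps only the
# calendar date, which carries no time zone.
dates <- as.Date(ISOdate(YEAR, MONTH, DAY))

pdf(NULL)
plot(dates, WEIGHT.KG, type = "b", ylim = c(0, max(WEIGHT.KG)),
     xlab = "Date", ylab = "Weight (kg)")
dev.off()

format(dates)
```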
[R] about parameter fitting of Gld(Generalized Lambda Distribution)
Currently, I am interested in parameter fitting of the Generalized Lambda Distribution (GLD), and I have found two packages in R related to the GLD: Davies and gld. It is a pity that no method in Davies deals with fitting the GLD, and starship, used in package gld, is quite time-consuming when the sample size is large. So I wonder whether there is a method-of-moments or least-squares implementation available now. Or, if you know the method of moments (or moment matching), I think you must know how to solve an optimization problem in two parameters involving multiple beta functions. Please give me some hints. Thanks in advance.

Regards,
Jean Sun
Re: [R] environments
Crispin Miller wrote:
> Hi,
> I have a string representing an environment:
>   "bob"
> and an environment
>   bob
>   <environment: 0x3901234ac>
> How do I write a function that takes the string and returns the environment?

get("bob")

Uwe Ligges

> Crispin
> This email is confidential and intended solely for the use o...{{dropped}}
Re: [R] environments
Is get("bob") what you are looking for? It is the usual way to go from the name of an R object (as a character string) to the actual object.

On Fri, 17 Oct 2003, Crispin Miller wrote:
> Hi,
> I have a string representing an environment:
>   "bob"
> and an environment
>   bob
>   <environment: 0x3901234ac>
> How do I write a function that takes the string and returns the environment?

--
Brian D. Ripley, [EMAIL PROTECTED]
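A minimal sketch of the get() approach suggested in the replies. The helper name env_from_name and the variable x are hypothetical and only illustrate the lookup; the check that the result really is an environment is an addition, not something the thread asked for.

```r
# An environment bound to the name "bob", standing in for the poster's object
bob <- new.env()
assign("x", 42, envir = bob)

# Hypothetical helper: go from the *name* of an environment to the environment
env_from_name <- function(name) {
  caller <- parent.frame()          # look the name up where the helper was called
  e <- get(name, envir = caller)    # get() maps a character string to the object
  stopifnot(is.environment(e))      # guard against names bound to something else
  e
}

e <- env_from_name("bob")
identical(e, bob)                   # TRUE: the very same environment object
get("x", envir = e)                 # 42
```

For a one-off lookup at the prompt, plain get("bob") does the same thing without the wrapper.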
RE: [R] heatmap function
>>>>> "AndyL" == Liaw, Andy [EMAIL PROTECTED] on Fri, 17 Oct 2003 09:10:16 -0400 writes:

AndyL> One of the good things about R (and S in general, I guess) is that if a function does mostly what you want, except for some small things, you can just make another copy of it, change the name, and make the desired changes to the new function (provided the changes you need to make aren't in the compiled code, but R is Open Source...).

AndyL> In this case, you should be able to strip out the code in heatmap() that plots the top dendrogram without much problem. While you're at it, you might want to change the layout() so as not to leave the blank space on top.

Yes, thanks Andy.

heatmap() has already been improved quite a bit for R 1.8.0 (in particular, the dendrogram reordering which led to bad drawings has been fixed; the drawings are now fine). But I have received many suggestions (from Gregory Warnes, notably, and Art Owen, and others) that just didn't make it in time before feature freeze. The above {an option for *dis*allowing one or the other dendrogram} has been among the wishes, and is reasonable.

heatmap() is a relatively new function in R and a high-level one (i.e. typically not used as a basic building block for other functions); it is not even usable as a sub-plot in other plots because it relies on layout(). Since it is nevertheless widely used in some contexts, I'd vote for being allowed to add features to it even before the next major release of R.

-----Original Message-----
From: Martin Olivier [mailto:[EMAIL PROTECTED]
Sent: Friday, October 17, 2003 5:32 AM
To: r-help
Subject: [R] heatmap function

> Hi all,
> By default, the heatmap function gives an image with a dendrogram added to the left side and to the top. Is it possible to add the dendrogram only to the left side and leave the order of the columns unchanged?
> I tried
>   heatmap(mat, col=rbg, Rowv=res.hclust$order, Colv=1:dim(mat)[[2]])
> In this case, the order of the columns is unchanged but a dendrogram is still added to the top. How can I avoid it?
> Thanks,
> Olivier
Re: [R] tick marks and barchart
On Thursday 16 October 2003 10:03, Stefán Hrafn Jónsson wrote:
> Dear R community.
> I have two problems with figures. The first deals with a short vector on the x-axis and the second with a two-panel barchart.
>
> 1) For demonstration I create the following pseudo data for three years, 2001:2003. The indicated plot looks fine except for the number of tick marks on the x-axis. I get seq(2001,2003,0.5). I want three and only three tick marks, to indicate that we measure once a year, not twice a year. (Having year 2001.5 is not that nice anyway.) I tried as.factor(2001:2003) but this did not do what I want. I have considered having no labels and plotting the years with text(y=-b, x=c(2001,2002,2003), 2001:2003), -b being some value less than 0. A simpler version is preferred.

You can suppress the axes during the first plot() call and then construct them manually:

  ##---
  demo1 <- matrix(nrow=3, ncol=2, log(c(7,3,2,4,5,6))/log(7),
                  dimnames=list(as.character(2001:2003), c("Group A","Group B")))
  par(lab=c(3,6,7), las=1)
  plot(x=(2001:2003), y=demo1[,1]*100, type="l", lwd=3, ylim=c(0,100),
       xlab="", ylab="%", axes=FALSE, frame.plot=TRUE)
  axis(2)
  axis(1, at=2001:2003)
  lines(x=(2001:2003), y=demo1[,2]*100, lwd=3, xlab="Year")
  ##---

A slightly better approach (not for your problem, but for what you are trying to do) would be to use matplot instead:

  ##---
  matplot(x=2001:2003, demo1*100, type="l", lwd=3, ylim=c(0,100),
          xlab="", ylab="%", axes=FALSE, frame.plot=TRUE)
  axis(2)
  axis(1, at=2001:2003)
  ##---

> 2) For the second problem I want to use the same data but create a barchart with two bars (Group A and Group B) for 2001, the same two groups for 2002 and the same two for 2003. Group A would have blue bars and Group B red bars. Would I use barchart() or panel.barchart()? Looking in help(barchart) I find that I need to define a formula. What would the x, y and g1 be in my case?

barchart() is the trellis/lattice function for drawing barcharts; the corresponding base function is barplot. In this case, what you want should be doable with

  barplot(t(demo1), beside = TRUE)

HTH,
Deepayan
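The barplot() suggestion can be sketched as follows, reusing the demo1 matrix from the question. The colours, axis limits and legend are illustrative choices based on the poster's description (blue for Group A, red for Group B), not part of the original reply, and pdf(NULL) is only there so the sketch runs without opening a window.

```r
# Same pseudo data as in the question: rows are years, columns are groups
demo1 <- matrix(log(c(7, 3, 2, 4, 5, 6)) / log(7), nrow = 3, ncol = 2,
                dimnames = list(as.character(2001:2003),
                                c("Group A", "Group B")))

pdf(NULL)  # null device; drop this line (and dev.off) to see the plot on screen

# barplot() draws one bar per matrix cell; with beside = TRUE each *column*
# of the input becomes a cluster, so transpose to get one cluster per year
mids <- barplot(t(demo1) * 100, beside = TRUE,
                col = c("blue", "red"),      # Group A, Group B
                ylim = c(0, 100), ylab = "%",
                legend.text = colnames(demo1))

invisible(dev.off())

dim(mids)  # 2 bars per cluster, 3 clusters (one per year)
```

barplot() labels the clusters with the column names of its input, so the x axis shows exactly the three years and nothing in between.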
[R] Modifying dim attribute of elements of a list
I am creating lists of vectors within a loop. I would also like to set the dim attribute of the vectors in order to make them matrices. I have tried the following, but it doesn't work:

  > sim <- c('simMeans','simVars','simWeights')
  > indexTable <- table(modelIndex)
  > for (i in sim) {
  +   assign(tmp <- paste(i,'By',sep=''), split(get(i), modelIndex))
  +   lapply(seq(along=indexTable), function(j) attr(get(tmp)[[j]],'dim') <- c(indexTable[j],K))
  + }
  Error in FUN(X[[1]], ...) : couldn't find function "get<-"
  In addition: Warning message:
  argument lengths differ in: split(x, f)

Any suggestions will be appreciated. Thanks
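The thread as archived here contains no reply, so what follows is only one possible approach, under assumptions the post does not confirm: that each sim* object holds n*K values laid out column by column (an n-by-K matrix stored as a plain vector) and that modelIndex has one label per row. The data below are made up. The idea is to build a modified copy of each list element rather than assigning into the result of get(), which is what triggers the "get<-" error.

```r
set.seed(1)
K <- 3
modelIndex <- c(1, 2, 1, 3, 2, 1)   # hypothetical: one model label per row
n <- length(modelIndex)
simMeans <- rnorm(n * K)            # hypothetical n-by-K values as a plain vector

# Repeat the labels once per column so that x and f have equal length;
# this avoids relying on recycling inside split()
byModel <- split(simMeans, rep(modelIndex, times = K))

# Each piece holds (rows in that model) * K values, still column by column,
# so reshaping with K columns gives one matrix per model
simMeansBy <- lapply(byModel, function(v) matrix(v, ncol = K))

sapply(simMeansBy, dim)             # rows per model, and K columns each
```

The same two steps can be wrapped in a loop over the object names, storing each result with assign(paste(i, 'By', sep=''), ...) as in the original attempt.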
[R] Opening - Director Biostatistics - Cambridge MA
We are currently looking for a Director Biostatistics and Data Management in our Cambridge MA facility. We are looking for 7+ years of Statistical Analysis in a Biotech / Pharmaceutical environment with Phase 2 and Phase 3 clinical trial data. Experience with clinical protocol design, coordination of clinical database design requirements, statistical planning and preparation of regulatory submissions. Experience in management of clinical statistical contractors and CRO. Phd / MS in Bio-Statistics or Statistics, and demonstrated SAS programming skills. Biopure Corporation is a leading developer, manufacturer and supplier of a new class of pharmaceuticals, called oxygen therapeutics, which are intravenously administered to deliver oxygen to the body's tissues as a sterile alternative to red blood cell transfusion. Biopure's oxygen therapeutics possess unique attributes that address many of the medical and logistical issues associated with red blood cell transfusions. If you or any of your associates are interested in our Cambridge, MA openings please forward a resume to [EMAIL PROTECTED] or fax to 617-234-6505. Thank you for your interest in Biopure. John Rynak Biopure Corporation 617-234-6835 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] datetime data and plotting
The problem is related to time zones. The easiest way to handle this is to avoid using POSIXt and use chron instead, so you don't have to worry about them.

  require(chron)
  day <- 6:16
  dts <- dates(paste(10, day, 03, sep="/"))
  plot(dts,day)
  abline(v=dts)

--- From: Jacob Etches [EMAIL PROTECTED]
> If I take the following simple data:
>
>   YEAR MONTH DAY WEIGHT.KG
>   2003    10   6       1.2
>   2003    10  12       1.2
>   2003    10  16       1.3
>
> and format the date data and plot it:
>
>   dates <- strptime(paste(DAY,MONTH,YEAR),"%d%m%Y")
>   plot(c(min(dates),max(dates)),c(0,max(WEIGHT.KG)), xlab="Date",ylab="Weight (kg)",type="n")
>   lines(dates,WEIGHT.KG)
>   points(dates,WEIGHT.KG)
>
> I find that the data points are all plotted at (x-1,y), where x is in days. Have I requested this behaviour accidentally? I'm using R-1.8 on OS X. Printing the dates object looks correct, and simple manipulations such as max(dates)-min(dates) behave normally.
>
> Jacob Etches
> Doctoral candidate
> Dept of Public Health Sciences
> University of Toronto
Re: [R] don't display rulers in image() command and script file input
Ernie Adorio wrote:
> Dear R experts,
> 1. How can I turn off the display of rulers in the image() command?

Are rulers the same as axes? If so, try

  image(..., axes = FALSE)

See example(image) for details.

> 2. Rather than typing my commands at the command line, how can I input a file's contents other than by a copy and paste operation?

?source

Also, see the FAQ, 7.18.

Cheers
Jason
--
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
64-21-343-545
[EMAIL PROTECTED]
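Both answers can be tried in a few lines. This is a minimal sketch: the built-in volcano matrix and the temporary script file are stand-ins chosen for illustration, and pdf(NULL) merely keeps the example from opening a graphics window.

```r
## 1. image() without axes
pdf(NULL)                       # null device; omit to draw on screen
image(volcano, axes = FALSE)    # no tick marks or labels on either axis
box()                           # optional: keep a frame around the image
invisible(dev.off())

## 2. Running commands from a file instead of typing or pasting them
script <- tempfile(fileext = ".R")          # stand-in for your own script file
writeLines(c("x <- 1:10",
             "total <- sum(x)"), script)
source(script)                  # evaluates the file's commands in order
total                           # 55
```

With the default arguments, source() evaluates the file in the global environment, so objects created by the script are available at the prompt afterwards.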
Re: [R] datetime data and plotting
On Fri, Oct 17, 2003 at 01:16:00PM -0400, Gabor Grothendieck wrote:
> The problem is related to time zones. The easiest way to handle this is to avoid using POSIXt and use chron instead, so you don't have to worry about them.
>
>   require(chron)
>   day <- 6:16
>   dts <- dates(paste(10, day, 03, sep="/"))
>   plot(dts,day)
>   abline(v=dts)

I don't think I'd call that easiest. Jacob simply did not specify hour, minute and second for a display where it mattered (mostly because he only plotted 3 points; with 300 it would have been close to impossible to discern).

One way to address this would be to give an hour and minute, as in

  dates <- strptime(paste(DAY,MONTH,YEAR,"23:59"),"%d %m %Y %H:%M")

(where I also adjust the format string for the space paste() adds). The three plot commands can also be combined into

  plot(dates,WEIGHT.KG, ylim=c(0,max(WEIGHT.KG)), xlab="Date",ylab="Weight (kg)",type="o")

Hth, Dirk

> --- From: Jacob Etches [EMAIL PROTECTED]
> > If I take the following simple data:
> >
> >   YEAR MONTH DAY WEIGHT.KG
> >   2003    10   6       1.2
> >   2003    10  12       1.2
> >   2003    10  16       1.3
> >
> > and format the date data and plot it:
> >
> >   dates <- strptime(paste(DAY,MONTH,YEAR),"%d%m%Y")
> >   plot(c(min(dates),max(dates)),c(0,max(WEIGHT.KG)), xlab="Date",ylab="Weight (kg)",type="n")
> >   lines(dates,WEIGHT.KG)
> >   points(dates,WEIGHT.KG)
> >
> > I find that the data points are all plotted at (x-1,y), where x is in days. Have I requested this behaviour accidentally? I'm using R-1.8 on OS X. Printing the dates object looks correct, and simple manipulations such as max(dates)-min(dates) behave normally.
> >
> > Jacob Etches
> > Doctoral candidate
> > Dept of Public Health Sciences
> > University of Toronto

--
Those are my principles, and if you don't like them... well, I have others. -- Groucho Marx
[R] Lilliefors Test
Hello everybody,

I would like to perform a test for normality (without specifying the mean and variance) on sample data (80 observations). I found that the Lilliefors test is appropriate. Does anybody have it programmed already, or is there a function for this test in R?

Thank you very much,
Martina Pavlicova
--
Department of Statistics             Office Phone: (614) 292-1567
1958 Neil Avenue, 304E Cockins Hall  FAX: (614) 292-2096
The Ohio State University            E-mail: [EMAIL PROTECTED]
Columbus, OH 43210-1247              www.stat.ohio-state.edu/~pavlicov
Re: [R] Rd problems
On 13 Oct 2003 at 8:45, Martin Maechler wrote:
> >>>>> "kjetil" == kjetil halvorsen [EMAIL PROTECTED] on Sun, 12 Oct 2003 09:55:00 -0400 writes:
>
> kjetil> Hola! I have the following in a .Rd file:
> kjetil>   \eqn{\mbox{coef} = c(\mbox{coef}[1],\ldots, \mbox{coef}[n]) }
> kjetil>       {coef = c(coef[1], coef[2], \dots, coef[n])}
> kjetil> However, both arguments come out in the latex file! What's happening?
>
> \eqn comes in a 1-argument and a 2-argument version. If you want the 2-argument version, you cannot put spaces between the ending } of the 1st arg and the starting { of the 2nd one. Instead of the above, use
>
>   \eqn{\mbox{coef} = c(\mbox{coef}[1],\ldots, \mbox{coef}[n]) }{%
>       coef = c(coef[1], coef[2], \dots, coef[n])}
>
> (note the comment % after the opening { )

Thanks! But I didn't get this to work with the % trick; I had to put everything on one line, as Brian Ripley said. By the way, \deqn{} {} works fine.

Kjetil Halvorsen

> Martin
Re: [R] datetime data and plotting
From: Dirk Eddelbuettel [EMAIL PROTECTED]
> On Fri, Oct 17, 2003 at 01:16:00PM -0400, Gabor Grothendieck wrote:
> > The problem is related to time zones. The easiest way to handle this is to avoid using POSIXt and use chron instead, so you don't have to worry about them.
> >
> >   require(chron)
> >   day <- 6:16
> >   dts <- dates(paste(10, day, 03, sep="/"))
> >   plot(dts,day)
> >   abline(v=dts)
>
> I don't think I'd call that easiest. Jacob simply did not specify hour, minute and second for a display where it mattered (mostly because he only plotted 3 points; with 300 it would have been close to impossible to discern).
>
> One way to address this would be to give an hour and minute, as in
>
>   dates <- strptime(paste(DAY,MONTH,YEAR,"23:59"),"%d %m %Y %H:%M")
>
> (where I also adjust the format string for the space paste() adds). The three plot commands can also be combined into
>
>   plot(dates,WEIGHT.KG, ylim=c(0,max(WEIGHT.KG)), xlab="Date",ylab="Weight (kg)",type="o")

Unfortunately, that solution will not work in all time zones. For example, to get dates to line up in my time zone (Eastern Daylight Time) I would have to do this:

  day <- 6:16
  dts <- strptime(paste(day,10,2003,"20:00"),"%d %m %Y %H:%M")
  plot(dts,day)
  abline(v=as.POSIXct(dts))

It's currently daylight saving time where I am, but we are soon going to change to standard time for the winter, which will force this to change shortly, so even the above is not sufficient.

Time zones are not part of the problem, yet POSIXt forces this extraneous complication on you. chron has no time zones in the first place and therefore allows you to work in the natural frame of the problem, avoiding subtle problems like this.

This sort of thing has been discussed a number of times, and I had previously suggested that chron be moved to the base or else that a timezone-less version of POSIXt be added to the base. See:

  https://stat.ethz.ch/pipermail/r-devel/2003-August/027269.html

(I am using R 1.7.1 on Windows 2000.)
Re: [R] datetime data and plotting
I do see the described behavior, on three systems: Linux R 1.8.0, Mac OS X R 1.8.0, and Solaris R 1.7.1. Plot 1 is different from plot 2; in plot 1 the points are offset to the left of the axis tick marks.

  datet <- as.POSIXct(dates)
  ## 1
  plot(datet,WEIGHT.KG)
  ## 2
  plot(datet,WEIGHT.KG,xaxt='n')
  axis.POSIXct(1,at=datet)

To investigate a bit, I made a copy of axis.POSIXct and modified it slightly to return the value of 'at' that it calculates. I get this:

  2003-10-06 17:00:00 PDT
  2003-10-08 17:00:00 PDT
  2003-10-10 17:00:00 PDT
  2003-10-12 17:00:00 PDT
  2003-10-14 17:00:00 PDT

These are equal to midnight GMT, since my systems are currently in PDT, i.e. GMT-7.

  > version
           _
  platform powerpc-apple-darwin6.7.5
  arch     powerpc
  os       darwin6.7.5
  system   powerpc, darwin6.7.5
  status   Patched
  major    1
  minor    8.0
  year     2003
  month    10
  day      13
  language R

  > version
           _
  platform i686-pc-linux-gnu
  arch     i686
  os       linux-gnu
  system   i686, linux-gnu
  status   Patched
  major    1
  minor    8.0
  year     2003
  month    10
  day      16
  language R

  > version
           _
  platform sparc-sun-solaris2.7
  arch     sparc
  os       solaris2.7
  system   sparc, solaris2.7
  status
  major    1
  minor    7.1
  year     2003
  month    06
  day      16
  language R

-Don

At 9:21 AM -0400 10/17/03, Jacob Etches wrote:
> If I take the following simple data:
>
>   YEAR MONTH DAY WEIGHT.KG
>   2003    10   6       1.2
>   2003    10  12       1.2
>   2003    10  16       1.3
>
> and format the date data and plot it:
>
>   dates <- strptime(paste(DAY,MONTH,YEAR),"%d%m%Y")
>   plot(c(min(dates),max(dates)),c(0,max(WEIGHT.KG)), xlab="Date",ylab="Weight (kg)",type="n")
>   lines(dates,WEIGHT.KG)
>   points(dates,WEIGHT.KG)
>
> I find that the data points are all plotted at (x-1,y), where x is in days. Have I requested this behaviour accidentally? I'm using R-1.8 on OS X. Printing the dates object looks correct, and simple manipulations such as max(dates)-min(dates) behave normally.
>
> Jacob Etches
> Doctoral candidate
> Dept of Public Health Sciences
> University of Toronto

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
[R] gcc for SuSE
Hi, I wish to compile the R source on SuSE, but am unable to find the gcc for it. Can anyone send me a pointer of where they got it from. Thanks, Dipti -- /\ Dipti Kamdar \\ \ Solution Engineering \ \\ / Customer Advocacy Solutions / \/ / / / / \//\ Sun Microsystems, Inc. \//\ / / Phone:(650) 786-8907 / Internal: x88907 / / /\ /Email: [EMAIL PROTECTED] / \\ \ \ \\ \/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Rd problems
On 12 Oct 2003 at 21:45, Uwe Ligges wrote:
> [EMAIL PROTECTED] wrote:
> > I am running Rcmd check (Windows XP, rw1080 from CRAN) on a new package. This reports undocumented code objects for 14 functions, which all have their .Rd files! What might be happening?
>
> 1) You forgot to set an \alias{} (most probable)
> 2) There is another error in the Rd files
> 3) There is a bug in R (less probable)

It is 3). It was caused by one unmatched brace, but Rcmd check did not complain about unmatched braces. The first few lines of the file had the structure

  \name{aname}
  \alias{anothername}
  }   % this is the unmatched brace
  . . .

Kjetil Halvorsen

> First check points 1-2) from above; after that, repeat the complete output of Rcmd check and provide a minimal version of one of your Rd files which does not work.
>
> Uwe Ligges
>
> > Kjetil Halvorsen
Re: [R] Lilliefors Test
On 17 Oct 2003 at 13:59, Martina Pavlicova wrote:

There is shapiro.test in package ctest, which has much better power properties than the Lilliefors test. So there is no need to have the Lilliefors test in R, except for archeological interest.

Kjetil Halvorsen

> Hello everybody,
> I would like to perform a test for normality (without specifying the mean and variance) on sample data (80 observations). I found that the Lilliefors test is appropriate. Does anybody have it programmed already, or is there a function for this test in R?
> Thank you very much,
> Martina Pavlicova
> --
> Department of Statistics             Office Phone: (614) 292-1567
> 1958 Neil Avenue, 304E Cockins Hall  FAX: (614) 292-2096
> The Ohio State University            E-mail: [EMAIL PROTECTED]
> Columbus, OH 43210-1247              www.stat.ohio-state.edu/~pavlicov
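A minimal sketch of the suggested Shapiro-Wilk test on 80 observations. The data are simulated purely for illustration. In current R versions shapiro.test lives in the stats package, which is attached by default, so no library() call should be needed there; in the R 1.8 era the reply's ctest package was the place to look.

```r
set.seed(123)
x <- rnorm(80, mean = 10, sd = 2)   # simulated stand-in for the 80 observations

# Shapiro-Wilk test of normality; the mean and variance are not specified
res <- shapiro.test(x)
res$statistic    # W statistic, close to 1 for normal-looking data
res$p.value      # small values speak against normality
```

If the Lilliefors test itself is wanted, the contributed package nortest provides lillie.test. Plugging estimated parameters into ks.test() is not a substitute, because its p-value assumes the mean and variance were fixed in advance.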
Re: [R] datetime data and plotting
So someone forgot to specify the timezone, if the current one was not wanted. However, I don't see how timezones can account for a 24-hour difference as originally reported.

On Fri, 17 Oct 2003, Don MacQueen wrote:
> I do see the described behavior, on three systems: Linux R 1.8.0, Mac OS X R 1.8.0, and Solaris R 1.7.1. Plot 1 is different from plot 2; in plot 1 the points are offset to the left of the axis tick marks.
>
>   datet <- as.POSIXct(dates)
>   ## 1
>   plot(datet,WEIGHT.KG)
>   ## 2
>   plot(datet,WEIGHT.KG,xaxt='n')
>   axis.POSIXct(1,at=datet)
>
> To investigate a bit, I made a copy of axis.POSIXct and modified it slightly to return the value of 'at' that it calculates. I get this:
>
>   2003-10-06 17:00:00 PDT
>   2003-10-08 17:00:00 PDT
>   2003-10-10 17:00:00 PDT
>   2003-10-12 17:00:00 PDT
>   2003-10-14 17:00:00 PDT

Have you heard of debug()?

> These are equal to midnight GMT, since my systems are currently in PDT, i.e. GMT-7.

--
Brian D. Ripley, [EMAIL PROTECTED]
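Specifying the time zone explicitly, as hinted at above, can be sketched like this. It uses the data from the original question and assumes the dates are meant as midnight GMT, which is where the replies report the default axis ticks falling. The tz arguments are the only change from the poster's code apart from the "%d %m %Y" format with spaces.

```r
DAY   <- c(6, 12, 16)
MONTH <- c(10, 10, 10)
YEAR  <- c(2003, 2003, 2003)
WEIGHT.KG <- c(1.2, 1.2, 1.3)

# Parse as midnight GMT rather than midnight in the session's local time zone
dates <- as.POSIXct(strptime(paste(DAY, MONTH, YEAR), "%d %m %Y", tz = "GMT"))
format(dates, "%Y-%m-%d %H:%M", tz = "GMT")

# Whole days since the epoch, so the points sit exactly on GMT day boundaries
as.numeric(dates) %% 86400

pdf(NULL)  # null device; omit to draw on screen
plot(dates, WEIGHT.KG, ylim = c(0, max(WEIGHT.KG)),
     xlab = "Date", ylab = "Weight (kg)", type = "o")
invisible(dev.off())
```

With the data and the tick marks both anchored at midnight GMT, points and ticks should line up regardless of the machine's local time zone.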
Re: [R] Rd problems
On Fri, 17 Oct 2003 [EMAIL PROTECTED] wrote:
> On 12 Oct 2003 at 21:45, Uwe Ligges wrote:
> > [EMAIL PROTECTED] wrote:
> > > I am running Rcmd check (Windows XP, rw1080 from CRAN) on a new package. This reports undocumented code objects for 14 functions, which all have their .Rd files! What might be happening?
> >
> > 1) You forgot to set an \alias{} (most probable)
> > 2) There is another error in the Rd files
> > 3) There is a bug in R (less probable)
>
> It is 3). It was caused by one unmatched brace, but Rcmd check did not complain about unmatched braces. The first few lines of the file had the structure
>
>   \name{aname}
>   \alias{anothername}
>   }   % this is the unmatched brace

That's 2), not 3). Not catching _your_ errors is not a bug (and, given that .Rd does not have a parser, is inevitable).

--
Brian D. Ripley, [EMAIL PROTECTED]
Re: [R] datetime data and plotting
At Friday 02:20 PM 10/17/2003 -0400, Gabor Grothendieck wrote: [material deleted] Time zones are not part of the problem yet POSIXt forces this extraneous complication on you. chron has no time zones in the first place and therefore allows you to work in the natural frame of the problem, avoiding subtle problems like this. This sort of thing has been discussed a number of times and I had previously suggested that chron be moved to the base or else that a timezone-less version of POSIXt be added to the base. See: https://stat.ethz.ch/pipermail/r-devel/2003-August/027269.html I also see the usefulness of a time-zone-free time/date class, but why does chron need to be moved to the base to be useful here? -- Tony Plate __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] datetime data and plotting
From: Tony Plate [EMAIL PROTECTED] I also see the usefulness of a time-zone-free time/date class, but why does chron need to be moved to the base to be useful here? Because other software makes use of times in the base. Package writers figure that what is in the base is the most available and used so that is what they use. Thus classes in the base get propagated throughout the libraries too. ___ No banners. No pop-ups. No kidding. Introducing My Way - http://www.myway.com __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] gcc for SuSE
Dipti Kamdar wrote: Hi, I wish to compile the R source on SuSE, but am unable to find the gcc for it. Can anyone send me a pointer of where they got it from. Thanks, Dipti Why do you think installing gcc is a topic related to R-help? You can install gcc from rpms that are on the SuSE CD/DVD. You can use YAST to look for rpms. Uwe Ligges __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] R memory and CPU requirements
Thanks for all the responses.

After re-examining my data I came to realize that second-order interactions would be enough in my particular case. With second-order interactions I managed to fit a model with less than 512MB RAM.

Thanks to everybody.

--- John Fox [EMAIL PROTECTED] wrote:
> Dear Alexander,
>
> At 01:29 AM 10/17/2003 -0700, Alexander Sirotkin \[at Yahoo\] wrote:
> > I agree completely. In fact, I have about 5000 observations, which should be enough. I was using 200 samples because of RAM limitations, and I'm afraid to think about what amount of RAM I'll need to fit an aov() for such data.
>
> OK -- I didn't realize that you have 5000 observations. Perhaps I didn't read some of the earlier messages carefully enough. At the risk of getting you to repeat information that you've already provided, how many degrees of freedom are there in the model that you're trying to fit? I can create a 5000 by 5000 model matrix on my relatively anemic Windows machine, and surely (unless there's some specification error) your model should have many fewer df than that if it includes just the main effects and two-way interactions (or by all interactions, do you mean higher-order interactions as well?).
>
> Perhaps providing the following information would help: What is the model formula? Which variables are factors? How many levels does each factor have?
>
> Regards, John
>
> > --- John Fox [EMAIL PROTECTED] wrote:
> > > Dear Alexander,
> > > If I understand you correctly, you have a sample of 200 observations. Even if you had only two factors with 40 levels each, the main effects and interactions of these factors would require about 1600 degrees of freedom -- that is, more than the number of observations. This doesn't make a whole lot of sense.
> > > I hope that this helps, John
> > >
> > > At 05:03 PM 10/16/2003 -0700, Alexander Sirotkin \[at Yahoo\] wrote:
> > > > --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
> > > > > On Thursday 16 October 2003 17:59, Alexander Sirotkin \[at Yahoo\] wrote:
> > > > > > Thanks for all the help on my previous questions. One more (hopefully last one): I've been very surprised when I tried to fit a model (using aov()) for a sample of size 200 and 10 variables and their interactions.
> > > > >
> > > > > That doesn't really say much. How many of these variables are factors? How many levels do they have? And what is the order of the interaction? (Note that for 10 numeric variables, if you allow all interactions, then there will be 100 terms in your model. This increases for factors.) In other words, how big is your model matrix? (See ?model.matrix)
> > > > > Deepayan
> > > >
> > > > I see... Unfortunately, model.matrix() ran out of memory :) I have 10 variables, 6 of which are factors, 2 of which have quite a lot of levels (about 40). And I would like to allow all interactions. I understand your point about categorical variables, but still, this does not seem like too much data to me. I remember fitting all kinds of models (mostly decision trees) for much, much larger data sets.
>
> -
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada L8S 4M4
> email: [EMAIL PROTECTED]
> phone: 905-525-9140x23604
> web: www.socsci.mcmaster.ca/jfox
> -
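The "about 1600 degrees of freedom" figure for two 40-level factors can be checked directly with model.matrix(), as suggested in the thread. This is a sketch on made-up data: the factor names a and b are hypothetical, and a fully crossed design is used only so that every level combination occurs.

```r
# Two hypothetical factors with 40 levels each, fully crossed (1600 rows)
d <- expand.grid(a = factor(1:40), b = factor(1:40))

# Main effects only: intercept + 39 + 39 columns
X_main <- model.matrix(~ a + b, data = d)
ncol(X_main)    # 79

# Main effects plus the two-way interaction: 79 + 39 * 39 columns
X_int <- model.matrix(~ a * b, data = d)
ncol(X_int)     # 1600

# Dense storage grows with rows * columns; this one is already about 20 MB
print(object.size(X_int), units = "MB")
```

Every further factor crossed into the interaction multiplies the column count again, which is why allowing all interactions among ten variables, two of them with about 40 levels, exhausts memory long before the number of observations looks large.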
[R] Behavior of [[ in S vs. R
I am confused by the following difference in the behavior of R and S. Any clarification would be greatly appreciated.

Jan Erik Backlund
Dow AgroSciences, LLC.
[EMAIL PROTECTED]

R : Copyright 2003, The R Development Core Team
Version 1.8.0 (2003-10-08)

> sw
             Fertility Agriculture Examination Education Catholic
Courtelary        80.2        17.0          15        12     9.96
Delemont          83.1        45.1           6         9    84.84
Franches-Mnt      92.5        39.7           5         5    93.40
Moutier           85.8        36.5          12         7    33.77
Neuveville        76.9        43.5          17        15     5.16
> sw[2,1]
[1] 83.1
> sw[[2,1]]
[1] 17

and the corresponding behaviour in S

S-PLUS : Copyright (c) 1988, 2001 Insightful Corp.
S : Copyright Lucent Technologies, Inc.
Professional Edition Version 6.0.3 Release 2 for Microsoft Windows : 2001

> sw
             Fertility Agriculture Examination Education Catholic
Courtelary        80.2        17.0          15        12     9.96
Delemont          83.1        45.1           6         9    84.84
Franches-Mnt      92.5        39.7           5         5    93.40
Moutier           85.8        36.5          12         7    33.77
Neuveville        76.9        43.5          17        15     5.16
> sw[2,1]
[1] 83.1
> sw[[2,1]]
[1] 83.1
Re: [R] Behavior of [[ in S vs. R
On Fri, 17 Oct 2003, Backlund, Jan Erik (JE) wrote:

I am confused by the following difference in the behavior of R and S. Any clarification would be greatly appreciated.

sw[[2,1]] in R is short for sw[[2]][[1]], which in the case of a data frame is sw[1,2], as your example shows.

-thomas

Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] Behavior of [[ in S vs. R
See ?[[

[[ operates recursively, so sw[[2,1]] is the same as sw[[2]][[1]].

Giovanni

Date: Fri, 17 Oct 2003 12:30:07 -0400
From: Backlund, Jan Erik (JE) [EMAIL PROTECTED]

I am confused by the following difference in the behavior of R and S. Any clarification would be greatly appreciated.

--
Giovanni Petris [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/
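The recursive reading described in the replies can be tried out on a stand-in for the poster's data. This is a sketch only: it assumes sw is the first five rows and columns of the built-in swiss data set, which matches the printed values but is not confirmed by the thread. It uses the vector form sw[[c(2, 1)]] for recursive indexing; the two-argument form sw[[2, 1]] is deliberately left out, because what it returns depends on the R version in use.

```r
# Assumption: the poster's 'sw' is swiss[1:5, 1:5] from the datasets package.
sw <- swiss[1:5, 1:5]

# Matrix-style indexing: row 2, column 1 (Delemont, Fertility).
by_row_col <- sw[2, 1]

# List-style indexing: second column, then its first element
# (Agriculture for Courtelary).
by_steps <- sw[[2]][[1]]

# Recursive indexing with a vector index is the same two steps in one call.
by_vector <- sw[[c(2, 1)]]

print(c(by_row_col = by_row_col, by_steps = by_steps, by_vector = by_vector))
```

Spelling out the intent as sw[2, 1] or sw[[2]][[1]] avoids relying on how a particular version of R or S-PLUS treats two indices inside [[.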
[R] [R-pkgs] Updated package: g.data v1.4
Version 1.4 of package g.data is available on CRAN. This upgrade is necessary for it to work under R-1.8.0, and is fully backward compatible.

Description: Create and maintain delayed-data packages (DDP's). Data stored in a DDP are available on demand, but do not take up memory until requested. You attach a DDP with g.data.attach(), then read from it and assign to it in a manner similar to S-Plus, except that you must run g.data.save() to actually commit to disk.

Thanks very much to [EMAIL PROTECTED] for pointing out the incompatibility. (Sorry, Brian, a direct reply to you bounced.) g.data basically creates mock packages (DDP's) to contain the data, and in R-1.8.0 a package needs a DESCRIPTION file to be recognized by .find.package(). Note you will need temporary write access to any existing (pre-1.4) DDP's, as g.data.attach() will try to create a DESCRIPTION file for any DDP that doesn't already have one.

-- David Brahm ([EMAIL PROTECTED])

R-packages mailing list [EMAIL PROTECTED] https://www.stat.math.ethz.ch/mailman/listinfo/r-packages
Re: [R] datetime data and plotting
Yes. The timezone is not the whole problem. What one would really like is that plot understands that it is being given daily data and acts accordingly, in the same way that plot already understands that it's being given a factor object or a dendrogram, etc. and produces the right plot. The OO way would be that the data describes itself, telling plot what it is being given so that plot can make the right choice.

---
Date: Fri, 17 Oct 2003 20:47:17 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: Don MacQueen [EMAIL PROTECTED]
Cc: Jacob Etches [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: [R] datetime data and plotting

So someone forgot to specify the timezone, if the current one was not wanted. However, I don't see how timezones can account for a 24-hour difference as originally reported.

On Fri, 17 Oct 2003, Don MacQueen wrote:

I do see the described behavior, on three systems, linux R 1.8.0, Mac OS X R 1.8.0, and Solaris R 1.7.1. Plot 1 is different than plot 2; in plot 1 the points are offset to the left of the axis tick marks.

  datet <- as.POSIXct(dates)
  ## 1
  plot(datet,WEIGHT.KG)
  ## 2
  plot(datet,WEIGHT.KG,xaxt='n')
  axis.POSIXct(1,at=datet)

To investigate a bit, I made a copy of axis.POSIXct and modified it slightly to return the value of at that it calculates. I get this:

  2003-10-06 17:00:00 PDT
  2003-10-08 17:00:00 PDT
  2003-10-10 17:00:00 PDT
  2003-10-12 17:00:00 PDT
  2003-10-14 17:00:00 PDT

Have you heard of debug()?

These are equal to midnight GMT, since my systems are currently in PDT, i.e. GMT-7.

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
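The observation that the tick positions are midnight GMT shown in PDT can be reproduced without plotting. The sketch below uses hypothetical dates chosen to mirror the tick values quoted in the thread, and it assumes the "America/Los_Angeles" zone name is available on the system; it is an illustration of the offset, not the code from axis.POSIXct.

```r
# Hypothetical tick dates, created as midnight GMT.
ticks_gmt <- as.POSIXct(c("2003-10-07", "2003-10-09"), tz = "GMT")

# The same instants displayed in Pacific time fall on the previous evening.
ticks_local <- format(ticks_gmt, "%Y-%m-%d %H:%M:%S",
                      tz = "America/Los_Angeles", usetz = TRUE)
print(ticks_local)

# Specifying the intended zone when creating the times gives local midnight.
local_midnight <- as.POSIXct("2003-10-07", tz = "America/Los_Angeles")
offset_hours <- as.numeric(difftime(local_midnight, ticks_gmt[1],
                                    units = "hours"))
print(offset_hours)
```

The seven-hour gap is what shifts the points relative to the tick marks; passing an explicit tz when converting the dates keeps the data and the axis in the same zone.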
[R] nlm, hessian, and derivatives in obj function?
I've been working on a new package and I have a few questions regarding the behaviour of the nlm function. I've been (for better or worse) using the nlm function to fit a linear model without supplying the hessian or gradient attributes in the objective function. I'm curious as to why nlm requires 31 iterations (for the linear model), and why it doesn't work when I try to add the derivative information. I know using nlm for a linear model isn't the optimal method, but I would like to make sure the parameter estimates and the se's are matching before I attempt more difficult problems.

  rm(list=ls(all=TRUE))
  print( "running nlsystemfit models test at end..." )
  data( kmenta )
  attach( kmenta )
  ##demand2 <- q ~ d0 + d1 * p + d2 * d
  supply2 <- q ~ s0 + s1 * p + s2 * f + s3 * a
  ##system2 <- list( demand2, supply2 )
  ##labels <- list( "Demand", "Supply" )
  ##inst <- ~ d + f + a
  ##sv2 <- c(d0=3,s2=2.123,d2=4,s0=-2.123,s3=4.234,d1=4.234,s1=0.234)
  sv2 <- c(s0=-2.123,s1=0.234,s2=2.123,s3=4.234)

  obj <- function( s, eqn, data, parmnames ) {
    ## get the values of the parameters
    for( i in 1:length( parmnames ) ) {
      name <- names( parmnames )[i]
      val <- s[i]
      storage.mode( val ) <- "double"
      assign( name, val )
    }
    lhs <- as.matrix( eval( as.formula( eqn )[[2]] ) )
    rhs <- as.matrix( eval( as.formula( eqn )[[3]] ) )
    resid <- crossprod( lhs - rhs )
    ## just how does this work...
    attr( obj, "value" ) <- resid
    attr( obj, "gradient" ) <- attr( eval( deriv3( eqn, names( parmnames ) ) ), "gradient" )
  }

  res <- nlm( obj, sv2, hessian=T, eqn=supply2, data=kmenta, parmnames=sv2, check.analyticals=T)

I haven't been able to get nlm to function as I keep getting the following error message:

  Error in nlm(obj, sv2, hessian = T, eqn = supply2, data = kmenta, parmnames = sv2, :
    invalid function value in 'nlm' optimizer

If I perform the fit without the derivative information, I get the correct estimates,

  $minimum
  [1] 92.55106

  $estimate
  [1] 58.2754312 0.1603666 0.2481333 0.2483023

  $gradient
  [1] 8.552542e-08 9.087699e-06 5.716032e-06 2.163105e-06

  $hessian
           [,1]       [,2]     [,3]     [,4]
  [1,]   40.000   4000.762   3865.0   420.00
  [2,] 4000.762 401486.918 386045.8 42007.76
  [3,] 3865.000 386045.812 379593.1 39762.40
  [4,]  420.000  42007.764  39762.4  5740.00

  $code
  [1] 1

  $iterations
  [1] 31

I was under the impression that you could also obtain the se of the parameter estimates using sqrt( diag( res$hessian ) ), but I haven't been able to reproduce the se computed from the Jacobian:

  se <- sqrt( mse * diag( solve( crossprod( J ) ) ) )  # gives the correct results...
  hse <- sqrt( ( res$minimum / 8 ) * diag( solve( res$hessian ) ) )  # gives similar results, but why 8?

I've tried to include the jacobian and hessian in the objective function for nlm, without success, as I don't know what the form of the functions will be ahead of time. I also tried to get the se from sqrt( diag( hessian ) ), but it's nowhere close.

Jeff.

--- Jeff D. Hamann Hamann, Donald and Associates, Inc. PO Box 1421 Corvallis, Oregon USA 97339-1421 (office) 541-754-1428 (cell) 541-740-5988 [EMAIL PROTECTED] www.hamanndonald.com
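One possible reading of the "why 8?" question, offered as a sketch rather than a definitive answer: for a residual sum of squares S(b) = sum((y - X b)^2), the Hessian is 2 X'X, so the usual covariance (S / df) (X'X)^-1 equals (2 S / df) H^-1. If the kmenta fit has 20 observations and 4 parameters (an assumption, though the 40.000 in the top-left Hessian entry is consistent with 2 * 20), then df = 16 and 2 / 16 gives the division by 8. The example below uses simulated data and hypothetical names (x1, x2, sse) and also shows the form in which nlm() expects an analytic gradient: as an attribute of the value the function returns.

```r
set.seed(1)
n <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)
X <- cbind(1, x1, x2)

# Residual sum of squares; the analytic gradient is attached to the
# returned value, not to the function object.
sse <- function(b) {
  r <- drop(y - X %*% b)
  val <- sum(r^2)
  attr(val, "gradient") <- drop(-2 * crossprod(X, r))
  val
}

fit <- nlm(sse, p = c(0, 0, 0), hessian = TRUE)
df <- n - ncol(X)

# Standard errors from the exact Hessian 2 X'X and from nlm's estimate of it.
se_exact <- sqrt(2 * fit$minimum / df * diag(solve(2 * crossprod(X))))
se_nlm <- sqrt(2 * fit$minimum / df * diag(solve(fit$hessian)))

# Reference values from lm().
se_lm <- summary(lm(y ~ x1 + x2))$coefficients[, "Std. Error"]
print(rbind(se_exact, se_nlm, se_lm))
```

Under this reading, sqrt(diag(hessian)) alone cannot match the standard errors, because the Hessian has to be inverted and scaled by twice the residual variance estimate.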
[R] cor function in R 1.8.0
Dear R users,

Does anyone know why the following two ways to calculate correlation variance give different answers? I also obtain different answers when I use, say, the spearman method in cor(). The problem does not happen in R 1.7.1 (pearson correlation only, of course, in R 1.7.1).

  set.seed(1234)
  x <- matrix(rnorm(10*5),10,5)
  y1 <- cor(x)
  y2 <- cor(x, use="pair")
  y1;y2

             [,1]        [,2]       [,3]        [,4]        [,5]
  [1,]  1.0000000 -0.17528322 -0.5528785 -0.33876389 -0.49755947
  [2,] -0.1752832  1.00000000 -0.2776360 -0.04840035  0.05265522
  [3,] -0.5528785 -0.27763602  1.0000000  0.16272829  0.38392034
  [4,] -0.3387639 -0.04840035  0.1627283  1.00000000  0.85404798
  [5,] -0.4975595  0.05265522  0.3839203  0.85404798  1.00000000

             [,1]        [,2]       [,3]        [,4]        [,5]
  [1,]  1.0000000 -0.17348156 -0.5523156 -0.33585411 -0.48292994
  [2,] -0.1734816  0.99965819 -0.2743654 -0.04417098  0.05661364
  [3,] -0.5523156 -0.27436539  0.9990913  0.16439438  0.38457068
  [4,] -0.3358541 -0.04417098  0.1643944  0.99862845  0.85389126
  [5,] -0.4829299  0.05661364  0.3845707  0.85389126  0.99985356

Thanks,
Ming-Chung Li
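A quick way to compare the two results numerically is sketched below. It spells out the full option name "pairwise.complete.obs" instead of relying on partial matching of "pair". With no missing values both calls describe the same quantity, so they would be expected to agree up to rounding; that expectation is an assumption about a correctly working cor(), and the values printed in the message above suggest it did not hold in the poster's R 1.8.0 session.

```r
set.seed(1234)
x <- matrix(rnorm(10 * 5), 10, 5)

y1 <- cor(x)
y2 <- cor(x, use = "pairwise.complete.obs")

# Largest absolute discrepancy between the two correlation matrices.
max_diff <- max(abs(y1 - y2))
agree <- isTRUE(all.equal(y1, y2))

print(max_diff)
print(agree)

# Diagonal entries of a correlation matrix should all equal 1.
print(diag(y2))
```

A diagonal that departs from 1, as in the second matrix quoted in the message, is a sign of a computational problem and not of a legitimate difference between the two options.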