[R] Math symbols for labels in Perspective plots
Dear Harsh: I stole these lines from the Persp examples and the plotmath examples. x - seq(-10, 10, length= 30) y - x f - function(x,y) { r - sqrt(x^2+y^2); 10 * sin(r)/r } z - outer(x, y, f) z[is.na(z)] - 1 op - par(bg = white) persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = lightblue) persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = lightblue, +ltheta = 120, shade = 0.75, ticktype = detailed, +xlab = X, ylab = Y, zlab = Sinc( r ) + ) - res ?plotmath title( expression(paste(plain(sin) * phi, and , + plain(cos) * phi))) Hope this helps! By the way, this is from Windows. Sincerely, Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: [EMAIL PROTECTED] Subject: [R] Math symbols for labels in Perspective plots Hi .. I would like to have math symbols in perspective plots i tried : persp(x,y,z,xlab=expression(phi)) but it plots it as phi. Thanks. Harsh __ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Regex engine types
?regex does describe this: A range of characters may be specified by giving the first and last characters, separated by a hyphen. (Character ranges are interpreted in the collation order of the current locale.) You did not tell us your locale, but based on questions from you in the past I would guess en_NZ.utf8. In that locale the collation order is wWxXyYzZ, so your surprise is explained. (It seems the PCRE code is not using the same ordering in that locale.) You may find it useful to set LC_COLLATE to C as I do: strsplit(Sys.getlocale(), ;) [[1]] [1] LC_CTYPE=en_GB LC_NUMERIC=C LC_TIME=en_GB [4] LC_COLLATE=C LC_MONETARY=en_GBLC_MESSAGES=en_GB [7] LC_PAPER=en_GB LC_NAME=CLC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C On Sat, 10 Jun 2006, Patrick Connolly wrote: version _ platform x86_64-unknown-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major2 minor2.1 year 2005 month12 day 20 svn rev 36812 language R grep([W-Z], LETTERS, value = TRUE) [1] W X Y Z That's what I'd have expected. grep([W-Z], letters, value = TRUE) [1] x y z Not what I'd have thought. However, grep([B-D], letters, value = TRUE, perl = TRUE) character(0) So what is it that standard regular expressions use that's different from Perl-type ones? The help file for grep refers to POSIX 1003.2 which looked a bit daunting to delve into. From my limited reading, it seems there are different gegex Engine Types which seems to be getting somewhat tangential to what I was working on. I could probably avoid problems if I always set perl=TRUE, but it would be good to know what basic and extended regular expressions do that's different. If someone has a quick line or two describing it, I'd be interested to know. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Maximum likelihood estimation of Regression parameters
Hi, I want to use Maximum likelihood to estimate the parameters from my regression line. I have purchased the book Applied linear statistical models from Neter, Kutner, nachtsheim Wasserman, and in one of the first chapters, they use maximum likelihood to estimate the parameters. Now I want to tried it for my self, but couldn't find the right function. In the book, they give a fixed variance to work with, but I couldn't find a function where I can estimate the predictor and where I have to give the variance. Or isn't this neccesairy? Also they calculate likelihood values for the different values, used to estimate the parameters (like a normal probability curve), is it possible to do this with R? Kind regards Bart [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Maximum likelihood estimation of Regression parameters
mle(stats4)Maximum Likelihood Estimation is it list above what you want? On 6/10/06, Bart Joosen [EMAIL PROTECTED] wrote: Hi, I want to use Maximum likelihood to estimate the parameters from my regression line. I have purchased the book Applied linear statistical models from Neter, Kutner, nachtsheim Wasserman, and in one of the first chapters, they use maximum likelihood to estimate the parameters. Now I want to tried it for my self, but couldn't find the right function. In the book, they give a fixed variance to work with, but I couldn't find a function where I can estimate the predictor and where I have to give the variance. Or isn't this neccesairy? Also they calculate likelihood values for the different values, used to estimate the parameters (like a normal probability curve), is it possible to do this with R? Kind regards Bart [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 此致 敬礼! 华孝挺 Kenneth Hua 浙江大学核农所 杭州,中国 310029 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Regex engine types
I get the same result in a US collate ordering: strsplit(Sys.getlocale(), ;) [[1]] [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 grep([W-Z], letters, value = TRUE) [1] x y z R.version.string # Windows XP [1] Version 2.3.1 Patched (2006-06-04 r38279) On 6/10/06, Prof Brian Ripley [EMAIL PROTECTED] wrote: ?regex does describe this: A range of characters may be specified by giving the first and last characters, separated by a hyphen. (Character ranges are interpreted in the collation order of the current locale.) You did not tell us your locale, but based on questions from you in the past I would guess en_NZ.utf8. In that locale the collation order is wWxXyYzZ, so your surprise is explained. (It seems the PCRE code is not using the same ordering in that locale.) You may find it useful to set LC_COLLATE to C as I do: strsplit(Sys.getlocale(), ;) [[1]] [1] LC_CTYPE=en_GB LC_NUMERIC=C LC_TIME=en_GB [4] LC_COLLATE=C LC_MONETARY=en_GBLC_MESSAGES=en_GB [7] LC_PAPER=en_GB LC_NAME=CLC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C On Sat, 10 Jun 2006, Patrick Connolly wrote: version _ platform x86_64-unknown-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major2 minor2.1 year 2005 month12 day 20 svn rev 36812 language R grep([W-Z], LETTERS, value = TRUE) [1] W X Y Z That's what I'd have expected. grep([W-Z], letters, value = TRUE) [1] x y z Not what I'd have thought. However, grep([B-D], letters, value = TRUE, perl = TRUE) character(0) So what is it that standard regular expressions use that's different from Perl-type ones? The help file for grep refers to POSIX 1003.2 which looked a bit daunting to delve into. From my limited reading, it seems there are different gegex Engine Types which seems to be getting somewhat tangential to what I was working on. I could probably avoid problems if I always set perl=TRUE, but it would be good to know what basic and extended regular expressions do that's different. If someone has a quick line or two describing it, I'd be interested to know. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Mod function?
John, The advice in the posting guide: -- Do your homework before posting: [...] * Do help.search(keyword) with different keywords (type this at the R prompt) -- seems to work. Using help.search(mod) follow the first hit: ?+ Chuck On Fri, 9 Jun 2006, Kerpel, John wrote: Hi Folks! I need to execute a piece of R code inside a loop, say, every fourth time my counter increases (it increases by 1 unit each time.) Is there any sort of mod function in R to do this? Or is this done some other way? Many thanks! Best, john [[alternative HTML version deleted]] [ Part 3.30: Included Message ] Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] random generation for a quasi distribution
Dear R-Help, As with the rpois() function to generate random data for a poisson distribution, I need to generate random data for a quasi distribution with var=mu^2. Does anyone known how to do this? Thanks in advance, Hugues SANTIN-JANIN. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Density Estimation
On Thu, Jun 08, 2006 at 08:31:26PM +0200, Pedro Ramirez wrote: In mathematical terms the optimal bandwith for density estimation decreases at rate n^{-1/5}, while the one for distribution function decreases at rate n^{-1/3}, if n is the sample size. In practical terms, one must choose an appreciably smaller bandwidth in the second case than in the first one. Thanks a lot for your remark! I was not aware of the fact that the optimal bandwidths for density and distribution do not decrease at the same rate. Besides the computational aspect, there is a statistical one: the optimal choice of bandwidth for estimating the density function is not optimal (and possibly not even jsut sensible) for estimating the distribution function, and the stated problem is equivalent to estimation of the distribution function. The given interval 0x3 was only an example, in fact I would like to estimate the probability for intervals such as 0=x1 , 1=x2 , 2=x3 , 3=x4 , and compare it with the estimates of a corresponding histogram. In this case the stated problem is not anymore equivalent to the estimation of the distribution function. What do you think, can why not? the probabilities you are interested in are of the form F(1)-F(0), F(2)-F(1), and so on where F(.) if the cumulative distribution function (and it must be continuous, since its derivative exists). I go a ahead in this case with the optimal bandwidth for the density? Thanks a lot for your help! no best wishes, Adelchi Best wishes Pedro best wishes, Adelchi PR PR PR -- PR Gregory (Greg) L. Snow Ph.D. PR Statistical Data Center PR Intermountain Healthcare PR [EMAIL PROTECTED] PR (801) 408-8111 PR PR PR -Original Message- PR From: [EMAIL PROTECTED] PR [mailto:[EMAIL PROTECTED] On Behalf Of Pedro PR Ramirez Sent: Wednesday, June 07, 2006 11:00 AM PR To: r-help@stat.math.ethz.ch PR Subject: [R] Density Estimation PR PR Dear R-list, PR PR I have made a simple kernel density estimation by PR PR x - c(2,1,3,2,3,0,4,5,10,11,12,11,10) PR kde - density(x,n=100) PR PR Now I would like to know the estimated probability that a new PR observation falls into the interval 0x3. PR PR How can I integrate over the corresponding interval? PR In several R-packages for kernel density estimation I did not PR found a corresponding function. I could apply Simpson's Rule for PR integrating, but perhaps somebody knows a better solution. PR PR Thanks a lot for help! PR PR Pedro PR PR _ PR PR __ PR R-help@stat.math.ethz.ch mailing list PR https://stat.ethz.ch/mailman/listinfo/r-help PR PLEASE do read the posting guide! PR http://www.R-project.org/posting-guide.html PR PR PR __ PR R-help@stat.math.ethz.ch mailing list PR https://stat.ethz.ch/mailman/listinfo/r-help PR PLEASE do read the posting guide! PR http://www.R-project.org/posting-guide.html PR _ Don't just search. Find. Check out the new MSN Search! http://search.msn.com/ -- Adelchi Azzalini [EMAIL PROTECTED] Dipart.Scienze Statistiche, Università di Padova, Italia tel. +39 049 8274147, http://azzalini.stat.unipd.it/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] nested mixed-effect model: variance components
I have seen no reply to this, so I will offer a couple of comments in spite of the fact that I know very little about aov other than it is old and has largely been superceded by lme in the nlme package. I've replied to many posts on random and mixed effects over the past few years, and I only remember one case where 'aov' returned an answer that could not have been obtained easily from 'lme'. That one application involved estimating a saturated model from perfectly balanced experimental data. 'lme' refused to return an answer, at least as I specified the model, and 'aov' returned numbers from which anyone familiar with the theory for that special case could compute the desired F ratios and p-values. In all other cases that I remember, 'aov' would either return the same answer, or much more commonly, a substantively inferior answer -- if it were feasible to use 'aov' at all. I mention this, because the simplest way I can think of to check your answer is to suggest you try to work it using 'lme'. To learn how to do this, I strongly encourage you to consult Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for many years one of the leading contributors in this area and is the primary author of the 'nlme', 'lme4' and 'Matrix packages to support working with these kinds of models. This book discusses both how to fit these kinds of models as well as how to produce plots of various kinds that can be very helpful for explaining the experimental results to others as well as diagnosing potential problem. Moreover, the R scripts discussed in the book are available in files called ch01.R, ch02.R, ..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the directory in which R is installed on your hard drive. This makes it much easier to work through the examples in the book one line at a time, experimenting at with modifications. In addition, there is one point I know where the R syntax differs from that in the book: S-Plus will accept x^2 in a formula as meaning the square of a numeric variable; R will not. To specify a square in R, you need something like I(x^2). When I copied the commands out of the book, I had serious trouble getting answers like those in the book until I identified and corrected this difference in syntax. If you use the script files, they provide the R syntax. I'm not certain, but I believe the following should estimate the model you wrote below: fit - lme(fixed=COVER ~ HABITAT, random = ~1|LAGOON/HABITAT, data=cov). For comparison, I refer you to Pinheiro and Bates, p. 47, where I find the following: fm1Oats - lme( yield ~ ordered(nitro) * Variety, data = Oats, random = ~ 1 | Block/Variety ) There are 3 Varieties and 6 Blocks in this Oats data.frame. The fixed effect of Variety has therefore 2 degrees of freedom. However, the random effect of Variety occurs in 18 levels, being all the 6*3 Block:Variety combinations. You can see this by examining 'str(fm1Oats)'. If you want to know how much variation is due to lagoons? habitats? lagoons*habitat? transects?, this model will NOT tell you that, and I don't know how to answer that question using 'lme'. I was able to answer a question like that using 'lmer' associated with the 'lme4' and 'Matrix' packages. Unfortunately, these packages have some names conflicts with 'nlme', and the simplest way I know to change from one to the other is to q() and restart R Before I did, however, I made a local copy of the Oats data.frame. After I did that, I seemed to get sensible numbers from the following: library(lme4) fm1Oats4 - lmer(yield~ ordered(nitro) * Variety +(1|Block)+(1|Variety)+(1|Block:Variety), data=Oats) For both lme and lmer, the default method is REML (restricted maximum likelihood). This is what you want for estimation. For testing random effects, you still want REML, but you should adjust the degrees of freedom of the reference F distribution as discussed in section 2.4 of Pinheiro and Bates; this also applies to confidence intervals for the random effects. For testing fixed effects, you should use method = 'ML'. lme4 is newer than nlme and does not currently have available the complete set of helper functions for plotting, etc. Thus, you will likely want to use both. For documentation on 'lmer', you should still start with Pinheiro and Bates for the general theory, then refer to the vignettes associated with the mlmRev and lme4 packages; if you don't know about vignettes RSiteSearch(graves vignette) will lead you to http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html; and other replies to r-help where I've described how to use them. You will also want the relevant article by Doug Bates in R News. To find that, go www.r-project.org -
[R] sparse matrix, rnorm, malloc
Hi, I'm Sorry for any cross-posting. I've reviewed the archives and could not find an exact answer to my question below. I'm trying to generate very large sparse matrices ( 1% non-zero entries per row). I have a sparse matrix function below which works well until the row/col count exceeds 10,000. This is being run on a machine with 32G memory: sparse_matrix - function(dims,rnd,p) { ptm - proc.time() x - round(rnorm(dims*dims),rnd) x[((abs(x) - p) 0)] - 0 y - matrix(x,nrow=dims,ncol=dims) proc.time() - ptm } When trying to generate the matrix around 20,000 rows/cols on a machine with 32G of memory, the error message I receive is: R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug Error: cannot allocate vector of size 3125000 Kb Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: on machine w/32G memory, why can't it allocate a vector of size 3125000 Kb? When trying to generate the matrix around 30,000 rows/cols, the error message I receive is: Error in rnorm(dims * dims) : cannot allocate vector of length 9 Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: is this 9 bytes? kilobytes? This error seems to be specific now to rnorm, but it doesn't indicate the length metric (b/Kb/Mb) as it did for 20,000 rows/cols. Even if this Mb, why can't this be allocated on a machine with 32G free memory? When trying to generate the matrix with over 50,000 rows/cols, the error message I receive is: Error in rnorm(n, mean, sd) : invalid arguments In addition: Warning message: NAs introduced by coercion Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Same. Why would it generate different errors in each case? Code fixes? Any simple ways to generate sparse matrices which would avoid above problems? Thanks in advance, Gavin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sparse matrix, rnorm, malloc
You need to look at the packages specifically designed for sparse matrices: SparseM and Matrix. url:www.econ.uiuc.edu/~rogerRoger Koenker email [EMAIL PROTECTED] Department of Economics vox:217-333-4558University of Illinois fax:217-244-6678Champaign, IL 61820 On Jun 10, 2006, at 12:53 PM, g l wrote: Hi, I'm Sorry for any cross-posting. I've reviewed the archives and could not find an exact answer to my question below. I'm trying to generate very large sparse matrices ( 1% non-zero entries per row). I have a sparse matrix function below which works well until the row/col count exceeds 10,000. This is being run on a machine with 32G memory: sparse_matrix - function(dims,rnd,p) { ptm - proc.time() x - round(rnorm(dims*dims),rnd) x[((abs(x) - p) 0)] - 0 y - matrix(x,nrow=dims,ncol=dims) proc.time() - ptm } When trying to generate the matrix around 20,000 rows/cols on a machine with 32G of memory, the error message I receive is: R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug Error: cannot allocate vector of size 3125000 Kb Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: on machine w/32G memory, why can't it allocate a vector of size 3125000 Kb? When trying to generate the matrix around 30,000 rows/cols, the error message I receive is: Error in rnorm(dims * dims) : cannot allocate vector of length 9 Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: is this 9 bytes? kilobytes? This error seems to be specific now to rnorm, but it doesn't indicate the length metric (b/Kb/Mb) as it did for 20,000 rows/cols. Even if this Mb, why can't this be allocated on a machine with 32G free memory? When trying to generate the matrix with over 50,000 rows/cols, the error message I receive is: Error in rnorm(n, mean, sd) : invalid arguments In addition: Warning message: NAs introduced by coercion Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Same. Why would it generate different errors in each case? Code fixes? Any simple ways to generate sparse matrices which would avoid above problems? Thanks in advance, Gavin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] nested mixed-effect model: variance components
see inline Eric Pante wrote: Hi Spencer, First, thank you very much for taking the time to write this detailed reply ! I did try exactly the formula you suggested: fit - lme(fixed=COVER ~ HABITAT, random = ~1|LAGOON/HABITAT, data=cov) before writing my post, and indeed, neither summary() or anova() returned the sums of squares to assess variance components. Do you want sums of squares or estimates of variance components? I believe that for over half a century, there is been a substantial consensus among leading experts in statistical methods that maximum likelihood (or Bayesian posterior where an adequate prior exists) is theoretically the best method for most statistical questions. With mixed effects, this is interpreted as requiring the maximization of the marginal likelihood (often after integrating out fixed effects to obtain the restricted likelihood maximized with method = REML). However, until the last couple of decades, ML or REML was not computationally feasible for many cases lacking balance. This led to an extensive literature on alternative methods like MINQUE. These methods are now substantially obsolete as far as I know except in cases with appropriate balance where they produce exactly the same answers as ML or REML. If you want estimates of the variance components, then use 'lme': With appropriately balanced data sets, these would be exactly what you would get by writing out the expected mean squares and solving for the variance components. Where the balance is lacking, the other methods will generally produce less efficient estimates of what you really want. If you want sums of squares for testing, do your testing as described in Pinheiro and Bates. Except in the saturated case I mentioned, you should get identical or superior estimates from lme and lmer than from aov. If you want sums of squares for something other than an intermediate step for estimation or testing, please explain. Be careful what you ask for, because you might get it -- and it might not be what you want. how do you specify in the formula that you want a nested approach (I will check the Pinheiro and Bates book) ? For these two reasons, I had the feeling that aov might be the way to go ... I understood you to say that you thought habitat should be nested within lagoon, but you also want to know how much variation is due to lagoons? habitats? lagoons*habitat?. That sounds to me like you want to know how to clean the spark plugs on your bicycle. Have you studied the Oats example in my post? (See also http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63788.html;.) Bon Chance! J’espère que ceci vous aide. Spencer Graves I will try your suggestions concerning the lme4 package. Thanks again ! eric pante Eric Pante Spencer Graves wrote: I have seen no reply to this, so I will offer a couple of comments in spite of the fact that I know very little about aov other than it is old and has largely been superceded by lme in the nlme package. I've replied to many posts on random and mixed effects over the past few years, and I only remember one case where 'aov' returned an answer that could not have been obtained easily from 'lme'. That one application involved estimating a saturated model from perfectly balanced experimental data. 'lme' refused to return an answer, at least as I specified the model, and 'aov' returned numbers from which anyone familiar with the theory for that special case could compute the desired F ratios and p-values. In all other cases that I remember, 'aov' would either return the same answer, or much more commonly, a substantively inferior answer -- if it were feasible to use 'aov' at all. I mention this, because the simplest way I can think of to check your answer is to suggest you try to work it using 'lme'. To learn how to do this, I strongly encourage you to consult Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for many years one of the leading contributors in this area and is the primary author of the 'nlme', 'lme4' and 'Matrix packages to support working with these kinds of models. This book discusses both how to fit these kinds of models as well as how to produce plots of various kinds that can be very helpful for explaining the experimental results to others as well as diagnosing potential problem. Moreover, the R scripts discussed in the book are available in files called ch01.R, ch02.R, ..., ch06.R, ch08.R in a subfolder '~library\nlme\scripts' of the directory in which R is installed on your hard drive. This makes it much easier to work through the examples in the book one line at a time, experimenting at with modifications. In addition, there is one point I know where the R syntax
[R] R usage for log analysis
Hello, Is there any software project that uses R to do log file analisys? thanks gabi __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Calculating survival for set time intervals
Hello friends and fellow R users, I have successfully tabulated and entered my survival data into R and have generated survival curves. But I would like to be able to determine what the survival rates are now at one month, three months, six months and one year. I have a data set, via.wall, which I have entered into R, and which generates the following Surv object: Surv(Days,Status==1) [1] 648+ 3 109 241 997 849+ 1053+ 539+ 121 42 490 21 870+ [16] 175 20 434 289 826+ 831+ 664 698+ 5 24 655+ 187+ 85+ 65+ [31] 547+ 81 55+ 69 499+ 448+ 0 158+ 31 246+ 230+ 19 118+ 54 [46] 48+ 45+ 21+ 670+ 585 558+ 544+ 494 481+ 474+ 472+ 461 447 446+ 443+ [61] 429+ 423+ 401 395+ 390 390+ 389+ 383+ 383+ 373+ 362+ 354 344+ 342 336+ [76] 335+ 326+ 306 300+ 292 284+ 280+ 271 246+ 237+ 234 233+ 233 230+ 230+ [91] 226+ 225+ 218+ 215 211+ 199+ 191+ 191 190+ 184+ 169+ 163+ 161+ 153 150 [106] 129+ 110+ 107+ 100+ 84+ 77+ 69+ 52+ 38+ 11+ names(wall.via) [1] Description Patient Physician MRN Age [6] Status DaysCr INR BR [11] MELDtype I can guess pretty accurately by looking at the graph what the survival rates are at each interval, but I would like to understand how to instruct R to calculate it. Hope I have made this clear. I am just a beginner, so forgive me if this is trivial. It just isn't clear to me. Thanks, Greg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] random generation for a quasi distribution
Any quasi distribution is not well defined, which is why it is called qausi. Users decide that its likelihood looks like some other distribution, e.g. a Poisson, but it is overdispersed. If you want to simulate a quasi binomial, you have to select a plausible overdispersion mechanism. You could for example decide that the Poisson parameter follows a lognormal or a gamma distribution. Then simulate the lognormal or gamma random variables and use those to simulate the poisson. Example: set.seed(5) (Lam - rlnorm(2, sdlog=9)) [1] 5.168803e-04 2.576182e+05 rpois(2, Lam) [1] 0 257885 hope this helps Spencer Graves hugues wrote: Dear R-Help, As with the rpois() function to generate random data for a poisson distribution, I need to generate random data for a quasi distribution with var=mu^2. Does anyone known how to do this? Thanks in advance, Hugues SANTIN-JANIN. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sparse matrix, rnorm, malloc
On Sat, 10 Jun 2006, g l wrote: Hi, I'm Sorry for any cross-posting. I've reviewed the archives and could not find an exact answer to my question below. I'm trying to generate very large sparse matrices ( 1% non-zero entries per row). I have a sparse matrix function below which works well until the row/col count exceeds 10,000. This is being run on a machine with 32G memory: sparse_matrix - function(dims,rnd,p) { ptm - proc.time() x - round(rnorm(dims*dims),rnd) x[((abs(x) - p) 0)] - 0 y - matrix(x,nrow=dims,ncol=dims) proc.time() - ptm } When trying to generate the matrix around 20,000 rows/cols on a machine with 32G of memory, the error message I receive is: R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug Error: cannot allocate vector of size 3125000 Kb Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: on machine w/32G memory, why can't it allocate a vector of size 3125000 Kb? When trying to generate the matrix around 30,000 rows/cols, the error message I receive is: Error in rnorm(dims * dims) : cannot allocate vector of length 9 Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: is this 9 bytes? kilobytes? This error seems to be specific now to rnorm, but it doesn't indicate the length metric (b/Kb/Mb) as it did for 20,000 rows/cols. Even if this Mb, why can't this be allocated on a machine with 32G free memory? This is a length of 9, as it says. Please read ?Memory-limits for the limits in force. (A numeric vector of that length would be over 2^32 bytes and so exceed the address space of a 32-bit executable.) You have not told us your platform or other basic facts: PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html and had you heeded that request we would have had a lot more to go on. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sparse matrix, rnorm, malloc
As an example of how one might do this sort of thing in SparseM ignoring the rounding aspect... require(SparseM) require(msm) #for rtnorm sm - function(dim,rnd,q){ n - rbinom(1, dim * dim, 2 * pnorm(q) - 1) ia - sample(dim,n,replace = TRUE) ja - sample(dim,n,replace = TRUE) ra - rtnorm(n,lower = -q, upper = q) A - new(matrix.coo, ia = as.integer(ia), ja = as.integer (ja), ra = ra, dimension = as.integer(c(dim,dim))) A - as.matrix.csr(A) } For dim = 5000 and q = .03 which exceeds Gavin's suggested 1 percent density, this takes about 30 seconds on my imac and according to Rprof about 95 percent of that (total) time is spent generating the truncated normals. Word of warning: pushing this too much further gets tedious since the number of random numbers grows like dim^2. For example, dim = 20,000 and q = .02 takes 432 seconds with again 93% of the total time spent in rnorm and rtnorm... url:www.econ.uiuc.edu/~rogerRoger Koenker email [EMAIL PROTECTED] Department of Economics vox:217-333-4558University of Illinois fax:217-244-6678Champaign, IL 61820 On Jun 10, 2006, at 12:53 PM, g l wrote: Hi, I'm Sorry for any cross-posting. I've reviewed the archives and could not find an exact answer to my question below. I'm trying to generate very large sparse matrices ( 1% non-zero entries per row). I have a sparse matrix function below which works well until the row/col count exceeds 10,000. This is being run on a machine with 32G memory: sparse_matrix - function(dims,rnd,p) { ptm - proc.time() x - round(rnorm(dims*dims),rnd) x[((abs(x) - p) 0)] - 0 y - matrix(x,nrow=dims,ncol=dims) proc.time() - ptm } When trying to generate the matrix around 20,000 rows/cols on a machine with 32G of memory, the error message I receive is: R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug R(335) malloc: *** vm_allocate(size=324096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug Error: cannot allocate vector of size 3125000 Kb Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: on machine w/32G memory, why can't it allocate a vector of size 3125000 Kb? When trying to generate the matrix around 30,000 rows/cols, the error message I receive is: Error in rnorm(dims * dims) : cannot allocate vector of length 9 Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Last error line is obvious. Question: is this 9 bytes? kilobytes? This error seems to be specific now to rnorm, but it doesn't indicate the length metric (b/Kb/Mb) as it did for 20,000 rows/cols. Even if this Mb, why can't this be allocated on a machine with 32G free memory? When trying to generate the matrix with over 50,000 rows/cols, the error message I receive is: Error in rnorm(n, mean, sd) : invalid arguments In addition: Warning message: NAs introduced by coercion Error in round(rnorm(dims * dims), rnd) : unable to find the argument 'x' in selecting a method for function 'round' * Same. Why would it generate different errors in each case? Code fixes? Any simple ways to generate sparse matrices which would avoid above problems? Thanks in advance, Gavin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] date.mdy in date package
Evidently, your 'trimmed_dates' was NOT a a Julian date value, as returned by 'mdy.date()', number of days since 1/1/1960. My standard references for this kind of thing are the zoo vignette and the R News article, Gabor Grothendieck and Thomas Petzoldt. R help desk: Date and time classes in R. R News, 4(1):29-32, June 2004 (www.r-project.org - Documentation: Newsletter). For more on vignettes and zoo in particular, see http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html;. With my limited knowledge of date-time formates, I might approach your problem as follows: Lines - + 1 1/2/20032 1/3/20033 1/6/20034 1/7/20035 1/8/20036 1/9/20037 1/10/20038 1/13/20039 1/14/2003 (DF - read.table(textConnection(Lines))) V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 1 1 1/2/2003 2 1/3/2003 3 1/6/2003 4 1/7/2003 5 1/8/2003 6 1/9/2003 7 V14 V15 V16 V17 V18 1 1/10/2003 8 1/13/2003 9 1/14/2003 (Dates - sapply(DF[2*(1:9)], as.character)) V2 V4 V6 V8 V10 V12 1/2/2003 1/3/2003 1/6/2003 1/7/2003 1/8/2003 1/9/2003 V14 V16 V18 1/10/2003 1/13/2003 1/14/2003 (Dates. - strptime(Dates, %m/%d/%Y)) [1] 2003-01-02 2003-01-03 2003-01-06 2003-01-07 2003-01-08 [6] 2003-01-09 2003-01-10 2003-01-13 2003-01-14 class(Dates.) [1] POSIXt POSIXlt names(Dates.) [1] sec min hour mday mon year wday yday isdst Thus, the 'POSIXlt' format is similar to the output of 'date.mdy', except that it has other attributes, which you could ignore. Getting month / days with leading zeros is a formatting issue. If you still can't figure that out after studying the Grothendieck and Petzoldt article and the zoo vignette, please submit another post. hope this helps. Spencer Graves Brian Scholl wrote: I'm having a problem with output from date.mdy in the date package. Goal: to take a long vector of dates of the form 01/22/99 and extract values month=01, day=22, year=1999. I am providing the vector of class dates in the attached file to date.mdy: mdy_dates-date.mdy(trimmed_dates) The first few obs of the attached vector of dates are: 1 1/2/20032 1/3/20033 1/6/20034 1/7/20035 1/8/20036 1/9/20037 1/10/20038 1/13/20039 1/14/2003 I am expecting to get a list with vectors for the day, month, and year. The function appears to spit out the format I expect (though I would like months/days with leading zeros) but incorrect values, But what I get in the first few lines is something like: mdy_dates $month [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 [26] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 [51] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 $day [1] 1 2 5 6 7 8 9 12 13 14 15 16 19 20 21 22 23 26 27 28 29 30 2 3 4 [26] 5 6 9 10 11 12 13 16 17 18 19 20 23 24 25 26 27 2 3 4 5 6 9 10 11 [51] 12 13 16 17 18 19 20 23 24 25 26 27 30 31 1 2 3 6 7 8 9 10 13 14 15 $year [1] 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 [16] 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 [31] 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 1993 So: The first day of my list of dates is Jan 2, 2003, but in the list output the first date is Jan 1, 1993. Have I incorrectly formatted my input, or is there some other problem? Thanks in advance. __ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] question about using ancova in R to do variable selection
Have you tried Bioconductor (www.bioconductor.org)? I haven't seen any replies to your question, and I know next to nothing about microarray data. I understand that Bioconductor specializes in microarray and related data, has a listserve, etc. hope this helps, Spencer Graves James Anderson wrote: Hi, I have a microarray data which has about 1 genes and 15 measurements. The 15 measurements contain 6 control, 3 low dose, 3 medium dose, 3 high dose. If I treat the dose level as a continuous variable, how can I use ANCOVA to do gene selection? (I mean to select those genes that response to different level of doses differently?). Thank you very much! James __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] using several columns from a Table in a procedure
Dear R-friends I have a table with more than 50 columns (variables). Many of them are numeric and others are of type char. I would like repeat a group of command using only a set of the numeric variables, excluding others (for example V8, V12 etc) and not using the char ones. As a sample I want: x11() plot (X, v2) model_v2-glm (v2~X) lines (X, predict(model_v2)) x11() plot (X, v3) model_v3-glm (v3~X) lines (X, predict(model_v3)) x11() plot (X, v4) model_v4-glm (v4~X) lines (X, predict(model_v4)) - Look that I even use de X variable agains the other ones. How can I do that? Another question is how can I change the name of a column in a table? Thanks A Lot Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] assign sequence numbers by group of value
Dear R users: I would like to assign sequence numbers based on group of value but cannot find an easy way. Here is a simple instance: id = c('a','a','a','b','c','c') id [1] a a a b c c I hope to create a corresponding vector as -- [1] 1 2 3 1 1 2 That is, in group a the number begins with 1 and then continues to plus 1 until it happens to the next group. For b It goes back to 1. Because there is only one b, it begins to be 1 again for the first c following, etc. Could someone advise me how to do this in programming rather than manually? Thanks a lot. Tony C. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] assign sequence numbers by group of value
id [1] a a a b c c x - split(id, id) # separate by unique values x - lapply(x, seq) # generate the sequence numbers x $a [1] 1 2 3 $b [1] 1 $c [1] 1 2 x - unsplit(x, id) # make back into a vector x [1] 1 2 3 1 1 2 On 6/10/06, Tony Chu [EMAIL PROTECTED] wrote: Dear R users: I would like to assign sequence numbers based on group of value but cannot find an easy way. Here is a simple instance: id = c('a','a','a','b','c','c') id [1] a a a b c c I hope to create a corresponding vector as -- [1] 1 2 3 1 1 2 That is, in group a the number begins with 1 and then continues to plus 1 until it happens to the next group. For b It goes back to 1. Because there is only one b, it begins to be 1 again for the first c following, etc. Could someone advise me how to do this in programming rather than manually? Thanks a lot. Tony C. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jim Holtman Cincinnati, OH +1 513 646 9390 (Cell) +1 513 247 0281 (Home) What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R usage for log analysis
I heared some project use Perl to analysis log file. And I don't think it's suit to analysis log file for R. On 6/11/06, Gabriel Diaz [EMAIL PROTECTED] wrote: Hello, Is there any software project that uses R to do log file analisys? thanks gabi __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- 此致 敬礼! 华孝挺 Kenneth Hua 浙江大学核农所 杭州,中国 310029 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html