Re: [R] Neural Nets (nnet) - evaluating success rate of predictions
On 5/7/07, Bert Gunter [EMAIL PROTECTED] wrote: Folks: If I understand correctly, the following may be pertinent. Note that the procedure: min.nnet = nnet[k] such that error rate of nnet[k] = min[i] {error rate(nnet(training data) from ith random start) } does not guarantee a classifier with a lower error rate on **new** data than any single one of the random starts. That is because you are using the same training set to choose the model (= nnet parameters) as you are using to determine the error rate. I know it's tempting to think that choosing the best among many random starts always gets you a better classifier, but it need not. The error rate on the training set for any classifier -- be it a single one or one derived in some way from many -- is a biased estimate of the true error rate, so that choosing a classifer on this basis does not assure better performance for future data. In particular, I would guess that choosing the best among many (hundreds/thousands) random starts is probably almost guaranteed to produce a poor predictor (ergo the importance of parsimony/penalization). I would appreciate comments from anyone, pro or con, with knowledge and experience of these things, however, as I'm rather limited on both. I agree - it's never a good idea to use the same data for creating your classifier and determining it's effectiveness (I meant to say pick the one with the lowest error rate on your TEST data). The reason to choose from many random starts is that fitting a given neural network _model_ (ie. input x, n nodes, ...) is very hard due to the large overparameterisation of the problem space. For example, the parameters for one node in a given layer can be exchanged with the parameters of another node (as well as the parameters that use those nodes in next layer), without changing the overall model. This makes it very hard to optimise, and nnet in R often gets stuck in local minima. Looking at what individual nodes are doing, you often see examples where some nodes contribute nothing to the overall classification. The random starts aren't to find different models but to find the parameters for the given model that fits best. And following this line of argument, you would probably want to use the internal criterion value, rather than some external measure of accuracy. The simple answer to the question of obtaining the error rate using validation data is: Do whatever you like to choose/fit a classifier on the training set. **Once you are done,** the estimate of your error rate is the error rate you get on applying that classifier to the validation set. But you can do this only once! If you don't like that error rate and go back to finding a a better predictor in some way, then your validation data have now been used to derive the classifier and thus has become part of the training data, so any further assessment of the error rate of a new classifier on it is now also a biased estimate. You need yet new validation data for that. Understanding that that estimate is biased is important, but in practice, do people really care that much? If you have looked at a single plot of your data and used that to inform your choice of the classifier your estimates will already be biased (but if you have used other knowledge of the data or subject area, you might expect them to be biased in a positive direction). Are the estimates of model really the most important thing? Surely an understanding of the problem/data is what you are really trying to gain. Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a new column
Hallo, I just now a solution for da data frame. I'm not sure if this is what you want. Just try if it helps. Here an example of my code where I added a column: df - rbind( c(fred,mary,4),c(fred,mary,7), c(fred,mary,9),c(barney,liz,3), c(barney,liz,5)) df - data.frame(df) colnames(df) - c(father,mother,child.age) # adding column df - data.frame(df,weddingdate=c(Dec 12th, 1980,Dec 12th, 1980, Dec 12th, 1980,Apr 9th, 2003, Apr 9th, 2003)) df The R-Gui Result: father mother child.ageweddingdate 1 fred mary 4 Dec 12th, 1980 2 fred mary 7 Dec 12th, 1980 3 fred mary 9 Dec 12th, 1980 4 barneyliz 3 Apr 9th, 2003 5 barneyliz 5 Apr 9th, 2003 Caution: the number of entries in adding column must correspond to the number of rows in your existing data frame df (here 5) Try this soultion, Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von raymond chiruka Gesendet: Montag, 7. Mai 2007 16:28 An: r Betreff: [R] creating a new column hie l would like to create a 6th column actual surv time from the following data the condition being if censoringTimesurvivaltime then actual survtime =survival time else actual survtime =censoring time the code l used to create the data is s=2 while(s!=0){ n=20 m-matrix(nrow=n,ncol=4) colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime) for(i in 1:20) m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002)) m-cbind(m,0) m[m[,3]m[,4],5]-1 colnames(m)[5]-censoring print(m) s=s-1 treatmentgrp strata censoringTime survivalTime censoring [1,] 1 1 1.0121591137.80922 0 [2,] 2 2 32.971439 247.21786 0 [3,] 2 1 85.758253 797.04949 0 [4,] 1 1 16.999171 78.92309 0 [5,] 2 1 272.909896 298.21483 0 [6,] 1 2 138.230629 935.96765 0 [7,] 2 2 91.529859 141.08405 0 l keep getting an error message when i try to create the 6th column - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bad optimization solution
It seems that there is here a problem of reliability, as one never knows whether the solution provided by R is correct or not. In the case that I reported, it is fairly simple to see that the solution provided by R (without any warning!) is incorrect, but, in general, that is not so simple and one may take a wrong solution as a correct one. Paul On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote: Your function, (x1-x2)^2, has zero gradient at all the starting values such that x1 = x2, which means that the gradient-based search methods will terminate there because they have found a critical point, i.e. a point at which the gradient is zero (which can be a maximum or a minimum or a saddle point). However, I do not why optim converges to the boundary maximum, when analytic gradient is supplied (as shown by Sundar). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Monday, May 07, 2007 6:26 PM To: R-help Subject: Re: [R] Bad optimization solution On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote: I think the problem is the starting point. I do not remember the details of the BFGS method, but I am almost sure the (.5, .5) starting point is suspect, since the abs function is not differentiable at 0. If you perturb the starting point even slightly you will have no problem. Paul Smith [EMAIL PROTECTED] To Sent by: R-help r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] Bad optimization solution 05/07/2007 04:30 PM Dear All I am trying to perform the below optimization problem, but getting (0.5,0.5) as optimal solution, which is wrong; the correct solution should be (1,0) or (0,1). Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux). Thanks in advance, Paul -- myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Yes, with (0.2,0.9), a correct solution comes out. However, how can one be sure in general that the solution obtained by optim is correct? In ?optim says: Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows _box constraints_, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning. which only demands that the initial value must satisfy the constraints. Furthermore, X^2 is everywhere differentiable and notwithstanding the reported problem occurs with myfunc - function(x) { x1 - x[1] x2 - x[2] (x1-x2)^2 } optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Paul __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] package ltm -- version 0.8-0
Dear R-users, I'd like to announce the release of the new version of package `ltm' (i.e., ltm_0.8-0 soon available from CRAN) for Item Response Theory analyses. This package provides a flexible framework for analyzing dichotomous and polytomous data under IRT, including the Rasch model, the Two-Parameter Logistic model, Birnbaum's Three-Parameter model, the Latent Trait model with up to two latent variables (allowing also for nonlinear terms), and Samejima's Graded Response model. Furthermore, supporting functions for descriptive statistics, goodness-of-fit, ability estimation and plotting are available. New features include: * The new functions person.fit() and item.fit() compute p-values for person- and item-fit statistics for IRT models for dichotomous data. The `simulate.p.value' argument enables the computation of p-values based on a Monte Carlo procedure. * The new function unidimTest() checks the unidimensionality assumption for dichotomous data IRT models, using a Modified Parallel Analysis. * The new function testEquatingData() prepares data-sets for test equating by common items. In particular, two types of common item equating are included: alternate form equating (where common and unique items are analyzed simultaneously) and across sample equating (where different sets of unique items are analyzed separately based on previously calibrated anchor items). * grm() now works with the available cases when incomplete data (i.e., in the presence of NAs) are analyzed. * better algorithms, for Missing At Random missing data mechanisms, have been written for grm(), ltm(), rasch() and tpm(). * a residuals() method has been added for `grm', `ltm', `rasch', and `tpm' objects that computes Pearson-type residuals. * factor.scores() and fitted() methods for classes `grm', `ltm', `rasch', and `tpm' allow now for NAs in the `resp.patterns' argument, enabling thus the computation of ability estimates and fitted values for incomplete response patterns. * the fitted() method now allows also for the computation of marginal and conditional (on the latent variable(s)) probabilities; this feature is controlled by the new `type' argument. * for more details and other news, check the CHANGES file that ships with the package. More information as well as .R files illustrating the capabilities of the package can be found in the Rwiki page of `ltm' available at: http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:ltm. Future plans include the development of functions for fitting Bock's Nominal Response model and the option for Differential Item Functioning. I'd like also to thank all users of `ltm' for providing valuable feedback, and welcome any additional feedback (questions, suggestions, bug-reports, etc.). Best, Dimitris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Weighted least squares
Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) would be equivalent, but they are not. I suspect the difference is how the degrees of freedom is calculated - I had expected it to be sum(weights), but seems to be sum(weights 0). This seems unintuitive to me: summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50))) What am I missing? And what is the usual way to do a linear regression when you have aggregated data? Thanks, Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Representing a statistic as a colour on a 2d plot
mister_bluesman wrote: Hello. I have a 2d plot which looks like this: http://www.nabble.com/file/8242/1.JPG This plot is derived from a file that holds statistics about each point on the plot and looks like this: abc d e a00.4980.4730.524 0.528 b 0.498 0 0 0 0 c 0.473 0 0 0 0 d 0.524 0 0 0 0 e 0.528 0 0 0 0 However, I have another file called 2.txt, with the following contents: a b c d e 0.5 0.7 0.32 0.34 0.01 What I would like to know is how do I convert these values in 2.txt to colours or colour intensities so that the x's in the diagram above can be colour coded as such. Yo bluesman, check color.scale in the plotrix package, cat it'll color your points to the values they're at Jim __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How do i calculate emp Returns
Hello, I want to build empirical Returns of stockfonds, but I cant find a function in R, which calculate this. Alin [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a new column
Hallo, I just tried, if my solution might be possible for you. Here is the result: s=2 while(s!=0){ n=20 + m-matrix(nrow=n,ncol=4) + colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime) + for(i in 1:20) m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002)) + m-cbind(m,0) + m[m[,3]m[,4],5]-1 + colnames(m)[5]-censoring + print(m) +s=s-1 + } #bilding a data frame from m x=data.frame(m) # adding a column, where nrow(x) = number of row in x and in the # new coulmn should stand the entries 1...20. x=data.frame(x,actual surv time=c(1:nrow(x))) x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 02 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141306.14860 0 20 See it works! For more information of data frames, see ?data.frame Here are some additional examples. x[2,6] # Extraction of special entries: x[2,6]= 2 - row 2, column 6 [1] 2 x[2,6]=100 # Changing entires: x[2,6]=100 x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 0 100 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141306.14860 0 20 t=x row2=t[-2,] # deleting a row row2 treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532
[R] trouble with help
Hallo, I just updated to the new version of R by installing everything new. Now I have a problem with the help command: help.start() updating HTML package listing updating HTML search index If nothing happens, you should open 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself The browser was started but nothing was displayes. Can anyone help me, Corinna __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Piecewise cubic Hermite interpolation
Which function implements the piecewise cubic Hermite interpolation? I am looking for equivalent of matlab's interp1 with the method = 'pchip' Here is the reference http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/interp1.html; -- View this message in context: http://www.nabble.com/Piecewise-cubic-Hermite-interpolation-tf3708765.html#a10373214 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sample() and memory usage
As a part of a simulation, I need to sample from a large vector repeatedly. For some reason sample() builds up the memory usage ( 500 MB for this example) when used inside a for loop as illustrated here: X - 1:10 P - runif(10) for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P) Even worse, I am not able to free up memory without quitting R. I quickly run out of memory when trying to perform the simulation. Is there any way to avoid this to happen? The problem seem to appear only when specifying replace=TRUE and probability weights for the vector being sampled, and this happens both on Windows XP and Linux (Ubuntu). Victor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a new column
Hallo, if you work with matrix you can use cbind to add columns. Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von raymond chiruka Gesendet: Montag, 7. Mai 2007 16:28 An: r Betreff: [R] creating a new column hie l would like to create a 6th column actual surv time from the following data the condition being if censoringTimesurvivaltime then actual survtime =survival time else actual survtime =censoring time the code l used to create the data is s=2 while(s!=0){ n=20 m-matrix(nrow=n,ncol=4) colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime) for(i in 1:20) m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002)) m-cbind(m,0) m[m[,3]m[,4],5]-1 colnames(m)[5]-censoring print(m) s=s-1 treatmentgrp strata censoringTime survivalTime censoring [1,] 1 1 1.0121591137.80922 0 [2,] 2 2 32.971439 247.21786 0 [3,] 2 1 85.758253 797.04949 0 [4,] 1 1 16.999171 78.92309 0 [5,] 2 1 272.909896 298.21483 0 [6,] 1 2 138.230629 935.96765 0 [7,] 2 2 91.529859 141.08405 0 l keep getting an error message when i try to create the 6th column - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with help
Did you try manually opening the file mentioned in the browser (Z:\Software\R\R-2.5.0\doc\html\index.html) ? Does this file exist? Was the html help installed? Schmitt, Corinna wrote: Hallo, I just updated to the new version of R by installing everything new. Now I have a problem with the help command: help.start() updating HTML package listing updating HTML search index If nothing happens, you should open 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself The browser was started but nothing was displayes. Can anyone help me, Corinna -- View this message in context: http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bad optimization solution
The issue is that you are using a derivative based optimizer for a problem for which it is well known that such optimizers will not perform well. You should consider using a global optimizer. For example, rgenoud combines a genetic search algorithm with a BFGS optimizer and it works well for your problem: library(rgenoud) myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1)) genoud(myfunc, nvars=2, Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2) myfunc - function(x) { x1 - x[1] x2 - x[2] (x1-x2)^2 } optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1)) genoud(myfunc, nvars=2, Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2) Cheers, Jas. === Jasjeet S. Sekhon Associate Professor Travers Department of Political Science Survey Research Center UC Berkeley http://sekhon.berkeley.edu/ V: 510-642-9974 F: 617-507-5524 === Paul Smith writes: It seems that there is here a problem of reliability, as one never knows whether the solution provided by R is correct or not. In the case that I reported, it is fairly simple to see that the solution provided by R (without any warning!) is incorrect, but, in general, that is not so simple and one may take a wrong solution as a correct one. Paul On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote: Your function, (x1-x2)^2, has zero gradient at all the starting values such that x1 = x2, which means that the gradient-based search methods will terminate there because they have found a critical point, i.e. a point at which the gradient is zero (which can be a maximum or a minimum or a saddle point). However, I do not why optim converges to the boundary maximum, when analytic gradient is supplied (as shown by Sundar). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Monday, May 07, 2007 6:26 PM To: R-help Subject: Re: [R] Bad optimization solution On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote: I think the problem is the starting point. I do not remember the details of the BFGS method, but I am almost sure the (.5, .5) starting point is suspect, since the abs function is not differentiable at 0. If you perturb the starting point even slightly you will have no problem. Paul Smith [EMAIL PROTECTED] To Sent by: R-help r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] Bad optimization solution 05/07/2007 04:30 PM Dear All I am trying to perform the below optimization problem, but getting (0.5,0.5) as optimal solution, which is wrong; the correct solution should be (1,0) or (0,1). Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux). Thanks in advance, Paul -- myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Yes, with (0.2,0.9), a correct solution comes out. However, how can one be sure in general that the solution obtained by optim is correct? In ?optim says: Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows _box constraints_, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning. which only demands that the initial value must satisfy the constraints. Furthermore, X^2 is everywhere differentiable and notwithstanding the reported problem occurs with
[R] sample() and memory usage
As a part of a simulation, I need to sample from a large vector repeatedly. For some reason sample() builds up the memory usage ( 500 MB for this example) when used inside a for loop as illustrated here: X - 1:10 P - runif(10) for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P) Even worse, I am not able to free up memory without quitting R. I quickly run out of memory when trying to perform the simulation. Is there any way to avoid this to happen? The problem seem to appear only when specifying replace=TRUE and probability weights for the vector being sampled, and this happens both on Windows XP and Linux (Ubuntu). Victor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with help
How can I install the html help? Corinna .de http://www.igb.fraunhofer.de -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev Gesendet: Dienstag, 8. Mai 2007 12:22 An: r-help@stat.math.ethz.ch Betreff: Re: [R] trouble with help Did you try manually opening the file mentioned in the browser (Z:\Software\R\R-2.5.0\doc\html\index.html) ? Does this file exist? Was the html help installed? Schmitt, Corinna wrote: Hallo, I just updated to the new version of R by installing everything new. Now I have a problem with the help command: help.start() updating HTML package listing updating HTML search index If nothing happens, you should open 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself The browser was started but nothing was displayes. Can anyone help me, Corinna -- View this message in context: http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with help
You should select it in the list of components to install, when the installer asks. It happens after you 3rd time press Next in the installation wizard window. Choose Custom installation and select all you need. Schmitt, Corinna wrote: How can I install the html help? Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev Gesendet: Dienstag, 8. Mai 2007 12:22 An: r-help@stat.math.ethz.ch Betreff: Re: [R] trouble with help Did you try manually opening the file mentioned in the browser (Z:\Software\R\R-2.5.0\doc\html\index.html) ? Does this file exist? Was the html help installed? -- View this message in context: http://www.nabble.com/trouble-with-help-tf3708685.html#a10374007 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] trouble with help
Hallo, I found the solution. I installed everything new and in the setup I choose Changing Startoptions and there I could mark html help. It now works, Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Schmitt, Corinna Gesendet: Dienstag, 8. Mai 2007 13:03 An: Vladimir Eremeev; r Betreff: Re: [R] trouble with help How can I install the html help? Corinna .de http://www.igb.fraunhofer.de -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev Gesendet: Dienstag, 8. Mai 2007 12:22 An: r-help@stat.math.ethz.ch Betreff: Re: [R] trouble with help Did you try manually opening the file mentioned in the browser (Z:\Software\R\R-2.5.0\doc\html\index.html) ? Does this file exist? Was the html help installed? Schmitt, Corinna wrote: Hallo, I just updated to the new version of R by installing everything new. Now I have a problem with the help command: help.start() updating HTML package listing updating HTML search index If nothing happens, you should open 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself The browser was started but nothing was displayes. Can anyone help me, Corinna -- View this message in context: http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Neural Nets (nnet) - evaluating success rate of predictions
Nathaniel, On Mon, 7 May 2007, [EMAIL PROTECTED] wrote: Date: Sun, 6 May 2007 12:02:31 + (GMT) From: nathaniel Grey [EMAIL PROTECTED] However what I really want to know is how well my nueral net is doing at classifying my binary output variable. I am new to R and I can't figure out how you can assess the success rates of predictions. I've been recently tacking this myself, though with respect to polytomous (2) outcomes. The following approaches are based on Menard (1995), Cohen et al. (2002) and Manning Schütze (1999). First you have to decide what is the critical probability that you use to classify the cases into class A (and consequently not(class[A])). The simplest level is 0.5, but other levels might also be motivated, see e.g. Cohen et al. (2002: 516-519). You can then treat the classification task as two distinct types, namely classification and prediction models, which have an effect on how the efficiency and accuracy of prediction is exactly measured (Menard 1995: 24-26). In a pure prediction model, we set no a priori expectation or constraint on the overall frequencies of the predicted classes. To the contrary, in a classification model our expectation is that the predicted outcome classes on the long run will end up having the same proportions as are evident in the training data. As the starting point for evaluating prediction efficiency is to compile a 2x2 prediction/classification table. Frequency counts on the (decending) diagonal in the table indicate correctly predicted and classified cases, whereas all counts off the diagonal are incorrect. For the two alternatives overerall, we can divide the predicted classifications into the four types presented below, on which the basic measures of prediction efficiency are based. (Manning and Schütze 1999: 267-271) Original/Predicted Class not(Class)(=Other) Class TP ~ True Positive) FN ~ False Negative not(Class) (=Other) FP ~ False Positive TN ~ True Negative You can then go on to calculate recall and precision, or spesificity or sensitivity. Recall is the proportion of original occurrences of some particular class for which the prediction is correct (formula 1 below, see Manning and Schütze 1999: 269, formula 8.4), whereas precision is the proportion of the all the predictions of some particular class, which turn out to be correct (formula 2 below, see Manning and Schütze 1999: 268, formula 8.3). Sensitivity is in fact exactly equal to recall, whereas specificity is understood as the proportion of non-cases correctly predicted or classified as non-cases, i.e. rejected (formula 3 below) Furthermore, there is a third pair of evaluation measures that one could also calculate, namely accuracy and error (formula 4 below) (Manning and Schütze 1999: 268-270). (1) Recall = TP / (TP + FN) (=Sensitivity) (2) Precision = TP / (TP + FP) (3) Specificity = TN / (TN + FN) (4) Accuracy = (TP + TN) / N = diag(n[k,k]) However, as has been noted in some earlier responses these aforementioned general measures do not in any way take into consideration whether prediction and classification according to a model, with the help of explanatory variables, performs any better than knowing the overall proportions of the outcome classes. For this purpose, the asymmetric summary measures of association based on Proportionate Reduction of Error (PRE) are good candidates for evaluating prediction accuracy, where we expect that the prediction or classification process on the basis of the models should exceed some baselines or thresholds. However, one cannot use the Goodman-Kruskal lambda and tau as such, but make some adjustments to account for the possibility of incorrect prediction. With this approach one compares prediction/classification errors with the model, error(model), to the baseline level of prediction/classification errors without the error(model, baseline), according to formula 10 below. (Menard 1995: 28-30). The formula for the error with the model remains the same, irrespective of whether we are evaluating prediction or classification accuracy, presented in (5), but the errors without the model vary according to the intended objective, presented in (6) and (7). Subsequently, the measure for the proportionate reduction of prediction error is presented in (9) below, and being analogous to the Goodman-Kruskal lambda it is designated as lambda(prediction). Similarly, the measure for proportionate reduction of classification error is presented in (10), and being analogous with the Goodman-Kruskal tau it is likewise designated as tau(classification). For both measures, positive values indicate better than baseline classification, while negative values worse performance. (5) error(model) = N - SUM{k=1...K}n[k,k] = N - SUM{diag(n)], where n is the 2x2 prediction/classification matrix (6) error(baseline, prediction) = N - max(R[k]), with R[k] = marginal row sums for each row k of altogether K
[R] minimum from matrix
I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
apply(a, 2, function(x) min(x[x!=0]) ) should do it. Might need some improvement if all numbers in a column can be zero, try it. Gabor On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote: I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] irtoys
I have just submitted irtoys_0.1.0, a package potentially useful for those working with IRT models. It can fit the 1PL, 2PL, and 3PL models through a simple and unified syntax, using either the R package ltm, Brad Hanson's ICL program, or the commercially available BILOG-MG. The purpose is basically to facilitate teaching, and especially comparisons across models and/or programs. Various graphs, item fit statistics, scaling methods, ability estimation, simulation facilities etc. are also included. Please notice that most options in estimating an IRT model are kept to the typical default values, and only a small, common subset of the ltm, ICL, and BILOG syntax is supported. Comments are very welcome. Ivailo Partchev Institute of Psychology University of Jena Germany ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
try this: apply(a, 2, function(x) min(x[x 0])) I hope it helps. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Tuesday, May 08, 2007 10:50 AM Subject: [R] minimum from matrix I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
See below. hadley wickham wrote: Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) You assign weights to different points according to some external quality or reliability measure not number of times the data point was measured. Look at the estimates and standard error of the two models below: coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) ) Estimate Std. Error t value Pr(|t|) (Intercept) 1.940765 3.30348066 0.587491 5.582252e-01 x 0.982610 0.05893262 16.673448 2.264258e-30 coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) ) Estimate Std. Errort value Pr(|t|) (Intercept) 1.940765 2.32408609 0.8350659 4.046871e-01 x 0.982610 0.04146066 23.6998165 1.012067e-59 You can see that they have same coefficient estimates but the second one has smaller variances. The repeated values artificially deflates the variance and thus inflates the precision. This is why you cannot treat replicate data as independent observations. would be equivalent, but they are not. I suspect the difference is how the degrees of freedom is calculated - I had expected it to be sum(weights), but seems to be sum(weights 0). This seems unintuitive to me: summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50))) What am I missing? And what is the usual way to do a linear regression when you have aggregated data? I would be best to use the individual data points instead of aggregated data as it allows you to estimate the within-group variations as well. If you had individual data points, you could try something as follows. Please check the codes as I am no expert in the area of repeated measures. x - runif(100, 0, 100) y1 - x + rnorm(100, mean=1, sd=15) y2 - y1 + rnorm(100, sd=5) df - data.frame( y=c(y1, y2), x=c(x,x), subject=factor(rep( paste(p, 1:100, sep=), 2 ) )) library(nlme) summary( lme( y ~ x, random = ~ 1 | subject, data=df ) ) Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related material for more information. Hope this helps. Thanks, Hadley Regards, Adai __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sample function and memory usage
As a part of a simulation, I need to sample from a large vector repeatedly. For some reason sample() builds up the memory usage ( 500 MB for this example) when used inside a for loop as illustrated here: X - 1:10 P - runif(10) for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P) Even worse, I am not able to free up memory without quitting R. I quickly run out of memory when trying to perform the simulation. Is there any way to avoid this to happen? The problem seem to appear only when specifying both replace=TRUE and probability weights for the vector being sampled, and this happens both on Windows XP and Linux (Ubuntu). Victor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
On 5/8/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote: See below. hadley wickham wrote: Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) You assign weights to different points according to some external quality or reliability measure not number of times the data point was measured. That is one type of weighting - but what if I have already aggregated data? That is a perfectly valid type of weighting too. Look at the estimates and standard error of the two models below: coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) ) Estimate Std. Error t value Pr(|t|) (Intercept) 1.940765 3.30348066 0.587491 5.582252e-01 x 0.982610 0.05893262 16.673448 2.264258e-30 coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) ) Estimate Std. Errort value Pr(|t|) (Intercept) 1.940765 2.32408609 0.8350659 4.046871e-01 x 0.982610 0.04146066 23.6998165 1.012067e-59 You can see that they have same coefficient estimates but the second one has smaller variances. The repeated values artificially deflates the variance and thus inflates the precision. This is why you cannot treat replicate data as independent observations. Hardly artificially - I have repeated observations. would be equivalent, but they are not. I suspect the difference is how the degrees of freedom is calculated - I had expected it to be sum(weights), but seems to be sum(weights 0). This seems unintuitive to me: summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50))) What am I missing? And what is the usual way to do a linear regression when you have aggregated data? I would be best to use the individual data points instead of aggregated data as it allows you to estimate the within-group variations as well. There is no within group variation - these are observations that occur with same values many times in the dataset, so have been aggregated into the a contingency table-like format. If you had individual data points, you could try something as follows. Please check the codes as I am no expert in the area of repeated measures. x - runif(100, 0, 100) y1 - x + rnorm(100, mean=1, sd=15) y2 - y1 + rnorm(100, sd=5) df - data.frame( y=c(y1, y2), x=c(x,x), subject=factor(rep( paste(p, 1:100, sep=), 2 ) )) library(nlme) summary( lme( y ~ x, random = ~ 1 | subject, data=df ) ) Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related material for more information. Hope this helps. I'm not interested in a mixed model, and I don't have individual data points. Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
Hallo, I added one row: a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 And how looks like the command for the minimum of the rows? The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Gabor Csardi Gesendet: Dienstag, 8. Mai 2007 14:03 An: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Betreff: Re: [R] minimum from matrix apply(a, 2, function(x) min(x[x!=0]) ) should do it. Might need some improvement if all numbers in a column can be zero, try it. Gabor On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote: I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
Corinna, what is going on here? I've answered Oarabile's question, and then you reply to me with this. I'm completely lost. 1. What is your question? Minimum of every row? This was written in the original mail along with Oarabile's question. Ok it wasn't rows but columns. For rows it is apply(a, 1, min) 2. Why did you reply to my message? 3. And if you did so, why didn't you read the original message you're replying to? 4. Why did you add one row to the original matrix? Why is the new matrix significantly different form the original one? Gabor On Tue, May 08, 2007 at 02:49:50PM +0200, Schmitt, Corinna wrote: Hallo, I added one row: a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 And how looks like the command for the minimum of the rows? The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Gabor Csardi Gesendet: Dienstag, 8. Mai 2007 14:03 An: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Betreff: Re: [R] minimum from matrix apply(a, 2, function(x) min(x[x!=0]) ) should do it. Might need some improvement if all numbers in a column can be zero, try it. Gabor On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote: I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A function for raising a matrix to a power?
Ravi Varadhan wrote: Paul, Your solution based on SVD does not work Ooops. I really am getting rusty. The idea is based on eigenvalues which, of course, are not always the same as singular values. Paul even for the matrix in your example (the reason it worked for e=3, was that it is an odd power and since P is a permutation matrix. It will also work for all odd powers, but not for even powers). However, a spectral decomposition will work for all symmetric matrices (SVD based exponentiation doesn't work for any matrix). Here is the function to do exponentiation based on spectral decomposition: expM.sd - function(X,e){Xsd - eigen(X); Xsd$vec %*% diag(Xsd$val^e) %*% t(Xsd$vec)} P - matrix(c(.3,.7, .7, .3), ncol=2) P [,1] [,2] [1,] 0.3 0.7 [2,] 0.7 0.3 P %*% P [,1] [,2] [1,] 0.58 0.42 [2,] 0.42 0.58 expM(P,2) [,1] [,2] [1,] 0.42 0.58 [2,] 0.58 0.42 expM.sd(P,2) [,1] [,2] [1,] 0.58 0.42 [2,] 0.42 0.58 Exponentiation based on spectral decomposition should be generally more accurate (not sure about the speed). A caveat is that it will only work for real, symmetric or complex, hermitian matrices. It will not work for asymmetric matrices. It would work for asymmetric matrix if the eigenvectors are made to be orthonormal. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert Sent: Monday, May 07, 2007 5:16 PM To: Atte Tenkanen Cc: r-help@stat.math.ethz.ch Subject: Re: [R] A function for raising a matrix to a power? You might try this, from 9/22/2006 with subject line Exponentiate a matrix: I am getting a bit rusty on some of these things, but I seem to recall that there is a numerical advantage (speed and/or accuracy?) to diagonalizing: expM - function(X,e) { v - La.svd(X); v$u %*% diag(v$d^e) %*% v$vt } P - matrix(c(.3,.7, .7, .3), ncol=2) P %*% P %*% P [,1] [,2] [1,] 0.468 0.532 [2,] 0.532 0.468 expM(P,3) [,1] [,2] [1,] 0.468 0.532 [2,] 0.532 0.468 I think this also works for non-integer, negative, large, and complex exponents: expM(P, 1.5) [,1] [,2] [1,] 0.3735089 0.6264911 [2,] 0.6264911 0.3735089 expM(P, 1000) [,1] [,2] [1,] 0.5 0.5 [2,] 0.5 0.5 expM(P, -3) [,1][,2] [1,] -7.3125 8.3125 [2,] 8.3125 -7.3125 expM(P, 3+.5i) [,1] [,2] [1,] 0.4713+0.0141531i 0.5287-0.0141531i [2,] 0.5287-0.0141531i 0.4713+0.0141531i Paul Gilbert Doran, Harold wrote: Suppose I have a square matrix P P - matrix(c(.3,.7, .7, .3), ncol=2) I know that P * P Returns the element by element product, whereas P%*%P Returns the matrix product. Now, P2 also returns the element by element product. But, is there a slick way to write P %*% P %*% P Obviously, P3 does not return the result I expect. Thanks, Harold Atte Tenkanen wrote: Hi, Is there a function for raising a matrix to a power? For example if you like to compute A%*%A%*%A, is there an abbreviation similar to A^3? Atte Tenkanen A=rbind(c(1,1),c(-1,-2)) A [,1] [,2] [1,]11 [2,] -1 -2 A^3 [,1] [,2] [1,]11 [2,] -1 -8 But: A%*%A%*%A [,1] [,2] [1,]12 [2,] -2 -5 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. La version française suit le texte anglais. This email may contain privileged and/or confidential inform...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. La version française suit le texte anglais. This email may contain
[R] minimum of each row in a matrix
Hallo, I just followed the discussion of the title: minimum from matrix. I have a similar problem. a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4) a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) minOfColumns [1] 1 1 2 4 maxOfColumns=apply(a, 2, function(x) max(x) ) maxOfColumns [1] 2 3 5 4 How looks like the command for the minimum of the rows? I tried several possibilities with apply but did not get the wanted result. The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum of each row in a matrix
Check out the help for apply, particularly the MARGIN argument: minOfRows=apply(a, 1, function(x) min(x[x!=0]) ) maxOfRows=apply(a, 1, function(x) max(x) ) Sarah On 5/8/07, Schmitt, Corinna [EMAIL PROTECTED] wrote: Hallo, I just followed the discussion of the title: minimum from matrix. I have a similar problem. a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4) a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) minOfColumns [1] 1 1 2 4 maxOfColumns=apply(a, 2, function(x) max(x) ) maxOfColumns [1] 2 3 5 4 How looks like the command for the minimum of the rows? I tried several possibilities with apply but did not get the wanted result. The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna -- Sarah Goslee http://www.functionaldiversity.org __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum of each row in a matrix
Looks like you are reading manuals and these mailings insufficiently carefully. ?apply says that if its second argument is 1, it gives you what you want. Gabor Csardi has also written you this. If you have several vectors, not a single matrix, you can use pmin: pmin(a[1,],a[2,],a[3,],a[4,],a[5,]) However, this variant is inefficient in this particular case (lots of typing, and hardcoded). Schmitt, Corinna wrote: Hallo, I just followed the discussion of the title: minimum from matrix. I have a similar problem. a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4) a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) minOfColumns [1] 1 1 2 4 maxOfColumns=apply(a, 2, function(x) max(x) ) maxOfColumns [1] 2 3 5 4 How looks like the command for the minimum of the rows? I tried several possibilities with apply but did not get the wanted result. The result should be minOfRows = 0 0 0 0 1 -- View this message in context: http://www.nabble.com/minimum-of-each-row-in-a-matrix-tf3709616.html#a10375966 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum of each row in a matrix
Corinna, what about reading my previous mail, which you suggested to forget? Gabor ps. it is apply(a, 1, min) On Tue, May 08, 2007 at 03:21:26PM +0200, Schmitt, Corinna wrote: Hallo, I just followed the discussion of the title: minimum from matrix. I have a similar problem. a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4) a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) minOfColumns [1] 1 1 2 4 maxOfColumns=apply(a, 2, function(x) max(x) ) maxOfColumns [1] 2 3 5 4 How looks like the command for the minimum of the rows? I tried several possibilities with apply but did not get the wanted result. The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plots - scatterplots - find index of a point in a list
Hi, is it possible to find the index of a point of a plot (e.g. scatterplot) in an easy way? Eg. x - c(1:5); y - c(1:5); plot(x, y); On the plot if I move my cursor on top of a point or click on it is it possible to have its index printed or its exact value? Any clues? Thanks. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum of each row in a matrix
Hi Peter, thanks for the quick help. Yes it does what I want. Thanks, Corinna Von: Peter Konings [mailto:[EMAIL PROTECTED] Gesendet: Dienstag, 8. Mai 2007 15:29 An: Schmitt, Corinna Betreff: Re: [R] minimum of each row in a matrix Hi Corinna, does apply(a, 1, min) do what you want? HTH Peter. On 5/8/07, Schmitt, Corinna [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Hallo, I just followed the discussion of the title: minimum from matrix. I have a similar problem. a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4) a=rbind(a,1:4) a [,1] [,2] [,3] [,4] [1,]0120 [2,]2304 [3,]0030 [4,]0350 [5,]1234 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) minOfColumns [1] 1 1 2 4 maxOfColumns=apply(a, 2, function(x) max(x) ) maxOfColumns [1] 2 3 5 4 How looks like the command for the minimum of the rows? I tried several possibilities with apply but did not get the wanted result. The result should be minOfRows = 0 0 0 0 1 Thanks, Corinna __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot time series
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plots - scatterplots - find index of a point in a list
Try ?locator --- [EMAIL PROTECTED] wrote: Hi, is it possible to find the index of a point of a plot (e.g. scatterplot) in an easy way? Eg. x - c(1:5); y - c(1:5); plot(x, y); On the plot if I move my cursor on top of a point or click on it is it possible to have its index printed or its exact value? Any clues? Thanks. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot time series
Hi Jessica [EMAIL PROTECTED] wrote: __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. your email is not really very informative. Did you read the posting guide? As suggested there, did you read the relevant section in 'An Introduction to R' (hint: did you look at section 12.1.1?)? If you have still problems after these steps, it might be useful to provide a commented, minimal, self-contained, reproducible code example. Best, Roland __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
Sorry, you did not explain that your weights correspond to your frequency in the original post. I assumed they were repeated measurements with within group variation. I was merely responding to your query why the following differed. summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) Let me also clarify my statement about artificial. If one treats repeated observations as independent, then they obtain estimates with inflated precision. I was not calling your data artificial in any way. Using frequency as weights may be valid. Your data points appear to arise from discrete distribution, so I am not entirely sure if you can use the linear model which assumes the errors are normally distributed. Regards, Adai hadley wickham wrote: On 5/8/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote: See below. hadley wickham wrote: Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) You assign weights to different points according to some external quality or reliability measure not number of times the data point was measured. That is one type of weighting - but what if I have already aggregated data? That is a perfectly valid type of weighting too. Look at the estimates and standard error of the two models below: coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) ) Estimate Std. Error t value Pr(|t|) (Intercept) 1.940765 3.30348066 0.587491 5.582252e-01 x 0.982610 0.05893262 16.673448 2.264258e-30 coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) ) Estimate Std. Errort value Pr(|t|) (Intercept) 1.940765 2.32408609 0.8350659 4.046871e-01 x 0.982610 0.04146066 23.6998165 1.012067e-59 You can see that they have same coefficient estimates but the second one has smaller variances. The repeated values artificially deflates the variance and thus inflates the precision. This is why you cannot treat replicate data as independent observations. Hardly artificially - I have repeated observations. would be equivalent, but they are not. I suspect the difference is how the degrees of freedom is calculated - I had expected it to be sum(weights), but seems to be sum(weights 0). This seems unintuitive to me: summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50))) What am I missing? And what is the usual way to do a linear regression when you have aggregated data? I would be best to use the individual data points instead of aggregated data as it allows you to estimate the within-group variations as well. There is no within group variation - these are observations that occur with same values many times in the dataset, so have been aggregated into the a contingency table-like format. If you had individual data points, you could try something as follows. Please check the codes as I am no expert in the area of repeated measures. x - runif(100, 0, 100) y1 - x + rnorm(100, mean=1, sd=15) y2 - y1 + rnorm(100, sd=5) df - data.frame( y=c(y1, y2), x=c(x,x), subject=factor(rep( paste(p, 1:100, sep=), 2 ) )) library(nlme) summary( lme( y ~ x, random = ~ 1 | subject, data=df ) ) Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related material for more information. Hope this helps. I'm not interested in a mixed model, and I don't have individual data points. Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting informative error messages
Hi, Tony: Are you familiar with the 'debug' command? I agree that more informative error messages and 'traceback' would be nice, but I've found the 'debug' facility quite useful. [I even sometimes prepare a shell of a function 'fn', then say debug(fn) and fn(), and complete writing the function in its native environment where I can more easily check what each step does.] I've heard that 'debug' does not work will with S4 class generics, but I have not so far had to deal with that. {There is also a 'debug' package, which is completely separate from the debug command in the 'base' package. I've heard that it has more extensive capabilities, but I've never used it.} I suspect you may already know 'debug', but for those who don't, I think it's worth noting its utility for this kind of thing. Hope this helps. Spencer Graves Tony Plate wrote: Certain errors seem to generate messages that are less informative than most -- they just tell you which function an error happened in, but don't indicate which line or expression the error occurred in. Here's a toy example: f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)} options(error=NULL) f(1:3) Error in f(1:3) : invalid subscript type traceback() 1: f(1:3) In this function, it's clear that the error is in subscripting 'x', but it's not always so immediately obvious in lengthier functions. Is there anything I can do to get a more informative error message in this type of situation? I couldn't find any help in the section Debugging R Code in R-exts (or anything at all relevant in R-intro). (Different values for options(error=...) and different formatting of the function made no difference.) -- Tony Plate sessionInfo() R version 2.5.0 (2007-04-23) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: tap.misc 1.0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bad optimization solution
Paul, The problem lies neither with R nor with numercial methods. The onus is always on the user to understand what the numerical schemes can do and what they can't do. One should never blindly take the results given by a numerical scheme and run with it. In your example, the optimization method is doing what it was designed to do: to find a critical point of a function where the gradient is zero. It is your responsibility to ensure that the result makes sense, and if it doesn't, to understand why it doesn't make sense. In your problem, maxima ((1,0) and (0,1)) lie on the boundary of the parameter space, and the gradient at the maxima (defined as the limit from within the domain) are clearly not zero. Another problem with your example is that the hessian for your function is singular, it has eigenvalues of 0 and 2. In short, there is no substitute to using your analytic powers! Ravi. - Original Message - From: Paul Smith [EMAIL PROTECTED] Date: Tuesday, May 8, 2007 4:33 am Subject: Re: [R] Bad optimization solution To: R-help r-help@stat.math.ethz.ch It seems that there is here a problem of reliability, as one never knows whether the solution provided by R is correct or not. In the case that I reported, it is fairly simple to see that the solution provided by R (without any warning!) is incorrect, but, in general, that is not so simple and one may take a wrong solution as a correct one. Paul On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote: Your function, (x1-x2)^2, has zero gradient at all the starting values such that x1 = x2, which means that the gradient-based search methods will terminate there because they have found a critical point, i.e. a point at which the gradient is zero (which can be a maximum or a minimum or a saddle point). However, I do not why optim converges to the boundary maximum, when analytic gradient is supplied (as shown by Sundar). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: -Original Message- From: [EMAIL PROTECTED] [ On Behalf Of Paul Smith Sent: Monday, May 07, 2007 6:26 PM To: R-help Subject: Re: [R] Bad optimization solution On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote: I think the problem is the starting point. I do not remember the details of the BFGS method, but I am almost sure the (.5, .5) starting point is suspect, since the abs function is not differentiable at 0. If you perturb the starting point even slightly you will have no problem. Paul Smith [EMAIL PROTECTED] To Sent by: R-help r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] Bad optimization solution 05/07/2007 04:30 PM Dear All I am trying to perform the below optimization problem, but getting (0.5,0.5) as optimal solution, which is wrong; the correct solution should be (1,0) or (0,1). Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux). Thanks in advance, Paul -- myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Yes, with (0.2,0.9), a correct solution comes out. However, how can one be sure in general that the solution obtained by optim is correct? In ?optim says: Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows _box constraints_, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning. which only demands that the initial value must satisfy the constraints. Furthermore, X^2 is everywhere differentiable and notwithstanding the reported problem occurs with myfunc - function(x) { x1 - x[1] x2 - x[2] (x1-x2)^2 } optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Paul
[R] strange behavior in data frames with duplicated column names
Dear R gurus, There is an interesting problem with accessing specific items in a column of data frame that has incorrectly been given a duplicate name, even though addressing the item by row and column number. Although the column is correctly listed, an item addressed by row and column number gives the item with the correct row and the original not the duplicated column. Here are the instructions to get this problem x - matrix(seq(1:12),ncol=3) colnames(x) - c(A,B,A) #a redundant name for column 2 x.df - data.frame(x) x.df#the redundant name is corrected x.df[,3]#show the column -- this always works x.df[2,3] #this works here #now incorrectly label the columns with a duplicate name colnames(x.df) - c(A,B,A) #the redundant name is not detected x.df x.df[,3] #this works as above and shows the column x.df[2,3]#but this gives the value of the first column, not the third --- rownames(x.df) - c(First,Second,Third,Third) #detects duplicate name x.df x.df[4,] #correct second row and corrected column names! x.df[4,3]#wrong column x.df #still has the original names with the duplication and corresponding output: x - matrix(seq(1:12),ncol=3) colnames(x) - c(A,B,A) #a redundant name for column 2 x.df - data.frame(x) x.df#the redundant name is corrected A B A.1 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[,3]#show the column -- this always works [1] 9 10 11 12 x.df[2,3] #this works here [1] 10 #now incorrectly label the columns with a duplicate name colnames(x.df) - c(A,B,A) #the redundant name is not detected x.df A B A 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[,3] #this works as above and shows the column [1] 9 10 11 12 x.df[2,3]#but this gives the value of the first column, not the third --- [1] 2 rownames(x.df) - c(First,Second,Third,Third) #detects duplicate name Error in `row.names-.data.frame`(`*tmp*`, value = c(First, Second, : duplicate 'row.names' are not allowed x.df A B A 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[4,] #correct second row and corrected column names! A B A.1 4 4 8 12 x.df[4,3]#wrong column [1] 4 x.df #still has the original names with the duplication unlist(R.Version()) platform archos i386-apple-darwin8.9.1 i386 darwin8.9.1 system status major i386, darwin8.9.1 Patched 2 minor year month 5.0 2007 04 day svn rev language 25 41315 R version.string R version 2.5.0 Patched (2007-04-25 r41315) Bill -- William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org/personality.html Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for statistics: http://personality-project.org/r __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Looking for a comprehensive descriptive statistics package
Hi all, I'm looking for a package that will give me comprehensive set of descriptive statistics, including skew and kurtosis. Also, is there a similar package that will provide multivariate descriptive statistics as well? Thanks in advance, David -- === David Kaplan, Ph.D. Professor Department of Educational Psychology University of Wisconsin - Madison Educational Sciences, Room, 1061 1025 W. Johnson Street Madison, WI 53706 email: [EMAIL PROTECTED] homepage: http://www.education.wisc.edu/edpsych/facstaff/kaplan/kaplan.htm Phone: 608-262-0836 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
Dear Hadley, I think that the problem is that the term weights has different meanings, which, although they are related, are not quite the same. The weights used by lm() are (inverse-)variance weights, reflecting the variances of the errors, with observations that have low-variance errors therefore being accorded greater weight in the resulting WLS regression. What you have are sometimes called case weights, and I'm unaware of a general way of handling them in R, although you could regenerate the unaggregated data. As you discovered, you get the same coefficients with case weights as with variance weights, but different standard errors. Finally, there are sampling weights, which are inversely proportional to the probability of selection; these are accommodated by the survey package. To complicate matters, this terminology isn't entirely standard. I hope this helps, John John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham Sent: Tuesday, May 08, 2007 5:09 AM To: R Help Subject: [R] Weighted least squares Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + rnorm(100, sd=15) I had expected that: summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y ~ x, data=rbind(df,df))) would be equivalent, but they are not. I suspect the difference is how the degrees of freedom is calculated - I had expected it to be sum(weights), but seems to be sum(weights 0). This seems unintuitive to me: summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50))) What am I missing? And what is the usual way to do a linear regression when you have aggregated data? Thanks, Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plots - scatterplots - find index of a point in a list
On 08/05/07, John Kane [EMAIL PROTECTED] wrote: Try ?locator Thanks. Your tip also lead to another function: ?identify to add my two cents. --- [EMAIL PROTECTED] wrote: Hi, is it possible to find the index of a point of a plot (e.g. scatterplot) in an easy way? Eg. x - c(1:5); y - c(1:5); plot(x, y); On the plot if I move my cursor on top of a point or click on it is it possible to have its index printed or its exact value? Any clues? Thanks. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Be smarter than spam. See how smart SpamGuard is at giving junk email the boot with the All-new Yahoo! Mail at http://mrd.mail.yahoo.com/try_beta?.intl=ca [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bad optimization solution
Paul, You have picked a function that is not smoothly differentiable and also started at one of many 'stationary' points in a system with multiple solutions. In practice, I think it'll get a zero gradient as the algorithm does things numerically and you have a symmetric function. It probably then chooses gradient-related step sizes of zero and goes nowhere, converging instantly. The same happens at (0.1,0.1) and anywhere else along x=y. The problem affects pretty much all gradient-only algorithms handed stationary points in a symmetric function. Solution? Ermm.. don't do that with a gradient method, I suspect, though wiser heads may have more to say on the topic. S Paul Smith [EMAIL PROTECTED] 07/05/2007 22:30:32 Dear All I am trying to perform the below optimization problem, but getting (0.5,0.5) as optimal solution, which is wrong; the correct solution should be (1,0) or (0,1). Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux). Thanks in advance, Paul -- myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. *** This email and any attachments are confidential. Any use, co...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bad optimization solution
Thanks, Ravi, for your clear explanation! Paul On 5/8/07, RAVI VARADHAN [EMAIL PROTECTED] wrote: Paul, The problem lies neither with R nor with numercial methods. The onus is always on the user to understand what the numerical schemes can do and what they can't do. One should never blindly take the results given by a numerical scheme and run with it. In your example, the optimization method is doing what it was designed to do: to find a critical point of a function where the gradient is zero. It is your responsibility to ensure that the result makes sense, and if it doesn't, to understand why it doesn't make sense. In your problem, maxima ((1,0) and (0,1)) lie on the boundary of the parameter space, and the gradient at the maxima (defined as the limit from within the domain) are clearly not zero. Another problem with your example is that the hessian for your function is singular, it has eigenvalues of 0 and 2. In short, there is no substitute to using your analytic powers! Ravi. - Original Message - From: Paul Smith [EMAIL PROTECTED] Date: Tuesday, May 8, 2007 4:33 am Subject: Re: [R] Bad optimization solution To: R-help r-help@stat.math.ethz.ch It seems that there is here a problem of reliability, as one never knows whether the solution provided by R is correct or not. In the case that I reported, it is fairly simple to see that the solution provided by R (without any warning!) is incorrect, but, in general, that is not so simple and one may take a wrong solution as a correct one. Paul On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote: Your function, (x1-x2)^2, has zero gradient at all the starting values such that x1 = x2, which means that the gradient-based search methods will terminate there because they have found a critical point, i.e. a point at which the gradient is zero (which can be a maximum or a minimum or a saddle point). However, I do not why optim converges to the boundary maximum, when analytic gradient is supplied (as shown by Sundar). Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: -Original Message- From: [EMAIL PROTECTED] [ On Behalf Of Paul Smith Sent: Monday, May 07, 2007 6:26 PM To: R-help Subject: Re: [R] Bad optimization solution On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote: I think the problem is the starting point. I do not remember the details of the BFGS method, but I am almost sure the (.5, .5) starting point is suspect, since the abs function is not differentiable at 0. If you perturb the starting point even slightly you will have no problem. Paul Smith [EMAIL PROTECTED] To Sent by: R-help r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] Bad optimization solution 05/07/2007 04:30 PM Dear All I am trying to perform the below optimization problem, but getting (0.5,0.5) as optimal solution, which is wrong; the correct solution should be (1,0) or (0,1). Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux). Thanks in advance, Paul -- myfunc - function(x) { x1 - x[1] x2 - x[2] abs(x1-x2) } optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control= list(fnscale=-1)) Yes, with (0.2,0.9), a correct solution comes out. However, how can one be sure in general that the solution obtained by optim is correct? In ?optim says: Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows _box constraints_, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints. This uses a limited-memory modification of the BFGS quasi-Newton method. If non-trivial bounds are supplied, this method will be selected, with a warning. which only demands that the initial value must satisfy the constraints. Furthermore, X^2 is everywhere differentiable and
Re: [R] Weighted least squares
Doubling the length of the data doubles the apparent number of observations. You would expect the standard error to reduce by sqrt(2) (which it just about does, though I'm not clear on why its not exact here) Weights are not as simple as they look. You have given all your data the same weight, so the answer is independent of the weights (!). Try again with weights=rep(4,100) etc. Equal weights simply cancel out in the lm process. In fact, some linear regression algorithms rescale all weights to sum to 1; in others, weights are scaled to average 1; done 'naturally' the weights simply appear in two places which cancel out in the final covariance matrix calculation (eg in the weighted 'residual sd' and in the hessian for the chi-squared function, if I remember correctly). Bottom line - equal weights make no difference in lm, so choose what you like. 1 is a good number, though. Steve e hadley wickham [EMAIL PROTECTED] 08/05/2007 10:08:34 Dear all, I'm struggling with weighted least squares, where something that I had assumed to be true appears not to be the case. Take the following data set as an example: *** This email and any attachments are confidential. Any use, co...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting informative error messages
It is not clear to me what you want here. Errors are tagged by a 'call', and f(1:3) is the innermost 'call' (special primitives do not set a context and so do not count if you consider '[' to be a function). The message could tell you what the type was, but it does not and we have lost the pool of active contributors we once had to submit tested patches for things like that. On Mon, 7 May 2007, Tony Plate wrote: Certain errors seem to generate messages that are less informative than most -- they just tell you which function an error happened in, but don't indicate which line or expression the error occurred in. Here's a toy example: f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)} options(error=NULL) f(1:3) Error in f(1:3) : invalid subscript type traceback() 1: f(1:3) In this function, it's clear that the error is in subscripting 'x', but it's not always so immediately obvious in lengthier functions. Is there anything I can do to get a more informative error message in this type of situation? I couldn't find any help in the section Debugging R Code in R-exts (or anything at all relevant in R-intro). (Different values for options(error=...) and different formatting of the function made no difference.) -- Tony Plate sessionInfo() R version 2.5.0 (2007-04-23) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: tap.misc 1.0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] The match of ave() for FUN=SD
Hello, I have many subsets of x. I want to get the standard deviation for each subset with the same factor levels. For FUN=mean and FUN=median I am using ave(). Can anybody tell me the match of ave() for using FUN=SD? At the beginning I used aggregate(), also for mean and median. But aggregegate make arithmetical errors in computing huge records. Thank's a lot. Felix R-CODE: --- MEAN - ave(INPUT[,3], INPUT[,1], INPUT[,2], FUN = mean) mMEAN - matrix( c(INPUT[,1], INPUT[,2], MEAN), ncol=3, byrow=FALSE) SD - na.omit( aggregate(INPUT[,3], by=list(INPUT[,2],INPUT[,1] ), FUN=sd ) ) mSD - matrix( c( SD[,2],SD[,1], SD[,3]), ncol=3, byrow=FALSE) mSD[,1] - (mSD[,1] - 1 )#eleminate the converting-fault list- matrix mSD[,2] - (mSD[,2] - 1 ) / 10 #eleminate the converting-fault list- matrix the arithmetical errors: list - matrix 0 - 11 1 - 2 2.1 - 22 3.5 - 36 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question on bivariate GEE fit
Hi, I have a bivariate longitudinal dataset. As an example say, i have the data frame with column names var1 var2 Unit time trt (trt represents the treatment) Now suppose I want to fit a joint model of the form for the *i* th unit var1jk = alpha1 + beta1*timejk + gamma1* trtjk + delta1* timejk:trtjk + error1jk var2 = alpha2 + beta2*timejk + gamma2* trtjk + delta2* timejk:trtjk + error2jk where j index time and k index the treatment received Using indicator variables I have been able to fit and run the code for a bivariate model using unstructured covariance matrix. However, I want to fit a model for a structured variance covariance matrix. The error structure for the grouping unit is as follows sigma = ( sigma1 sigma12 ) ( sigma12 sigma2) sigma1, sigma2 and sigma12 are matrices with where sigma1 = sig1 * AR1(rho1) sigma2 = sig2* AR1(rho2) sigma12 = sig12 * AR1(rho12) My question is whether there is any method to fit such data using packages like gee or geepack (or may be any other package ) in R. The function genZcor() of geepack can be used to construct correlation but I have been unable to use it in the present context. Any help is greatly appreciated. Regards Souvik Banerjee Lecturer Department of statistics Memari College Burdwan India [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot time series
Dear All, I sended my first mail as HTML by accident. It has probably been stripped off... (see first mail below) During that time, I was actually able to find a solution to my problem : I wanted to plot times on a graph representing precipitation=f(time) here is an example: time-c(2004-10-18 17:20:00,2004-10-18 17:50:00 ,2004-10-18 18:40:00,2004-10-18 19:50:00,2004-10-18 20:00:00 ,2004-10-18 20:10:00,2004-10-18 21:20:00 ,2004-10-19 22:00:00 ,2004-10-20 23:40:00) precipitation-c(0.1,0.5,0.0,0.8,1,2,5,9,1) tt-strptime(time,%Y-%m-%d %H:%M:%S) plot(tt,precipitation,xlab=time, xaxt=n) r-as.POSIXct(round(range(tt),hours)) axis.POSIXct(1,tt,at=seq(r[1],r[2],by=hour),format=%Y-%m-%d %H:%M) If you have better solution, I would be happy to know them, Thanks in advance, Jessica -Jessica Gervais/CRTE/TUDOR wrote: - To: R-help@stat.math.ethz.ch From: Jessica Gervais/CRTE/TUDOR Date: 05/08/2007 03:37PM Subject: plot time series Dear all, I have a question concerning plotting time measurements. I have a time serie which record precipitation at different time steps for different meteo sations. Data are stored into a table : first column is time (time steps between 2 measurement are variables) secondcolumn is the measurement I would like to plot precipitation=f(time) and write time in axis. I can not use the plot.ts function as time steps between 2 measurements are variables. I don't want to do a plot.ts as time steps are variable and also as I would like the different times to be written on the x-axis.I also would like space between each time step on the x-axis to be representativ of the real amount of time. Here is a example time-c(2004-10-18 17:20:00,2004-10-18 17:50:00 ,2004-10-18 18:40:00,2004-10-18 19:50:00,2004-10-18 20:00:00 ,2004-10-18 20:10:00,2004-10-18 21:20:00 ,2004-10-18 22:00:00 ,2004-10-18 23:40:00) (is it automatically recognized as a time format?) precipitation-c(0.1,0.5,0.0,0.8,1,2,5,0.2,3) plot(time,precipitation) Error in plot.window(xlim, ylim, log, asp, ...) : need finite 'xlim' values In addition: Warning messages: 1: NAs introduced by coercion 2: no non-missing arguments to min; returning Inf 3: no non-missing arguments to max; returning -Inf Do anyone knows how to plot this kind os time dependant datas ? Thanks in advance, Regards, Jessica __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with systemfit
Hi - I encounter two problems with SystemFit. I have a matrix of 20 variables (38 observations each). I am trying to fit using 2SLS in Systemfit because I want to be able to simulate new observations with the resulting model. The first problem I found is that it ran out of memory given that I have 2GB of RAM. How can I circumvent this? Can I do some mixing or bootstrapping with small sample each time instead?Another problem is I got system is computationally singular when I reduce the size of observation to 10 instead of 38. Any help would be appreciated. I think my variables are not really well correlated to that extent. adschai [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
Thanks it worked. Oarabile Gabor Csardi wrote: apply(a, 2, function(x) min(x[x!=0]) ) should do it. Might need some improvement if all numbers in a column can be zero, try it. Gabor On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote: I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum from matrix
Thanks it worked Oarabile Dimitris Rizopoulos wrote: try this: apply(a, 2, function(x) min(x[x 0])) I hope it helps. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Tuesday, May 08, 2007 10:50 AM Subject: [R] minimum from matrix I have a very large matrix with columns that have some of their entries as zero A small example if a= [,1] [,2] [,3] [,4] [,1] 0 2 0 0 [,2] 1 3 0 3 [,3] 2 0 3 5 [,4] 0 4 0 0 and what to get the minimum number from each column but that number should not be zero. If I use apply (a,2,min) I will get a vector of zeros as the minimum but what I want it for example from column 1 I should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I wonder if someone can give an idea on how to go about it. thanks in advance for your help. Oarabile __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help with memory problem in SystemFit
Hi - I encounter two problems with SystemFit. I have a matrix of 20 variables (38 observations each). I am trying to fit using 2SLS in Systemfit because I want to be able to simulate new observations with the resulting model. The first problem I found is that it ran out of memory given that I have 2GB of RAM. How can I circumvent this? Can I do some mixing or bootstrapping with small sample each time instead?Another problem is I got system is computationally singular when I reduce the size of observation to 10 instead of 38. Any help would be appreciated. I think my variables are not really well correlated to that extent. adschai [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ordered logistic regression with random effects. Howto?
Paul, I believe the model you describe below can be fitted in GENMOD and GLIMMIX in SAS. Alternatively, as Brian Ripley suggests, you could use MCMC. BUGS has a nice example of a multinomial logit model in the second example manual. While this example considers only fixed effects, it's not difficult to extend the model to include a random effect (see the 'Seeds' example in the first example manual). Regards, -Cody Paul Johnson [EMAIL PROTECTED] .com To Sent by: r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] ordered logistic regression 05/07/2007 07:03 with random effects. Howto? PM I'd like to estimate an ordinal logistic regression with a random effect for a grouping variable. I do not find a pre-packaged algorithm for this. I've found methods glmmML (package: glmmML) and lmer (package: lme4) both work fine with dichotomous dependent variables. I'd like a model similar to polr (package: MASS) or lrm (package: Design) that allows random effects. I was thinking there might be a trick that might allow me to use a program written for a dichotomous dependent variable with a mixed effect to estimate such a model. The proportional odds logistic regression is often written as a sequence of dichotomous comparisons. But it seems to me that, if it would work, then somebody would have proposed it already. I've found some commentary about methods of fitting ordinal logistic regression with other procedures, however, and I would like to ask for your advice and experience with this. In this article, Ching-Fan Sheu, Fitting mixed-effects models for repeated ordinal outcomes with the NLMIXED procedure Behavior Research Methods, Instruments, Computers, 2002, 34(2): 151-157. the other gives an approach that works in SAS's NLMIXED procedure. In this approach, one explicitly writes down the probability that each level will be achieved (using the linear predictor and constants for each level). I THINK I could find a way to translate this approach into a model that can be fitted with either nlme or lmer. Has someone done it? What other strategies for fitting mixed ordinal models are there in R? Finally, a definitional question. Is a multi-category logistic regression (either ordered or not) a member of the glm family? I had thought the answer is no, mainly because glm and other R functions for glms never mention multi-category qualitative dependent variables and also because the distribution does not seem to fall into the exponential family. However, some textbooks do include the multicategory models in the GLM treatment. -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a new column
thanks for the help But l ened up using act.surv.time-pmin(m[,censoringTime],m[,survivalTime]) m-cbind(m,act.surv.time) which seems to work thanks Schmitt, Corinna [EMAIL PROTECTED] wrote: Hallo, I just tried, if my solution might be possible for you. Here is the result: s=2 while(s!=0){ n=20 + m-matrix(nrow=n,ncol=4) + colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime) + for(i in 1:20) m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002)) + m-cbind(m,0) + m[m[,3]m[,4],5]-1 + colnames(m)[5]-censoring + print(m) +s=s-1 + } #bilding a data frame from m x=data.frame(m) # adding a column, where nrow(x) = number of row in x and in the # new coulmn should stand the entries 1...20. x=data.frame(x,actual surv time=c(1:nrow(x))) x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 02 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141306.14860 0 20 See it works! For more information of data frames, see ?data.frame Here are some additional examples. x[2,6] # Extraction of special entries: x[2,6]= 2 - row 2, column 6 [1] 2 x[2,6]=100 # Changing entires: x[2,6]=100 x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 0 100 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141306.14860 0 20 t=x row2=t[-2,] # deleting a row row2 treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 0
Re: [R] plot time series
Dear all, I actually would like to improve the label orientation on the x-axis (turn them to 45 degrees) I tried the par(las=2) ... but doesn't work... Do anyone knows how to do ? Jessica __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
Hadley, You asked .. what is the usual way to do a linear regression when you have aggregated data? Least squares generally uses inverse variance weighting. For aggregated data fitted as mean values, you just need the variances for the _means_. So if you have individual means x_i and sd's s_i that arise from aggregated data with n_i observations in group i, the natural weighting is by inverse squared standard error of the mean. The appropriate weight for x_i would then be n_i/(s_i^2). In R, that's n/(s^2), as n and s would be vectors with the same length as x. If all the groups had the same variance, or nearly so, s is a scalar; if they have the same number of observations, n is a scalar. Of course, if they have the same variance and same number of observations, they all have the same weight and you needn't weight them at all: see previous posting! Steve E *** This email and any attachments are confidential. Any use, co...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] statistics/correlation question NOT R question
This is not an R question but if anyone can help me, it's much appreciated. Suppose I have a series ( stationary ) y_t and a series x_t ( stationary )and x_t has variance sigma^2_x and epsilon is normal (0, sigma^2_epsilon ) and the two series have the relation y_t = Beta*x_t + epsilon My question is if there are particular values that sigma^2_x and sigma^2_epsilon have to take in order for corr(x_t,y_t) to equal Beta ? I attempted to figure this out using two different methods and in one case I end up involving sigma^2_epsilon and in the other I don't and I'm not sure if either method is correct. I think I need to use results form the conditional bivariate normal but i'm really not sure. Also, it's not a homework problem because I am too old to have homework. Thanks for any insights/solutions. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting informative error messages
Dear Prof. Ripley: 1. I very much appreciate your contributions to the R project. 2. Whether with release 2.4.0 or earlier, I noticed that 'traceback()' had become less informative. This loss was more than offset when I learned to use the 'debug' function in the 'base' package: It increased my productivity so much that it helped complete my transition from S-Plus: The last few times I had to use S-Plus, I ported them to R, got the code working using 'debug', then ported the results back to S-Plus. That was quicker for me than debugging directly in S-Plus. 3. Thanks again for your many contributions to the R project and to my education more generally. Best Wishes, Spencer Graves Prof Brian Ripley wrote: It is not clear to me what you want here. Errors are tagged by a 'call', and f(1:3) is the innermost 'call' (special primitives do not set a context and so do not count if you consider '[' to be a function). The message could tell you what the type was, but it does not and we have lost the pool of active contributors we once had to submit tested patches for things like that. On Mon, 7 May 2007, Tony Plate wrote: Certain errors seem to generate messages that are less informative than most -- they just tell you which function an error happened in, but don't indicate which line or expression the error occurred in. Here's a toy example: f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)} options(error=NULL) f(1:3) Error in f(1:3) : invalid subscript type traceback() 1: f(1:3) In this function, it's clear that the error is in subscripting 'x', but it's not always so immediately obvious in lengthier functions. Is there anything I can do to get a more informative error message in this type of situation? I couldn't find any help in the section Debugging R Code in R-exts (or anything at all relevant in R-intro). (Different values for options(error=...) and different formatting of the function made no difference.) -- Tony Plate sessionInfo() R version 2.5.0 (2007-04-23) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: tap.misc 1.0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] censoring
in R when carring out the log rank test is the censored variable denoted by 1 or 0 or its of no consequence. thanks - always stay connected to friends. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] strange behavior in data frames with duplicated column names
First, you should not be using colnames-, which is for a matrix, on a data frame. Use names- for data frames (and as.data.frame to convert to a data frame). Second, whereas duplicate row names are not allowed in a data frame, duplicate column names are but at your own risk. Third, there is a 'optimization too far' here which I will change in 2.5.0 patched. Often with R development there is a tradeoff between speed and generality. On Tue, 8 May 2007, William Revelle wrote: Dear R gurus, There is an interesting problem with accessing specific items in a column of data frame that has incorrectly been given a duplicate name, even though addressing the item by row and column number. Although the column is correctly listed, an item addressed by row and column number gives the item with the correct row and the original not the duplicated column. Here are the instructions to get this problem x - matrix(seq(1:12),ncol=3) colnames(x) - c(A,B,A) #a redundant name for column 2 x.df - data.frame(x) x.df#the redundant name is corrected x.df[,3]#show the column -- this always works x.df[2,3] #this works here #now incorrectly label the columns with a duplicate name colnames(x.df) - c(A,B,A) #the redundant name is not detected x.df x.df[,3] #this works as above and shows the column x.df[2,3]#but this gives the value of the first column, not the third --- rownames(x.df) - c(First,Second,Third,Third) #detects duplicate name x.df x.df[4,] #correct second row and corrected column names! x.df[4,3]#wrong column x.df #still has the original names with the duplication and corresponding output: x - matrix(seq(1:12),ncol=3) colnames(x) - c(A,B,A) #a redundant name for column 2 x.df - data.frame(x) x.df#the redundant name is corrected A B A.1 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[,3]#show the column -- this always works [1] 9 10 11 12 x.df[2,3] #this works here [1] 10 #now incorrectly label the columns with a duplicate name colnames(x.df) - c(A,B,A) #the redundant name is not detected x.df A B A 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[,3] #this works as above and shows the column [1] 9 10 11 12 x.df[2,3]#but this gives the value of the first column, not the third --- [1] 2 rownames(x.df) - c(First,Second,Third,Third) #detects duplicate name Error in `row.names-.data.frame`(`*tmp*`, value = c(First, Second, : duplicate 'row.names' are not allowed x.df A B A 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 x.df[4,] #correct second row and corrected column names! A B A.1 4 4 8 12 x.df[4,3]#wrong column [1] 4 x.df #still has the original names with the duplication unlist(R.Version()) platform archos i386-apple-darwin8.9.1 i386 darwin8.9.1 system status major i386, darwin8.9.1 Patched 2 minor year month 5.0 2007 04 day svn rev language 25 41315 R version.string R version 2.5.0 Patched (2007-04-25 r41315) Bill -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] statistics/correlation question NOT R question
Mark, I suppose you make the usual assumptions, ie. E[x]=0, E[x*epsilon]=0, the correlation is just simply, corr(x,y) = beta * ( var(x) / var(y) ) And you could get var(y) from var(x) and var(epsilon). HTH. Horace Leeds, Mark (IED) [EMAIL PROTECTED] 5/8/2007 10:25:11 AM This is not an R question but if anyone can help me, it's much appreciated. Suppose I have a series ( stationary ) y_t and a series x_t ( stationary )and x_t has variance sigma^2_x and epsilon is normal (0, sigma^2_epsilon ) and the two series have the relation y_t = Beta*x_t + epsilon My question is if there are particular values that sigma^2_x and sigma^2_epsilon have to take in order for corr(x_t,y_t) to equal Beta ? I attempted to figure this out using two different methods and in one case I end up involving sigma^2_epsilon and in the other I don't and I'm not sure if either method is correct. I think I need to use results form the conditional bivariate normal but i'm really not sure. Also, it's not a homework problem because I am too old to have homework. Thanks for any insights/solutions. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] censoring
Well, I guess it makes quite a difference in survival analysis whether you know that a person was alive/censored or experienced the event of interest at a certain point of time/age. You could have tried it easily for yourself by slightly modifying the example on the help page of 'survdiff'. library(survival) survdiff(Surv(futime, fustat) ~ rx,data=ovarian) survdiff(Surv(futime, 1-fustat) ~ rx,data=ovarian) If you want to work more in survival analysis, I can recommend the book John P. Klein / Melvin L. Moeschberger (2003): Survival Analysis. Techniques for Censored and Truncated Data. Springer. (but it gives no recipes in R). Hope this helps, Roland raymond chiruka wrote: in R when carring out the log rank test is the censored variable denoted by 1 or 0 or its of no consequence. thanks - always stay connected to friends. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a comprehensive descriptive statistics package
Hi, summary(yourdata) is often a very good starting point for descriptive data statistics. Or you can write your own little function which returns what you actually like to see (the code below was written very quickly. No care is taken for the presence of missing values or anything else). exampledata - rnorm(1) summary(exampledata) desc - function(mydata) { require(e1071) quantls - quantile(x=mydata, probs=seq(from=0, to=1, by=0.25)) themean - mean(mydata) thesd - sd(mydata) kurt - kurtosis(mydata) skew - skewness(mydata) retlist - list(Quantiles=quantls, Mean=themean, StandDev=thesd, Skewness=skew, Kurtosis=kurt) return(retlist) } descstats - desc(exampledata) descstats I hope this helps, Roland David Kaplan wrote: Hi all, I'm looking for a package that will give me comprehensive set of descriptive statistics, including skew and kurtosis. Also, is there a similar package that will provide multivariate descriptive statistics as well? Thanks in advance, David __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a new column
I haven't been following your discussion, but based on your method, the following should also work and would be the one-line equivalent: m$act.surv.time-pmin(m[,censoringTime],m[,survivalTime]) In other words, you can add a column by just assigning a value to a new column name. You can also use the following syntax: m$act.surv.time-pmin(m$censoringTime,m$survivalTime) I just thought you might want to see different version of accomplishing the same thing. Roger. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of raymond chiruka Sent: Tuesday, May 08, 2007 12:36 PM To: Schmitt, Corinna; r Subject: Re: [R] creating a new column thanks for the help But l ened up using act.surv.time-pmin(m[,censoringTime],m[,survivalTime]) m-cbind(m,act.surv.time) which seems to work thanks Schmitt, Corinna [EMAIL PROTECTED] wrote: Hallo, I just tried, if my solution might be possible for you. Here is the result: s=2 while(s!=0){ n=20 + m-matrix(nrow=n,ncol=4) + colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime) + for(i in 1:20) m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002)) + m-cbind(m,0) + m[m[,3]m[,4],5]-1 + colnames(m)[5]-censoring + print(m) +s=s-1 + } #bilding a data frame from m x=data.frame(m) # adding a column, where nrow(x) = number of row in x and in the # new coulmn should stand the entries 1...20. x=data.frame(x,actual surv time=c(1:nrow(x))) x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 02 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141306.14860 0 20 See it works! For more information of data frames, see ?data.frame Here are some additional examples. x[2,6] # Extraction of special entries: x[2,6]= 2 - row 2, column 6 [1] 2 x[2,6]=100 # Changing entires: x[2,6]=100 x treatmentgrp strata censoringTime survivalTime censoring actual.surv.time 1 1 2377.486125 1070.66287 01 2 1 2242.468604 1061.30474 0 100 3 1 2 40.904656 51.88263 03 4 2 2 44.025595590.15317 04 5 1 1253.093279247.32141 15 6 2 2 50.486272257.25016 06 7 1 1337.591250554.05931 07 8 2 2 74.905075873.14563 08 9 1 2 57.196581765.43142 09 101 2370.147307 1646.65368 0 10 111 2152.738532480.12355 0 11 122 2 15.313303139.19791 0 12 131 2 17.205624641.15764 0 13 142 1 81.753924107.02202 0 14 151 2 60.774221665.27500 0 15 162 1 8.712562142.90775 0 16 171 1 54.542722 1904.88060 0 17 182 2 85.626140214.66811 0 18 192 1 31.257923739.96591 0 19 201 1 85.910141
[R] Separate x-axis ticks for panels in xyplot()?
Dear r-helpers, In an xyplot I have xyplot(y ~ x | c, groups = g, scales = list( x = list( at = v1, labels = c('A', 'B', 'C', 'D'), rot = 45 ) ) ) g consists of two groups. How do I define a different set of x ticks and labels for the second panel? I'm looking for something like xyplot(y ~ x | c, groups = g, scales = list( x = list(ifelse(g == g1, { at = v1, labels = c('A', 'B', 'C', 'D'), rot = 45}, {at = v2, labels = c('E', 'F' 'G), rot = 45} ) ) ) _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O.Box 400400Charlottesville, VA 22904-4400 Parcels:Room 102Gilmer Hall McCormick RoadCharlottesville, VA 22903 Office:B011+1-434-982-4729 Lab:B019+1-434-982-4751 Fax:+1-434-982-4766 WWW:http://www.people.virginia.edu/~mk9y/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Representing a statistic as a colour on a 2d plot
Thanks people! Sorry to sound thick but after I download the poltrix package, how do i install it? Thank you! -- View this message in context: http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383035 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot time series
The short answer is that you can't. The longer answer is that you need to replace them with text. Have a look at the FAQ 7.27 How can I create rotated axis labels? It provides a pretty good example. --- [EMAIL PROTECTED] wrote: Dear all, I actually would like to improve the label orientation on the x-axis (turn them to 45 degrees) I tried the par(las=2) ... but doesn't work... Do anyone knows how to do ? Jessica __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Representing a statistic as a colour on a 2d plot
How did you download it? In Windows if you used the Packages Install packages route it is installed. Just type library(plotrix) to load it. No idea for Linux or Mac. --- mister_bluesman [EMAIL PROTECTED] wrote: Thanks people! Sorry to sound thick but after I download the poltrix package, how do i install it? Thank you! -- View this message in context: http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383035 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Representing a statistic as a colour on a 2d plot
mister_bluesman wrote: Thanks people! Sorry to sound thick but after I download the poltrix package, how do i install it? Thank you! which platform are you on? On Windows I would suggest you use the Menu of the GUI Install packages from local zip files in the main category 'packages' On Linux I usually do on the command line: R CMD INSTALL plotrix_2.2.tar.gz I hope this helps? Best, Roland P.S. I am pretty sure it is described somewhere in the included manuals. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Representing a statistic as a colour on a 2d plot
Thanks for you all your help guys. Much appreciated. -- View this message in context: http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383998 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function
I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Skipping rows and columns in large matrix
I work on Windows, R version 2.4.1 I need to leave out many rows and columns when reading a matrix. The 'skip' commands in read.table, seem to be just for skipping rows, if I understand well, but I need to skip many columns as well. I tried something very simple and left one row and one column out, but I don't know how to do it for many rows and columns at a time. Example: zzz # euclidean distances matrix X1 X2 X3 X4 X5 X10 11 12 1 15 X2 110 5 12 3 X3 125 0 8 4 X4 1 12 8 0 16 X5 153 4160 skip.data = zzz [-3, -3] # gives me the same matrix without row 3 and column 3... how can I tell R to skip more than one row and column? Columns and rows to be left out are not contiguous The matrix I'm using is fairly large (335x335), so excel won't read the whole thing, and it's not a fixed-width txt file, so I cannot use read.fwf. I also don't have access to Unix, as was suggested to other users to solve this problem. -- View this message in context: http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function
Suggestion: You might make it easier for folks to help if you explained in clear and simple terms what you are trying to do. Code is hard to deconstruct. Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED) Sent: Tuesday, May 08, 2007 2:22 PM To: r-help@stat.math.ethz.ch Subject: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate
Does anyone know of an R function for computing the Greenland-Robins variance for Mantel-Haenszel relative risks? Thanks Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] to draw a smooth arc
Prof. Ripley, Maybe the fact that few R plot regions have a 1:1 aspect ratio is not a major problem here. One has to deal with the same issue when drawing a circle parametrically. Depending on the window size, a (cos(t),sin(t)) plot appears as an ellipse. To get a circle parametrically, one has to resize the plot region, or define it (possibly by trial and error) with functions like windows() or win.graph() (package grDevices), to have a 1:1 aspect ratio. Maybe more important is to have some sure way to draw smooth arcs and arrows. Could it be done even if the plot region does not have a 1:1 aspect ratio, when the arc (intended to be an arc of a circle) would show as an arc of an ellipse? Would the smooth ellipse arc then turn into a smooth circle arc, when the plot region is resized? Paulo Barata (Rio de Janeiro - Brazil) --- Prof Brian Ripley wrote: There is now an xspline() function in R-devel, with an example showing how to add arrows. I thought a bit more about a 'circular arc' function, but there really is a problem with that. Few R plot regions have a 1:1 aspect ratio including some that are intended to do so (see the rw-FAQ). symbols() is designed to draw circles in device coordinates, but attempting to specify circular arcs by endpoints in user coordinates is fraught. On Wed, 2 May 2007, Paul Murrell wrote: Hi Paulo Barata wrote: Dr. Murrell and all, One final suggestion: a future function arc() in package graphics, with centre-radius-angle parameterisation, could also include an option to draw arrows at either end of the arc, as one can find in function arrows(). ... and in grid.xspline() and grid.curve(). Paul Thank you. Paulo Barata --- Paul Murrell wrote: Hi Paulo Barata wrote: Dr. Snow and Prof. Ripley, Dr. Snow's suggestion, using clipplot (package TeachingDemos), is maybe a partial solution to the problem of drawing an arc of a circle (as long as the line width of the arc is not that large, as pointed out by Prof. Ripley). If the arc is symmetrical around a vertical line, then it is not so difficult to draw it that way. But an arc that does not have this kind of symmetry would possibly require some geometrical computations to find the proper rectangle to be used for clipping. I would like to suggest that in a future version of R some function be included in the graphics package to draw smooth arcs with given center, radius, initial and final angles. I suppose that the basic ingredients are available in function symbols (graphics). Just to back up a few previous posts ... There is something like this facility already available via the grid.xspline() function in the grid package. This provides very flexible curve drawing (including curves very close to Bezier curves) based on the X-Splines implemented in xfig. The grid.curve() function provides a convenience layer that allows for at least certain parameterisations of arcs (you specify the arc end points and the angle). These functions are built on functionality within the core graphics engine, so exposing a similar interface (e.g., an xspline() function) within traditional graphics would be relatively straightforward. The core functionality draws the curves as line segments (but automatically figures out how many segments to use so that the curve looks smooth); it does NOT call curve-drawing primitives in the graphics device (like PostScript's curveto). In summary: there is some support for smooth curves, but we could still benefit from a specific arc() function with the standard centre-radius-angle parameterisation and we could also benefit from exposing the native strengths of different graphics devices (rather than the current lowest-common-denominator approach). Paul Thank you very much. Paulo Barata (Rio de Janeiro - Brazil) --- Prof Brian Ripley wrote: On Tue, 1 May 2007, Greg Snow wrote: Here is an approach that clips the circle you like from symbols down to an arc (this will work as long as the arc is less than half a circle, for arcs greater than half a circle, you could draw the whole circle then use this to draw an arc of the bacground color over the section you don't want): library(TeachingDemos) plot(-5:5, -5:5, type='n') clipplot( symbols(0,0,circles=2, add=TRUE), c(0,5), c(0,5) ) I had considered this approach: clipping a circle to a rectangle isn't strictly an arc, as will be clear if the line width is large. Consider clipplot(symbols(0, 0 ,circles=2, add=TRUE, lwd=5), c(-1,5), c(-1,5)) Note too that what happens with clipping is device-dependent. If R's internal clipping is used, the part-circle is converted to a polygon. __
Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function
That's a good idea : I didn't realize that my matrices would look so bad in the final email. All I want To do is output 1's in the diagonal elements and zero's everywhere else but the matrix is not square so by diagonals I Really mean if Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6) Lagnum = 2 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6) Lagnum = 3 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1 4,2),(15,3),(16,4),(17,5), (18,6) And lagnum always has to be greater than or equal to 1 and less than or equal to (number of cols/number of rows ). Thanks For your advice. -Original Message- From: Bert Gunter [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 08, 2007 5:34 PM To: Leeds, Mark (IED); r-help@stat.math.ethz.ch Subject: RE: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function Suggestion: You might make it easier for folks to help if you explained in clear and simple terms what you are trying to do. Code is hard to deconstruct. Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED) Sent: Tuesday, May 08, 2007 2:22 PM To: r-help@stat.math.ethz.ch Subject: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Skipping rows and columns in large matrix
zzz - zzz[-c(1,45,23,321), -c(23,34,45,67)] On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote: I work on Windows, R version 2.4.1 I need to leave out many rows and columns when reading a matrix. The 'skip' commands in read.table, seem to be just for skipping rows, if I understand well, but I need to skip many columns as well. I tried something very simple and left one row and one column out, but I don't know how to do it for many rows and columns at a time. Example: zzz # euclidean distances matrix X1 X2 X3 X4 X5 X10 11 12 1 15 X2 110 5 12 3 X3 125 0 8 4 X4 1 12 8 0 16 X5 153 4160 skip.data = zzz [-3, -3] # gives me the same matrix without row 3 and column 3... how can I tell R to skip more than one row and column? Columns and rows to be left out are not contiguous The matrix I'm using is fairly large (335x335), so excel won't read the whole thing, and it's not a fixed-width txt file, so I cannot use read.fwf. I also don't have access to Unix, as was suggested to other users to solve this problem. -- View this message in context: http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function
Hi Mark, Is this of any help? resMat-function(K=6,lag=2,ncol=3*K){ X-matrix(0,K,ncol) X[,1:(K*lag)]-diag(K) return(X) } Cheers, Anders. On Tuesday 08 May 2007 11:21 am, Leeds, Mark (IED) wrote: I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function
Is this what you want? n - 6 lagnum - 2 result - NULL for (i in 1:lagnum) result - cbind(result, diag(n)) result [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [1,]100000100 0 0 0 [2,]010000010 0 0 0 [3,]001000001 0 0 0 [4,]000100000 1 0 0 [5,]000010000 0 1 0 [6,]000001000 0 0 1 On 5/8/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote: That's a good idea : I didn't realize that my matrices would look so bad in the final email. All I want To do is output 1's in the diagonal elements and zero's everywhere else but the matrix is not square so by diagonals I Really mean if Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6) Lagnum = 2 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6) Lagnum = 3 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1 4,2),(15,3),(16,4),(17,5), (18,6) And lagnum always has to be greater than or equal to 1 and less than or equal to (number of cols/number of rows ). Thanks For your advice. -Original Message- From: Bert Gunter [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 08, 2007 5:34 PM To: Leeds, Mark (IED); r-help@stat.math.ethz.ch Subject: RE: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function Suggestion: You might make it easier for folks to help if you explained in clear and simple terms what you are trying to do. Code is hard to deconstruct. Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED) Sent: Tuesday, May 08, 2007 2:22 PM To: r-help@stat.math.ethz.ch Subject: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list
Re: [R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate
Would this function help: http://www.csm.ornl.gov/~frome/ES/RRMHex/MHanalysis.txt ? Regards, -Cody Frank E Harrell Jr [EMAIL PROTECTED] To bilt.edu R list r-help@stat.math.ethz.ch Sent by: cc [EMAIL PROTECTED] at.math.ethz.ch Subject [R] Mantel-Haenszel relative risk with Greenland-Robins variance 05/08/2007 02:38 estimate PM Does anyone know of an R function for computing the Greenland-Robins variance for Mantel-Haenszel relative risks? Thanks Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotting a point graph with data in X-axis
Dear all, I have two data frame, on with a complete list of my field survey with frequency data of a sample species. This data frame looks like: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) attach(simulation) plot(year.month, freq) As you can see, I have a collumn with the year and month of my samples, and a freq variable with simulated data. I would like to plot this data but when I try to use the plot showed above, I get a error message. After bypass this problem, I would like add points in my graph with simulated data for only a random number of survey month, but I need that the full range of surveys be kept on the X-axis. Just to simulate a sample I am using: simulation.sample-simulation[sample(1:length(year.month),8, replace=F),] simulation.sample$freq-sample(1:40,8) Any ideas? Kind regards Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function
thanks anders : that works perfectly. I'll have to study it better to understand but it's much appreciated. -Original Message- From: Anders Nielsen [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 08, 2007 5:55 PM To: Leeds, Mark (IED) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function Hi Mark, Is this of any help? resMat-function(K=6,lag=2,ncol=3*K){ X-matrix(0,K,ncol) X[,1:(K*lag)]-diag(K) return(X) } Cheers, Anders. On Tuesday 08 May 2007 11:21 am, Leeds, Mark (IED) wrote: I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate
[EMAIL PROTECTED] wrote: Would this function help: http://www.csm.ornl.gov/~frome/ES/RRMHex/MHanalysis.txt ? Regards, -Cody I think so. Thank you Cody. If you have time would you mind defining, probably offline, the input data elements? I assume that one of them is a stratification factor other than occupational group. Thanks again Frank Frank E Harrell Jr [EMAIL PROTECTED] To bilt.edu R list r-help@stat.math.ethz.ch Sent by: cc [EMAIL PROTECTED] at.math.ethz.ch Subject [R] Mantel-Haenszel relative risk with Greenland-Robins variance 05/08/2007 02:38 estimate PM Does anyone know of an R function for computing the Greenland-Robins variance for Mantel-Haenszel relative risks? Thanks Frank -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Skipping rows and columns in large matrix
yes, this works. It didn't think of concatenating... thanks! jim holtman wrote: zzz - zzz[-c(1,45,23,321), -c(23,34,45,67)] On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote: I work on Windows, R version 2.4.1 I need to leave out many rows and columns when reading a matrix. The 'skip' commands in read.table, seem to be just for skipping rows, if I understand well, but I need to skip many columns as well. I tried something very simple and left one row and one column out, but I don't know how to do it for many rows and columns at a time. Example: zzz # euclidean distances matrix X1 X2 X3 X4 X5 X10 11 12 1 15 X2 110 5 12 3 X3 125 0 8 4 X4 1 12 8 0 16 X5 153 4160 skip.data = zzz [-3, -3] # gives me the same matrix without row 3 and column 3... how can I tell R to skip more than one row and column? Columns and rows to be left out are not contiguous The matrix I'm using is fairly large (335x335), so excel won't read the whole thing, and it's not a fixed-width txt file, so I cannot use read.fwf. I also don't have access to Unix, as was suggested to other users to solve this problem. -- View this message in context: http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10385469 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] draw two plots on a single panel
Hi, I have 2 dataset, plot(data1) plot(data2), but it comes as two graphs, can I draw both on a single panel so I can compare them? Thanks Pat __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] draw two plots on a single panel
See ?points and ?lines for a simple solution. Eg: plot(data1) points(data2) if mean this by 'panel'. Gabor On Tue, May 08, 2007 at 04:49:47PM -0700, Patrick Wang wrote: Hi, I have 2 dataset, plot(data1) plot(data2), but it comes as two graphs, can I draw both on a single panel so I can compare them? Thanks Pat __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting a point graph with data in X-axis
You have to have a valid 'date' object on the x-axis. Try this: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) # create POSIXct time simulation$time - ISOdate(simulation$my.year, simulation$my.month,1) attach(simulation) plot(time, freq) On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote: Dear all, I have two data frame, on with a complete list of my field survey with frequency data of a sample species. This data frame looks like: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) attach(simulation) plot(year.month, freq) As you can see, I have a collumn with the year and month of my samples, and a freq variable with simulated data. I would like to plot this data but when I try to use the plot showed above, I get a error message. After bypass this problem, I would like add points in my graph with simulated data for only a random number of survey month, but I need that the full range of surveys be kept on the X-axis. Just to simulate a sample I am using: simulation.sample-simulation[sample(1:length(year.month),8, replace=F),] simulation.sample$freq-sample(1:40,8) Any ideas? Kind regards Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted least squares
http://en.wikipedia.org/wiki/Weighted_least_squares gives a formulaic description of what you have said. I believe the original poster has converted something like this y x 0 1.1 0 2.2 0 2.2 0 2.2 1 3.3 1 3.3 2 4.4 ... into something like the following y x freq 0 1.11 0 2.23 1 3.32 2 4.41 ... Now, the variance of means of each row in table above is ZERO because the individual elements that comprise each row are identical. Therefore your method of using inverse-variance will not work here. Then is it valid then to use lm( y ~ x, weights=freq ) ? Regards, Adai S Ellison wrote: Hadley, You asked .. what is the usual way to do a linear regression when you have aggregated data? Least squares generally uses inverse variance weighting. For aggregated data fitted as mean values, you just need the variances for the _means_. So if you have individual means x_i and sd's s_i that arise from aggregated data with n_i observations in group i, the natural weighting is by inverse squared standard error of the mean. The appropriate weight for x_i would then be n_i/(s_i^2). In R, that's n/(s^2), as n and s would be vectors with the same length as x. If all the groups had the same variance, or nearly so, s is a scalar; if they have the same number of observations, n is a scalar. Of course, if they have the same variance and same number of observations, they all have the same weight and you needn't weight them at all: see previous posting! Steve E *** This email and any attachments are confidential. Any use, co...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting a point graph with data in X-axis
- use proper spacing to make it easier to read - start off with set.seed to make it reproducible - omit cbind and combine all the rep's into one rep in first line - make the date column a known date class (here Date), set.seed(1) sim - data.frame( my.year = rep(2000:2002, c(8, 12, 12)), my.month = c(5:12, 1:12, 1:12), freq = sample(1:40, 32) ) sim$date - as.Date(paste(sim$my.year, sim$my.month, 1, sep = -)) plot(freq ~ date, sim) On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote: Dear all, I have two data frame, on with a complete list of my field survey with frequency data of a sample species. This data frame looks like: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) attach(simulation) plot(year.month, freq) As you can see, I have a collumn with the year and month of my samples, and a freq variable with simulated data. I would like to plot this data but when I try to use the plot showed above, I get a error message. After bypass this problem, I would like add points in my graph with simulated data for only a random number of survey month, but I need that the full range of surveys be kept on the X-axis. Just to simulate a sample I am using: simulation.sample-simulation[sample(1:length(year.month),8, replace=F),] simulation.sample$freq-sample(1:40,8) Any ideas? Kind regards Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Skipping rows and columns in large matrix
Columns that correspond to NULLs in the colClasses arg of read.table are ignored. ?read.table On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote: I work on Windows, R version 2.4.1 I need to leave out many rows and columns when reading a matrix. The 'skip' commands in read.table, seem to be just for skipping rows, if I understand well, but I need to skip many columns as well. I tried something very simple and left one row and one column out, but I don't know how to do it for many rows and columns at a time. Example: zzz # euclidean distances matrix X1 X2 X3 X4 X5 X10 11 12 1 15 X2 110 5 12 3 X3 125 0 8 4 X4 1 12 8 0 16 X5 153 4160 skip.data = zzz [-3, -3] # gives me the same matrix without row 3 and column 3... how can I tell R to skip more than one row and column? Columns and rows to be left out are not contiguous The matrix I'm using is fairly large (335x335), so excel won't read the whole thing, and it's not a fixed-width txt file, so I cannot use read.fwf. I also don't have access to Unix, as was suggested to other users to solve this problem. -- View this message in context: http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting a point graph with data in X-axis
R understands only numerical and Date class values for axis. So either a) plot them using the sequence 1, ..., 32 and then explicitly label them. Here is an example: n - length(year.month) plot( 1:n, freq, xaxt=n) mtext( text=year.month, side=1, at=1:n, las=2 ) b) or create the dates in Date format. This option is preferable if the dates were varying unequally. x - seq( as.Date(2000-05-01), as.Date(2002-12-01), by=1 month ) plot(x, simulation$freq) BTW, you could also have created year.month via paste( rep( 2000:2002, c(8,12,12) ), formatC( c(5:12,1:12,1:12), width=2, flag=0 ) , sep=_ ) Regards, Adai Milton Cezar Ribeiro wrote: Dear all, I have two data frame, on with a complete list of my field survey with frequency data of a sample species. This data frame looks like: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) attach(simulation) plot(year.month, freq) As you can see, I have a collumn with the year and month of my samples, and a freq variable with simulated data. I would like to plot this data but when I try to use the plot showed above, I get a error message. After bypass this problem, I would like add points in my graph with simulated data for only a random number of survey month, but I need that the full range of surveys be kept on the X-axis. Just to simulate a sample I am using: simulation.sample-simulation[sample(1:length(year.month),8, replace=F),] simulation.sample$freq-sample(1:40,8) Any ideas? Kind regards Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function
Try this: kronecker(t(rep(1:0, c(lagnum, 3-lagnum))), diag(K)) On 5/8/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote: That's a good idea : I didn't realize that my matrices would look so bad in the final email. All I want To do is output 1's in the diagonal elements and zero's everywhere else but the matrix is not square so by diagonals I Really mean if Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6) Lagnum = 2 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6) Lagnum = 3 then the elements (1,1), (2,2), (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1 4,2),(15,3),(16,4),(17,5), (18,6) And lagnum always has to be greater than or equal to 1 and less than or equal to (number of cols/number of rows ). Thanks For your advice. -Original Message- From: Bert Gunter [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 08, 2007 5:34 PM To: Leeds, Mark (IED); r-help@stat.math.ethz.ch Subject: RE: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function Suggestion: You might make it easier for folks to help if you explained in clear and simple terms what you are trying to do. Code is hard to deconstruct. Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED) Sent: Tuesday, May 08, 2007 2:22 PM To: r-help@stat.math.ethz.ch Subject: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function I wrote an ugly algorithm to set certain elements of a matrix to 1 without looping and below works and you can see what The output is below the code. K-6 lagnum-2 restrictmat-matrix(0,nrow=K,ncol=K*3) restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0 restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000100 0 0 0 0 0 0 0 0 0 [2,]010000010 0 0 0 0 0 0 0 0 0 [3,]001000001 0 0 0 0 0 0 0 0 0 [4,]000100000 1 0 0 0 0 0 0 0 0 [5,]000010000 0 1 0 0 0 0 0 0 0 [6,]000001000 0 0 1 0 0 0 0 0 0 For lagnum equals 1 , it also works : restrictmat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [1,]100000000 0 0 0 0 0 0 0 0 0 [2,]010000000 0 0 0 0 0 0 0 0 0 [3,]001000000 0 0 0 0 0 0 0 0 0 [4,]000100000 0 0 0 0 0 0 0 0 0 [5,]000010000 0 0 0 0 0 0 0 0 0 [6,]000001000 0 0 0 0 0 0 0 0 0 But I am thinking that there has to be a better way particularly because I'll get an error if I set lagnum to 3. Any improvements or total revampings are appreciated. The number of columns will always be a multiple of the number of rows So K doesn't have to be 6. that was just to show what the commands do. thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting a point graph with data in X-axis
One other idea. If your dates are continguous, as they are here, you might want to use a ts series for this. Using the same sim as in my prior post: set.seed(1) x - with(sim, ts(freq, start = c(my.year[1], my.month[1]), freq = 12)) x Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 2000 11 15 22 34 8 32 33 38 2001 21 2 7 6 20 40 35 13 18 23 9 17 2002 19 5 12 3 28 29 1 16 27 4 25 39 plot(x) On 5/8/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: - use proper spacing to make it easier to read - start off with set.seed to make it reproducible - omit cbind and combine all the rep's into one rep in first line - make the date column a known date class (here Date), set.seed(1) sim - data.frame( my.year = rep(2000:2002, c(8, 12, 12)), my.month = c(5:12, 1:12, 1:12), freq = sample(1:40, 32) ) sim$date - as.Date(paste(sim$my.year, sim$my.month, 1, sep = -)) plot(freq ~ date, sim) On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote: Dear all, I have two data frame, on with a complete list of my field survey with frequency data of a sample species. This data frame looks like: simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12))) simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=) simulation$freq-sample(1:40,32) attach(simulation) plot(year.month, freq) As you can see, I have a collumn with the year and month of my samples, and a freq variable with simulated data. I would like to plot this data but when I try to use the plot showed above, I get a error message. After bypass this problem, I would like add points in my graph with simulated data for only a random number of survey month, but I need that the full range of surveys be kept on the X-axis. Just to simulate a sample I am using: simulation.sample-simulation[sample(1:length(year.month),8, replace=F),] simulation.sample$freq-sample(1:40,8) Any ideas? Kind regards Miltinho __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Separate x-axis ticks for panels in xyplot()?
This will get you started. Rich tmp - data.frame(y=rnorm(20), x=rnorm(20), c=factor(rep(letters[1:5], 4)), g=factor(rep(1:2, each=10))) v1 - seq(-1.5, 0, .5) v2 - 0:2 xyplot(y ~ x | c, groups = g, data=tmp, scales = list(x = list( relation=sliced, at = rep(list(v1, v2), length=5), labels = rep(list( c('A', 'B', 'C', 'D'), c('E', 'F', 'G')), length=5), rot = 45))) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] voronoi.mosaic chokes?
Hi all, I am running R 2.5.0 under Windows XP Media Center Edition. Here's a problem that's been stumping me for a few days now, and I can't find anything useful in the archives. I am using voronoi.mosaic (tripack package) to create proximity polygons for a study of vegetation competition and dynamics. The points lists are read in from a file for each plot, then 8 duplicates are translated around the edges of the plot (Toroidal edge correction). This is completed using the torus(...) function that I wrote myself. VMuncorrected is the voronoi mosaic that is not toroidally edge corrected VMcorrected is the voronoi mosaic that is toroidally edge corrected treemap = read.table('af1.txt', sep = \t, header = 1) VMuncorrected = voronoi.mosaic(treemap$X, treemap$Y) ###Use the torus function to create 8 copies around the study region toroid = torus(treemap$X, treemap$Y, 25, 25) VMcorrected = voronoi.mosaic(toroid[,1], toroid[,2], duplicate = remove) Here's the problem. When I read in the points for the data file listed above ('af1.txt'), both calls to voronoi.mosaic() work fine. (The second one takes about 1.5 seconds because there are 1147 points in the toroidally corrected set). However, when I read in the points from the next file ('af2.txt'), the first call to voronoi.mosaic() works. The next call (to torus()) also works fine. But the second call to voronoi.mosaic() causes R to freeze completely requiring Ctrl-Alt-Del. I have 10 sets of points and this problem happens for about 5 of them. Factors I have ruled out: -too many points in the call (one set had 1147 and worked fine while the next set had 801 and froze R) -duplicate points (taken care of by voronoi.mosaic(..., duplicate = remove) and also independently verified by exporting the data. no duplicates in either the original or the toroidally corrected set) -points too close together in space (minimum distance between two points in 'af1.txt' is 0.1414 and works fine. minimum distance in the second set, 'af2.txt', is 0.2236, and this set causes R to freeze) -not enough memory (each data set is run in a new R session-i.e. R was quit between each attempt) -'flukiness' (the problem happens the same way every time for the problem data sets, and the code runs fine every time for the non-problem data sets) -file formats (each text file has the same number of columns, all the labels for the columns are identical, and the columns are always in the same order) -outdated versions (I am using R 2.5.0 and updated the tripack package within the last week. also, I update packages about once a month) This is a very frustrating problem because I get no errors indicating any problem with the data, and I have checked the data over and over for any type of error and found none. If anyone has ANY helpful suggestions, I would love to hear about any and all of them. Andrew p.s. - for those of you who are really intrigued, I can email you the .txt files and the code for the torus() function. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] voronoi.mosaic chokes?
Have you tried 'debug'? With one of the datasets that crashes your system, run the first voronoi.mosaic, then the torus function and then debug(voronoi.mosaic) and run through the second call of voronoi.mosaic. You can step though the code and at least will you find the point where it bombs. -Christos Christos Hatzis, Ph.D. Nuvera Biosciences, Inc. 400 West Cummings Park Suite 5350 Woburn, MA 01801 Tel: 781-938-3830 www.nuverabio.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Pierce Sent: Wednesday, May 09, 2007 12:21 AM To: r-help@stat.math.ethz.ch Subject: [R] voronoi.mosaic chokes? Hi all, I am running R 2.5.0 under Windows XP Media Center Edition. Here's a problem that's been stumping me for a few days now, and I can't find anything useful in the archives. I am using voronoi.mosaic (tripack package) to create proximity polygons for a study of vegetation competition and dynamics. The points lists are read in from a file for each plot, then 8 duplicates are translated around the edges of the plot (Toroidal edge correction). This is completed using the torus(...) function that I wrote myself. VMuncorrected is the voronoi mosaic that is not toroidally edge corrected VMcorrected is the voronoi mosaic that is toroidally edge corrected treemap = read.table('af1.txt', sep = \t, header = 1) VMuncorrected = voronoi.mosaic(treemap$X, treemap$Y) ###Use the torus function to create 8 copies around the study region toroid = torus(treemap$X, treemap$Y, 25, 25) VMcorrected = voronoi.mosaic(toroid[,1], toroid[,2], duplicate = remove) Here's the problem. When I read in the points for the data file listed above ('af1.txt'), both calls to voronoi.mosaic() work fine. (The second one takes about 1.5 seconds because there are 1147 points in the toroidally corrected set). However, when I read in the points from the next file ('af2.txt'), the first call to voronoi.mosaic() works. The next call (to torus()) also works fine. But the second call to voronoi.mosaic() causes R to freeze completely requiring Ctrl-Alt-Del. I have 10 sets of points and this problem happens for about 5 of them. Factors I have ruled out: -too many points in the call (one set had 1147 and worked fine while the next set had 801 and froze R) -duplicate points (taken care of by voronoi.mosaic(..., duplicate = remove) and also independently verified by exporting the data. no duplicates in either the original or the toroidally corrected set) -points too close together in space (minimum distance between two points in 'af1.txt' is 0.1414 and works fine. minimum distance in the second set, 'af2.txt', is 0.2236, and this set causes R to freeze) -not enough memory (each data set is run in a new R session-i.e. R was quit between each attempt) -'flukiness' (the problem happens the same way every time for the problem data sets, and the code runs fine every time for the non-problem data sets) -file formats (each text file has the same number of columns, all the labels for the columns are identical, and the columns are always in the same order) -outdated versions (I am using R 2.5.0 and updated the tripack package within the last week. also, I update packages about once a month) This is a very frustrating problem because I get no errors indicating any problem with the data, and I have checked the data over and over for any type of error and found none. If anyone has ANY helpful suggestions, I would love to hear about any and all of them. Andrew p.s. - for those of you who are really intrigued, I can email you the .txt files and the code for the torus() function. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.