Re: [R] Running R function as a Batch process
Hi,

There are many clues in the help. First I created the file c:\sumfunction.R containing:

x <- as.numeric(commandArgs()[-1:-4])
print(x)
addtogether <- function(x, y) { SUM <- x + y; print(SUM) }
addtogether(x[1], x[2])

Then at the command line in Windows I enter:

R --vanilla --slave --args 7 10 < c:\sumfunction.R > c:\logout.txt

Regards
Alex

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of d. sarthi maheshwari
Sent: May 16, 2007 8:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Running R function as a Batch process

Hi,

I am struggling with the R CMD BATCH command. Kindly suggest a solution to the following problem.

I have a function named CinC which accepts two input parameters. This can be shown as:

CinC <- function(start, end)

where start and end are both character strings. Please suggest how I can run this function using the command R CMD BATCH. Currently I am trying this:

R CMD BATCH c:/tt/CinC.r c:/tt/h.Rout -20070416 08:41 -20070416 10:33

What is wrong/incorrect with it? Your suggestions are important to me. Kindly reply. Thanks in advance.

Regards
Divya Sarthi

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
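The same recipe can be written against a more recent R, where commandArgs(trailingOnly = TRUE) returns only the values after --args (this sketch assumes that argument is available; it was not part of the 2007-era commands above):

```r
# Minimal sketch of Alex's batch pattern.
# Save as sumfunction.R and run with, e.g.:  Rscript sumfunction.R 7 10
x <- as.numeric(commandArgs(trailingOnly = TRUE))

addtogether <- function(a, b) {
  SUM <- a + b
  print(SUM)
}

addtogether(x[1], x[2])
```

trailingOnly = TRUE saves the `[-1:-4]` indexing, which otherwise has to be kept in sync with the exact command line used.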
Re: [R] Error using newdata argument in survfit
Hi Brian,

The factor Prior.f has 5 levels (1,2,3,4,5), which coxph deals with by creating 4 dummy variables coded with 1 or zero. That's what I see when I look at fit$x.

fit$x[1:5,]
   Week LagAOO factor(Prior.f)2 factor(Prior.f)3 factor(Prior.f)4
31   22      0                0                0                0
32   22      0                0                0                0
33   22      2                0                0                0
34   22      3                0                0                0
35   22      2                0                0                0
   factor(Prior.f)5
31                0
32                0
33                0
34                0
35                0

I have played with the formula a bit, adding a term at a time and then checking to see if I can produce the survival curves for pseudo cohorts. I get as far as the Prior.f term and am successful if I treat it as a continuous variable. If I introduce it as a factor and assume it wants four dummy variables as above, I get the variable lengths error. If I represent the term with one variable:

survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=c(1,2)))

I get:

Error in x2 %*% coef : non-conformable arguments

which is a nice change but still short of knowing what is going on.

Regards
Alex

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: June 15, 2005 4:24 PM
To: Hanke, Alex
Cc: 'r-help@stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

You appear to have a coding for prior.f in newdata rather than the factor itself. It's a bit hard to be sure when we don't have data8 to compare with.

On Wed, 15 Jun 2005, Hanke, Alex wrote:

Dear R-helpers,

To get curves for a pseudo cohort other than the one centered at the mean of the covariates, I have been trying to use the newdata argument to survfit with no success. Here is my model statement, the newdata and the ensuing error. What am I doing wrong?
summary(fit)
Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~ Week +
    LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878
             coef exp(coef) se(coef) robust se     z       p
Week      0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO    0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2  0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3  0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4  2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5  1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

         exp(coef) exp(-coef) lower .95 upper .95
Week          1.01      0.994     0.960      1.05
LagAOO        2.05      0.487     1.618      2.61
Prior.f2      1.14      0.879     0.517      2.50
Prior.f3      2.21      0.453     0.920      5.29
Prior.f4      7.71      0.130     3.076     19.30
Prior.f5      3.34      0.300     1.139      9.76

Rsquare= 0.047 (max possible= 0.25)
Likelihood ratio test= 91 on 6 df, p=0
Wald test = 209 on 6 df, p=0
Score (logrank) test = 142 on 6 df, p=0, Robust = 17.4 p=0.00803

(Note: the likelihood ratio and score tests assume independence of observations within a cluster, the Wald and robust score tests do not).

newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

survfit(fit, newdata=newdat)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
    variable lengths differ
In addition: Warning message:
'newdata' had 2 rows but variable(s) found have 1878 rows

Regards,
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,     Tel: +44 1865 272861 (self)
1 South Parks Road,            +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax: +44 1865 272595
Re: [R] Error using newdata argument in survfit
Thanks to Thomas, Brian and Ales, whose advice led me to the solution. I actually had a second problem preventing Thomas' solution:

survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=factor(c(1,2), levels=1:5)))

from working. In the model statement I created a factor from Prior.f via factor(Prior.f). Rather, one should predefine the factor variable,

Prior.f <- factor(Prior.f)

use that term in the model, and then Thomas' solution works fine.

Alex

-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED]
Sent: June 16, 2005 11:00 AM
To: Hanke, Alex
Cc: 'r-help@stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

On Thu, 16 Jun 2005, Hanke, Alex wrote:

If I represent the term with one variable:

survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=c(1,2)))

I get:

Error in x2 %*% coef : non-conformable arguments

Yes, but it wants a factor with *5* levels. Try

survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=factor(c(1,2), levels=1:5)))

-thomas

Which is a nice change but still short of knowing what is going on.
Regards
Alex

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: June 15, 2005 4:24 PM
To: Hanke, Alex
Cc: 'r-help@stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

You appear to have a coding for prior.f in newdata rather than the factor itself. It's a bit hard to be sure when we don't have data8 to compare with.

On Wed, 15 Jun 2005, Hanke, Alex wrote:

Dear R-helpers,

To get curves for a pseudo cohort other than the one centered at the mean of the covariates, I have been trying to use the newdata argument to survfit with no success. Here is my model statement, the newdata and the ensuing error. What am I doing wrong?

survfit(fit, newdata=newdat)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
    variable lengths differ
In addition: Warning message:
'newdata' had 2 rows but variable(s) found have 1878 rows

Regards,
Alex
[R] Error using newdata argument in survfit
Dear R-helpers,

To get curves for a pseudo cohort other than the one centered at the mean of the covariates, I have been trying to use the newdata argument to survfit with no success. Here is my model statement, the newdata and the ensuing error. What am I doing wrong?

summary(fit)
Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~ Week +
    LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878
             coef exp(coef) se(coef) robust se     z       p
Week      0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO    0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2  0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3  0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4  2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5  1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

         exp(coef) exp(-coef) lower .95 upper .95
Week          1.01      0.994     0.960      1.05
LagAOO        2.05      0.487     1.618      2.61
Prior.f2      1.14      0.879     0.517      2.50
Prior.f3      2.21      0.453     0.920      5.29
Prior.f4      7.71      0.130     3.076     19.30
Prior.f5      3.34      0.300     1.139      9.76

Rsquare= 0.047 (max possible= 0.25)
Likelihood ratio test= 91 on 6 df, p=0
Wald test = 209 on 6 df, p=0
Score (logrank) test = 142 on 6 df, p=0, Robust = 17.4 p=0.00803

(Note: the likelihood ratio and score tests assume independence of observations within a cluster, the Wald and robust score tests do not).

newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

survfit(fit, newdata=newdat)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
    variable lengths differ
In addition: Warning message:
'newdata' had 2 rows but variable(s) found have 1878 rows

Regards,
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
[R] Survfit, newdata and continuous time-varying covariates
Hi All,

I am working with three time-varying covariates in a coxph model. I cannot seem to figure out how to use survfit and the newdata argument to provide estimated survival curves for two scenarios of one covariate while holding the other two at the mean value. Is it possible to display how estimated survival depends on alternative scenarios/profiles of a time-varying covariate?

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
[R] Estimate of baseline hazard in survival
Dear All,

I'm having just a little terminology problem, relating the language used in the Hosmer and Lemeshow text on Applied Survival Analysis to that of the help that comes with the survival package. I am trying to back out the values for the baseline hazard, h_o(t_i), for each event time or observation time. Now survfit(fit)$surv gives me the value of the survival function, S(t_i|X_i,B), using mean values of the covariates, and the coxph() object provides me with the estimate of the linear predictors, exp(X'B).

If S(t_i|X_i,B) = S_o(t_i)^exp(X_i B) is the expression for the survival function, and -ln(S_o(t_i)) is the expression for the cumulative baseline hazard function, H_o(t_i), then by rearranging the expression for the survival function I get the following:

-ln(S_o(t_i)) = -ln(S(t_i|X_i,B)) / exp(X_i B) = basehaz(fit) / exp(fit$linear.predictors)

Am I right so far, and is there an easier way? The plot of the cumulative baseline hazard function, H_o(t_i), should be linear across time. Once I have H_o(t_i), to get at h_o(t_i) I then need to reverse the cumsum operation. The corresponding plot should have a constant baseline hazard over time. I am aware of cox.zph() for testing the proportionality of hazards assumption.

Thanks
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
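Alex's rearrangement can be tried out on packaged data. A hedged sketch using the survival package's lung dataset (stand-in data, not the poster's): basehaz(fit, centered = FALSE) returns the cumulative baseline hazard H_o(t), and differencing it reverses the cumsum to recover the per-event-time increments h_o(t):

```r
# Sketch only: lung is example data, not the poster's dataset.
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Cumulative baseline hazard H_o(t); centered = FALSE avoids the
# default evaluation at the mean covariate values.
H0 <- basehaz(fit, centered = FALSE)

# Reverse the cumsum to get the baseline hazard increments h_o(t).
h0 <- diff(c(0, H0$hazard))
```

Plotting H0$hazard against H0$time is then the linearity check described above.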
[R] Problem installing packages and weird R site behaviour
Hi,

I tried to install a package using the menu option and was presented a list filled with NA's. I then tried visiting the R site, and each option on the side bar (e.g. CRAN, Search, FAQ) sends me to the address attached below (NB: I left off the h in http). The first problem seems to be related to the second. Is anyone else experiencing this behaviour, and how do I restore normal behaviour?

Regards
Alex

Problem 1

local({a <- CRAN.packages()
+ install.packages(select.list(a[,1], , TRUE), .libPaths()[1], available=a, dependencies=TRUE)})
trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/PACKAGES'
Content type `text/html' length unknown
opened URL
downloaded 953 bytes

Problem 2

{ttp://64.235.246.142/?a_id=794adultfilter=ondomainname=r-project.org}

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
[R] Hexadecimal conversion
Help

I can produce the hexadecimal equivalent of a decimal number but I am having a hard time reversing the operation. I'm good for hex representations to 159 and am close to extending to 2559. The archives are not clear on the existence of a function for this task. Is there one? Here is what I have got so far:

#Good for hex values to 9F
as.decmode <- function(x){
  x <- as.character(x)
  hexDigit <- c(0:9, LETTERS[1:6])
  z <- matrix(c(strsplit(x, NULL), recursive=T), length(x), 2, byrow=T)
  z.1 <- as.numeric(z[,1])
  z.2 <- match(z[,2], hexDigit) - 1
  dec <- 16*z.1 + z.2
  return(dec)
}

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
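For what it's worth, current versions of R make the hand-rolled parser unnecessary (a modern-R note; base R did not yet offer this when the question was posted): strtoi() converts hex strings of any width to integers, and sprintf() goes the other way.

```r
# Modern-R sketch: base R's strtoi() parses hex directly.
dec <- strtoi(c("9F", "FF", "1A3"), base = 16L)  # 159 255 419
hex <- sprintf("%X", dec)                        # back to "9F" "FF" "1A3"
```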
RE: [R] Hexadecimal conversion
Thanks to Patrick Burns, Peter Wolf and Duncan Murdoch, who all provided me with workable solutions to the hexadecimal conversion problem. They all work and basically differ in the number of bells and whistles.

Alex

-Original Message-
From: Duncan Murdoch [mailto:[EMAIL PROTECTED]
Sent: December 2, 2004 9:42 AM
To: Hanke, Alex
Subject: Re: [R] Hexadecimal conversion

On Wed, 01 Dec 2004 15:07:16 -0400, Hanke, Alex [EMAIL PROTECTED] wrote:

Help

I can produce the hexadecimal equivalent of a decimal number but I am having a hard time reversing the operation. I'm good for hex representations to 159 and am close to extending to 2559. The archives are not clear on the existence of a function for this task. Is there one?

I don't think so.

Here is what I have got so far:

#Good for hex values to 9F
as.decmode <- function(x){
  x <- as.character(x)
  hexDigit <- c(0:9, LETTERS[1:6])
  z <- matrix(c(strsplit(x, NULL), recursive=T), length(x), 2, byrow=T)
  z.1 <- as.numeric(z[,1])
  z.2 <- match(z[,2], hexDigit) - 1
  dec <- 16*z.1 + z.2
  return(dec)
}

I think what you're missing is a loop over the characters. You can probably vectorize this to make it more efficient, but here's a sketch:

hex2numeric <- function(x) {
  hexDigits <- c(0:9, LETTERS[1:6])
  chars <- strsplit(toupper(x), split=NULL)
  result <- rep(0, length(chars))
  for (i in seq(along=chars)) {
    for (j in seq(along=chars[[i]]))
      result[i] <- 16*result[i] + match(chars[[i]][j], hexDigits) - 1
  }
  result
}

Duncan Murdoch
[R] Google for scientists: search engine
A new way to search for documentation on your favourite science topic.

http://scholar.google.com/

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
[R] RODBC and Table views
This question relates to the use of the RODBC package for retrieving data from a MS Access database. It is quite easy to retrieve data sitting in tables, but is it possible to select from views based on these tables? The archives do not touch on this point. sqlTables() lets me see tables and views but only the tables yield data. Do I need to recreate these views on the R side of the connection?

Regards
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada E5B 2L9

[[alternative HTML version deleted]]
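In case it helps later readers: RODBC's sqlQuery() sends arbitrary SQL through the driver, and Access exposes saved queries as VIEWs over ODBC, so they can usually be selected from directly rather than recreated in R. A hedged, untestable-here sketch (the file path and view name are made up, and it needs a live Access ODBC source on Windows):

```r
# Hypothetical names throughout; requires a configured Access ODBC source.
library(RODBC)
channel <- odbcConnectAccess("c:/data/mydb.mdb")  # or odbcConnect("myDSN")
sqlTables(channel)                                # lists TABLEs and VIEWs
dat <- sqlQuery(channel, "SELECT * FROM myView")  # query the view directly
odbcClose(channel)
```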
RE: [R] isoMDS
I get the following message:

Error in isoMDS(tt) : zero or negative distance between objects 1 and 2

This makes sense since a and b are identical in their relationship to c through h. Drop row 1 and col 1 and you get:

isoMDS(tt[2:8,2:8])
initial value 14.971992
iter   5 value 8.027815
iter  10 value 4.433377
iter  15 value 3.496364
iter  20 value 3.346726
final  value 3.233738
converged

$points
           [,1]       [,2]
[1,] -2.3143653 -0.1259226
[2,] -0.3205746 -1.1534662
[3,] -2.8641922 -0.1182906
[4,]  0.7753674  0.1497328
[5,] -0.5705552  1.2416843
[6,]  2.2305175 -0.6995917
[7,]  3.0638025  0.7058540

$stress
[1] 3.233738

Does this help?

-Original Message-
From: Doran, Harold [mailto:[EMAIL PROTECTED]
Sent: September 9, 2004 8:26 AM
To: Jari Oksanen
Cc: Doran, Harold; Prof Brian Ripley; R-News
Subject: RE: [R] isoMDS

Thank you. I use the same matrix on cmdscale as I did with isoMDS. I have reproduced my steps below for clarification if this happens to shed any light. Here is the original total matrix (see opening thread if you care how this is created):

  a b c d e f g h
a 4 4 2 4 1 2 0 0
b 4 4 2 4 1 2 0 0
c 2 2 4 2 3 2 2 1
d 4 4 2 4 1 2 0 0
e 1 1 3 1 4 3 3 2
f 2 2 2 2 3 4 2 1
g 0 0 2 0 3 2 4 3
h 0 0 1 0 2 1 3 4

So, there are 8 items. This matrix indicates that items 1, 2, and 4 were always grouped together (or viewed as being similar by individuals).
I transformed this using

tt <- max(t) - t

which results in

  a b c d e f g h
a 0 0 2 0 3 2 4 4
b 0 0 2 0 3 2 4 4
c 2 2 0 2 1 2 2 3
d 0 0 2 0 3 2 4 4
e 3 3 1 3 0 1 1 2
f 2 2 2 2 1 0 2 3
g 4 4 2 4 1 2 0 1
h 4 4 3 4 2 3 1 0

When I run isoMDS on this new matrix, it tells me to specify the initial config because of the NA/Infs. But when I perform cmdscale on this same matrix I end up with the following results:

bt <- cmdscale(tt); bt
         [,1]       [,2]
a -1.79268634 -0.2662750
b -1.79268634 -0.2662750
c -0.02635497  0.5798934
d -1.79268634 -0.2662750
e  1.08978620  0.6265313
f -0.02635497  0.5798934
g  2.20852966  0.2828937
h  2.13245309 -1.2703869

The results suggest that items 1, 2, and 4 have similar locations, as is expected. Also items 3 and 6 have similar locations, as would also be expected. So, my results seem to have been replicated correctly using cmdscale. I've tried to specify an initial config using isoMDS in a few ways without success, so I am surely doing something wrong. So far, I have tried the following:

ll <- isoMDS(tt, y=cmdscale(tt))

which tells me "zero or negative distance between objects 1 and 2", and

ll <- isoMDS(tt, y=cmdscale(tt, k=2))

Again, thanks,
Harold

-Original Message-
From: Jari Oksanen [mailto:[EMAIL PROTECTED]
Sent: Thu 9/9/2004 4:26 AM
To: Doran, Harold
Cc: Prof Brian Ripley; R-News
Subject: RE: [R] isoMDS

On Wed, 2004-09-08 at 21:31, Doran, Harold wrote:

Thank you. Quick clarification. isoMDS only works with dissimilarities. Converting my similarity matrix into the dissimilarity matrix is done as (from an email I found in the archives)

d <- max(tt) - tt

where tt is the similarity matrix. With this, I tried isoMDS as follows:

tt.mds <- isoMDS(d)

and I get the following error message:

Error in isoMDS(d) : An initial configuration must be supplied with NA/Infs in d.

I was a little confused on exactly how to specify this initial config. So, from here I ran cmdscale on d as

This error message is quite informative: you have either missing or non-finite entries in your data.
The only surprising thing here is that cmdscale works: it should fail, too. Are you sure that you haven't done anything with your data matrix in between, like changed it from matrix to a dist object? If the Inf/NaN/NA values are on the diagonal, they will magically disappear with as.dist. Anyway, if you're able to get a metric scaling result, you can manually feed that into isoMDS for the initial configuration, and avoid the check. See ?isoMDS.

d.mds <- cmdscale(d)

which seemed to work fine and produce reasonable results. I was able to take the coordinates and run them through a k-means cluster and the results seemed to correctly match the grouping structure I created for this sample analysis. cmdscale is for metric scaling, but it seemed to produce the results correctly. So, did I correctly convert the similarity matrix to the dissimilarity matrix? Second, should I have used cmdscale rather than isoMDS as I have done? Or, is there a way to specify the initial configuration that I have not done correctly?

If you don't know whether you should use isoMDS or cmdscale, you probably
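Jari's suggestion of seeding isoMDS with a metric-scaling configuration looks like this in a self-contained sketch (USArrests is stand-in data, not Harold's matrix):

```r
# Sketch: hand a cmdscale() configuration to isoMDS() as the start point.
library(MASS)

d <- dist(scale(USArrests))   # example dissimilarities, no zero off-diagonal distances
init <- cmdscale(d, k = 2)    # metric scaling supplies the initial configuration
mds <- isoMDS(d, y = init, trace = FALSE)
```

The y argument is exactly the "initial configuration" the error message asks for; with tied or zero distances between distinct objects (as in Harold's a/b/d rows) isoMDS will still refuse, which is why dropping duplicate rows was needed above.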
RE: [R] kolmogorov-smirnov for discrete ordinal scale data
Hi

I think the answer is no. However, I have written a script that implements the test described in "Testing for shifts in the vertical distribution of plankton using a robust Kolmogorov-Smirnov like statistic" by Smith, Beet and Solow (1998). The test has the properties you are looking for. If this sounds helpful let me know.

Alex

-Original Message-
From: Gila Lithwick [mailto:[EMAIL PROTECTED]
Sent: September 9, 2004 11:03 AM
To: [EMAIL PROTECTED]
Subject: [R] kolmogorov-smirnov for discrete ordinal scale data

Hi,

I was wondering whether there is an implementation of the Kolmogorov-Smirnov goodness of fit test for discrete, ordinal scale data in R - I've only managed to find the test for continuous data.

Thanks!
Gila
RE: [R] Skipping panels in Lattice
I have the same problem. As far as I can see, the only thing you can do is:

attach(df2)
group <- paste(facb, facc, sep=" ")
bwplot(dv ~ faca | factor(group))

Alex

-Original Message-
From: Leon Barmuta [mailto:[EMAIL PROTECTED]
Sent: September 9, 2004 1:19 AM
To: [EMAIL PROTECTED]
Subject: [R] Skipping panels in Lattice

Dear all,

I wish to generate a lattice boxplot which skips an empty cell in a design. I have trawled r-help, scrutinized the xyplot(lattice) help page, and merrily reproduced examples of using skip from a couple of previous r-help queries and the example given in Pinheiro & Bates. But I must be missing something... Here's an example (running R 1.9.1 on Win2k):

# generate some data
df1 <- data.frame(expand.grid(obsnum=seq(1, 15, 1),
                              faca=c("A1", "A2", "A3"),
                              facb=c("B1", "B2", "B3", "B4"),
                              facc=c("C1", "C2")),
                  dv=rpois(15*3*4*2, 10))

# now get rid of the cell B4 C1 to simulate a missing treatment combination
df2 <- df1[df1$facb != "B4" | df1$facc != "C1", ]

# plain vanilla lattice plot generates an empty panel corresponding to the empty cell
plot1 <- bwplot(dv ~ faca | facb*facc, data=df2)
plot1

# now try to skip the empty panel
# turn plot history on so that the separate pages can be recalled
plot2 <- update(plot1, skip=c(rep(F, 3), T, rep(F, 4)))
plot2

and the 4th panel position of the bottom row is skipped, BUT the B4C1 cell is shunted to the top left of row 1 and the last panel of plot1 is now moved to page 2. Messing with layout= doesn't help, neither does substituting NA for the values of the missing cell (instead of cutting it out of the data frame). I also get the same behaviour for stripplot and dotplot too. Apologies if I've missed a previous solution to this during my searches of the archive.

Regards,
Leon Barmuta
School of Zoology & TAFI, University of Tasmania, Australia.
RE: [R] isoMDS
Distances cannot always be constructed from similarities. This can be done only if the matrix of similarities is nonnegative definite. With the nonnegative definite condition, and with the maximum similarity scaled so that s_ii = 1,

d_ik = (2*(1 - s_ik))^0.5

Check out the vegan package.

Alex

-Original Message-
From: Doran, Harold [mailto:[EMAIL PROTECTED]
Sent: September 8, 2004 10:00 AM
To: [EMAIL PROTECTED]
Cc: Doran, Harold
Subject: [R] isoMDS

Dear List:

I have a question regarding an MDS procedure that I am accustomed to using. I have searched around the archives a bit and the help doc and still need a little assistance. The package isoMDS is what I need to perform the non-metric scaling, but I am working with similarity matrices, not dissimilarities. The question may end up being resolved simply.

Here is a bit of substantive background. I am working on a technique where individuals organize items based on how similar they perceive the items to be. For example, assume there are 10 items. Person 1 might group items 1,2,3,4,5 in group 1 and the others in group 2. I then turn this grouping into a binomial similarity matrix. The following is a sample matrix for Person 1 based on this hypothetical grouping. The off diagonals are the similar items with the 1's representing similarities.

  a b c d e f g h i j
a 1 1 1 1 1 0 0 0 0 0
b 1 1 1 1 1 0 0 0 0 0
c 1 1 1 1 1 0 0 0 0 0
d 1 1 1 1 1 0 0 0 0 0
e 1 1 1 1 1 0 0 0 0 0
f 0 0 0 0 0 1 1 1 1 1
g 0 0 0 0 0 1 1 1 1 1
h 0 0 0 0 0 1 1 1 1 1
i 0 0 0 0 0 1 1 1 1 1
j 0 0 0 0 0 1 1 1 1 1

Each of these individual matrices are summed over individuals. So, in this summed matrix diagonal elements represent the total number of participants and the off-diagonals represent the number of times an item was viewed as being similar by members of the group (obviously the matrix is symmetric below the diagonal). So, a 4 in row 'a' column 'c' means that these items were viewed as being similar by 4 people.
A sample total matrix is at the bottom of this email describing the perceived similarities of 10 items across 4 individuals. It is this total matrix that I end up working with in the MDS. I have previously worked in Systat, where I run the MDS and specify the matrix as a similarity matrix. I then take the resulting data from the MDS and perform a k-means cluster analysis to identify which items belong to a particular cluster, centroids, etc.

So, here are my questions. 1) Can isoMDS work only with dissimilarities? Or, is there a way that it can perform the analysis on the similarity matrix as I have described it? 2) If I cannot perform the analysis on the similarity matrix, how can I turn this matrix into the necessary dissimilarity matrix? I am less familiar with this matrix and how it would be constructed.

Thanks for any help offered,
Harold

  a b c d e f g h i j
a 4 2 4 3 3 2 0 0 0 0
b 2 4 2 3 1 0 2 2 2 2
c 4 2 4 3 3 2 0 0 0 0
d 3 3 3 4 2 1 1 1 1 1
e 3 1 3 2 4 3 1 1 1 1
f 2 0 2 1 3 4 2 2 2 2
g 0 2 0 1 1 2 4 4 4 4
h 0 2 0 1 1 2 4 4 4 4
i 0 2 0 1 1 2 4 4 4 4
j 0 2 0 1 1 2 4 4 4 4

[[alternative HTML version deleted]]
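Alex's conversion can be written as a small helper. This is a sketch of the formula he quotes, d_ik = sqrt(2*(1 - s_ik)), applied after scaling the summed matrix so the diagonal is 1 (sim2dist is my name for it, not an R or vegan function):

```r
# Sketch: similarities with s_ii = 1 map to distances via
# d_ik = sqrt(2 * (1 - s_ik)).
sim2dist <- function(s) {
  s <- s / max(s)              # e.g. divide the 4-person summed matrix by 4
  as.dist(sqrt(2 * (1 - s)))
}

s <- matrix(c(4, 2, 2, 4), nrow = 2)   # toy 2-item similarity matrix
d <- sim2dist(s)                       # off-diagonal distance: sqrt(2 * 0.5) = 1
```

As Alex notes, this is only guaranteed to yield genuine distances when the scaled similarity matrix is nonnegative definite.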
RE: [R] isoMDS
I don't understand. If isoMDS does not work with distances, why does the help for isoMDS indicate that the "Data are assumed to be dissimilarities or relative distances"? Equally confusing is the loose use of the terms dissimilarities and distances in the literature. As you point out in your book, "Distances are often called dissimilarities."

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: September 8, 2004 11:58 AM
To: Hanke, Alex
Cc: 'Doran, Harold'; '[EMAIL PROTECTED]'
Subject: RE: [R] isoMDS

On Wed, 8 Sep 2004, Hanke, Alex wrote:

Distances cannot always be constructed from similarities. This can be done only if the matrix of similarities is nonnegative definite. With the nonnegative definite condition, and with the maximum similarity scaled so that s_ii = 1, d_ik = (2*(1 - s_ik))^0.5

But isoMDS works with dissimilarities, not distances.
RE: [R] A question about external time-dependent covariates in cox model
Dear Rui, From my understanding of time-dependent covariates (not an expert but have been working on a similar problem), it would appear that the coding of the status column is not correct. Unless you have observed an event at each interval you should only have status=1 for the last interval. In your example I see 3 in total. Also, I think that if end is proportional to your covariate you are incorporating a redundant time effect into the model. The time effect is in the baseline hazard. Alex -Original Message- From: Rui Song [mailto:[EMAIL PROTECTED] Sent: August 19, 2004 12:21 AM To: [EMAIL PROTECTED] Subject: [R] A question about external time-dependent covariates in cox model Dear Sir or Madam: I am a graduate student in UW-Madison statistics department. I have a question about fitting a cox model with external time-dependent covariates. Say the original data is in the following format: Obs Eventtime Status Cov(time=5) Cov(time=8) Cov(time=10) Cov(time=12) 1 5 1 2 2 8 0(censored) 2 4 3 10 1 2 4 6 4 12 1 2 4 6 8 Notice that the time-dependent covariates are identical at the same time points for all obs since they are external to the failure process. process. Then I organized the data as the following: obs start end eventtime status cov 1 0 5 5 1 2 2 0 5 8 0 2 2 5 8 8 0 4 3 0 5 10 1 2 3 5 8 10 1 4 3 8 10 10 1 6 4 0 5 12 1 2 4 5 8 12 1 4 4 8 10 12 1 6 4 10 12 12 1 8 And fit the model using: fit-coxph(Surv(start, end, status)~cov); When I fit the model to my data set (Which has 89 observations and 81 distinct time points, sort of large.), I always got a message that Process R segmentation fault (core dumped). Would you let me know if it is due to the matrix sigularity in the computation of the partial likelihood or something else? And how should I fit a cox model with external time-dependent covariates? Thanks a lot for your time and help! 
Sincerely, Rui Song
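A minimal sketch of the (start, end] coding Alex describes, with status flagged only on the interval in which the event occurs. The numbers are illustrative (the covariate values are perturbed across subjects so the coefficient is estimable), not Rui's real data:

```r
library(survival)

# (start, end] intervals; status = 1 only where the event actually occurs
dat <- data.frame(
  obs    = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  start  = c(0, 0, 5, 0, 5, 8, 0, 5, 8, 10),
  end    = c(5, 5, 8, 5, 8, 10, 5, 8, 10, 12),
  status = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 1),
  cov    = c(2, 2, 4, 1, 3, 6, 3, 5, 6, 8)   # hypothetical covariate path
)
fit <- coxph(Surv(start, end, status) ~ cov, data = dat)
summary(fit)$coefficients
```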
[R] Clustering and the test for proportional hazards
Dear List,

Is the test for proportional hazards valid when the model contains a cluster variable? The output looks strange with the cluster variable. My intervals are based on calendar time and the clustering variable is related to the season the event occurs in.

model1 <- coxph(Surv(Start, Stop, Event) ~ LagAOO + I(LagAOO^2) + FirstSeen + TSLE + strata(CumPOO.rc) + cluster(quarter), data = data8, x = T)
cox.zph(model1)

                rho  chisq      p
LagAOO        0.209   5.61 0.0179
I(LagAOO^2)  -0.209   5.61 0.0179
FirstSeen     0.209   5.61 0.0179
TSLE          0.209   5.61 0.0179
GLOBAL           NA   5.61 0.2304

In this example there is strong evidence of non-proportional hazards for each of my covariates. By defining my strata differently I can remove the non-proportional hazards. A quick comment is appreciated.

Alex
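For reference, cox.zph does run on a model containing a cluster() term. A sketch with the survival package's lung data (my own example; the model and data are not Alex's):

```r
# cox.zph on a model with a cluster() term, using the bundled lung data
library(survival)

fit <- coxph(Surv(time, status) ~ age + ph.ecog + cluster(inst),
             data = lung)
zp <- cox.zph(fit)
zp   # per-covariate rows plus a GLOBAL test of proportional hazards
```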
[R] Xtable method for coxph, bug?
Hi,

There is an xtable method for coxph. It bombs, however, when applied to my coxph object: it cannot find 'nse', which I think is sqrt(diag(coxph.object$naive.var)). Adding 'nse' to the coxph object cures the problem. Is this a bug in xtable.coxph?

There is no xtable method for summary.coxph. How can I access the result of summary(coxph.object) so that I can create a matrix object for which there is an xtable method?

Thanks
Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
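One way to answer the second question is to pull the coefficient matrix out of the summary. A sketch, assuming the survival and xtable packages are installed; the model and data are invented for the example:

```r
# summary(coxph)$coefficients is a plain numeric matrix, and matrices
# have an xtable method.
library(survival)
library(xtable)

fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = lung)
s   <- summary(fit)
m   <- s$coefficients   # columns: coef, exp(coef), se(coef), z, p
xtable(m)               # LaTeX table via the matrix method
```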
RE: [R] Isotopic notation in plots
Henrik,

Please try:

plot(1:10, xlab = expression({}^{14}*C))

-----Original Message-----
From: Andersson, Henrik [mailto:[EMAIL PROTECTED]
Sent: May 18, 2004 7:00 AM
To: [EMAIL PROTECTED]
Subject: [R] Isotopic notation in plots

I really like to use R for all my graphs, and as I work with stable isotopes I want to have proper chemical notation in my plots. I have looked at ?plotmath, but didn't find the answer, and also searched the R website.

plot(1:10, xlab = expression(^{14}*C))  # I want a superscript with nothing in front, but it doesn't work
plot(1:10, xlab = expression(.^{14}*C)) # this works, but is not beautiful

Any ideas?

- Henrik Andersson, Netherlands Institute of Ecology - Centre for Estuarine and Marine Ecology, P.O. Box 140, 4400 AC Yerseke, Phone: +31 113 577473, [EMAIL PROTECTED], http://www.nioo.knaw.nl/ppages/handersson
[R] (no subject)
Dear R-Help,

Does "The R Package for Multivariate and Spatial Analysis", Version 4.0 (Casgrain and Legendre, 2001) exist on CRAN, and under what name? It supposedly has a chronological clustering program, CHRONO, that I would like to use. Alternatively, is there an R-based program that performs chronological clustering?

Thanks
Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
RE: [R] multiple plots problem
The command layout(c(1,2,3), 3, 1) specifies 3 plots. Try:

layout(1:4, 2, 2, byrow = T)

Regards,
Alex

-----Original Message-----
From: Oleg Bartunov [mailto:[EMAIL PROTECTED]
Sent: April 1, 2004 7:39 AM
To: R-help
Subject: [R] multiple plots problem

Hello, for testing/learning purposes I create an X11 device and specify a layout like layout(c(1,2,3), 3, 1), so I can play with parameters and see several plots at the same time. That works fine until I try to create a 4th plot - all the other plots are erased. Such behaviour isn't desirable for testing purposes, and I'm asking where to look to disable erasing the other plots.

Regards, Oleg

_ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
RE: [R] multiple plots problem
Correction: I should have written

layout(matrix(c(1,2,3,4), 2, 2, byrow = TRUE))

Sorry
Alex
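The corrected call can be checked as below. This is a self-contained sketch; the pdf(tempfile()) device is only there so the example also runs non-interactively:

```r
# Open a throwaway device so the example runs without an X11 display
f <- tempfile(fileext = ".pdf")
pdf(f)

# 2 x 2 grid: the fourth plot fills the fourth panel instead of
# starting a new page that erases the first three
layout(matrix(c(1, 2, 3, 4), 2, 2, byrow = TRUE))
for (n in c(10, 20, 30, 40)) plot(rnorm(n), rnorm(n))

dev.off()
```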
[R] identify() and controlling label size
I thought this was going to be easy ... Can the label size of identify() be controlled by setting par(cex.*)? I'm having no luck. My only recourse is to save the index and position of the labels from identify() and use text() to replot them.

Regards
Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
RE: [R] identify() and controlling label size
Thank you, it works! I have ignored ps and relied on the cex arguments until now.

Alex

-----Original Message-----
From: Barry Rowlingson [mailto:[EMAIL PROTECTED]
Sent: March 31, 2004 10:59 AM
To: Hanke, Alex
Cc: '[EMAIL PROTECTED]'
Subject: Re: [R] identify() and controlling label size

Hanke, Alex wrote: I thought this was going to be easy ... Can the label size of identify() be controlled by setting par(cex.*)? I'm having no luck. My only recourse is to save the index and position of the labels from identify() and use text() to replot them.

Funny, it seems to ignore that, but allows 'col=' and 'font=' to set the font colour and type. But it accepts 'ps=' to set the point size! You may be saved... Try:

plot(1:10)
identify(1:10, ps = 24)
identify(1:10, ps = 12)

Baz
RE: [R] strange thing with sd
Just so you don't feel that you are alone: I get the same response as you, using R 1.8.0 on an XP operating system.

Alex

-----Original Message-----
From: Andreas Pauling [mailto:[EMAIL PROTECTED]
Sent: March 29, 2004 10:02 AM
To: [EMAIL PROTECTED]
Subject: [R] strange thing with sd

Dear R people, I came across a strange thing:

sd(rep(0.01, 15))   # 0
sd(rep(0.001, 15))  # 4.489023e-19
sd(rep(0.1, 15))    # 0
sd(rep(0.0001, 15)) # 1.712427e-24
sd(rep(0.01, 13))   # 1.805557e-18
sd(rep(0.001, 13))  # 4.513894e-19
sd(rep(0.1, 13))    # 0
sd(rep(0.0001, 13)) # 0
sd(rep(5.01, 15))   # 0
sd(rep(5.001, 15))  # 4.489023e-19
sd(rep(5.1, 15))    # 1.838704e-15
sd(rep(5.0001, 15)) # 9.19352e-16
sd(rep(5.01, 13))   # 9.244454e-16
sd(rep(5.001, 13))  # 9.244454e-16
sd(rep(5.1, 13))    # 1.848891e-15
sd(rep(5.0001, 13)) # 0

Why does sd sometimes give zero and sometimes values close to zero, and why does it depend on the value and on how many times it is repeated? Shouldn't it always give zero? Is there a way to control this? I use R Version 1.8.1 under UNIX. Thanks for any suggestions!!

Andreas
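The behaviour comes from binary floating point, not from sd() itself: values like 0.1 have no exact base-2 representation, so the deviations from the mean need not cancel exactly. A sketch of how to compare against zero with a tolerance rather than exactly:

```r
x <- rep(0.1, 15)
s <- sd(x)
s                        # either 0 or a tiny residual, platform-dependent
isTRUE(all.equal(s, 0))  # TRUE: equal up to numerical tolerance
abs(s) < 1e-8            # the same idea with an explicit tolerance
```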
RE: [R] Setting the 'fig' graphic parameter
Try:

x <- rnorm(100)
y <- rnorm(100)
par(fig = c(0, 0.7, 0, 1))
plot(x, y)
# (please maximise your plotting device so you get a 'rectangular' area)
# now let's do the upper-corner 'little' plot
par(fig = c(0.7, 1, 0.5, 1), new = T)
plot(x, y)
# and ...
par(fig = c(0.7, 1, 0, 0.5), new = T)
plot(x, y)

-----Original Message-----
From: Mario dos Reis [mailto:[EMAIL PROTECTED]
Sent: March 22, 2004 12:23 PM
To: [EMAIL PROTECTED]
Subject: [R] Setting the 'fig' graphic parameter

Hi guys,

# I would like to plot a figure with the following layout:
# one large plot filling the left portion of the region, and two
# smaller plots stacked in the right column
# [ASCII-art diagram lost in transit]

x <- rnorm(100)
y <- rnorm(100)
par(fig = c(0, 0.7, 0, 1))
plot(x, y)
# (please maximise your plotting device so you get a 'rectangular' area)
# now lets do the upper corner 'little' plot
par(fig = c(0.7, 1, 0.5, 1))
plot(x, y)
# and ...
par(fig = c(0.7, 1, 0, 0.5))
plot(x, y)

Now, my problem is, as you might have seen already, that the old figure gets deleted when the new one is placed. I was trying to reproduce an exercise I saw in an S-plus book. I would really like to know what is going on - the documentation about graphic parameters is not very helpful about fig - and I would really like to set a graph with the above layout.

Thanks, Mario dos Reis.
RE: [R] Help to compare...
DF <- data.frame(V1 = c(0,8,6,4,3,1,2,9,6,5), V2 = 1:10)
DF[((DF$V1 >= 2 & DF$V1 < 8) * 1:10)[2:8], ]

-----Original Message-----
From: joseclaudio.faria [mailto:[EMAIL PROTECTED]
Sent: March 22, 2004 1:28 PM
To: [EMAIL PROTECTED]
Subject: [R] Help to compare...

Dear list, I need to submit the values V1 = 8,6,4,3,1,2,9 (Id = 2:8) of a data.frame (DF) like below

Id  V1  V2
1   0   1   ...
2   8   10  ...
3   6   2   ...
4   4   4   ...
5   3   7   ...
6   1   8   ...
7   2   6   ...
8   9   7   ...
9   6   1   ...
10  5   4   ...

to a selection (>= 2 and < 8), marking the remnants like below:

Id  V1  V2
1   0   1   ...
2   .   10  ...
3   6   2   ...
4   4   4   ...
5   3   7   ...
6   .   8   ...
7   2   6   ...
8   .   7   ...
9   6   1   ...
10  5   4   ...

How best to do that with R? Is there a command to compare all at the same time? I would be very thankful.

Yours sincerely, José Cláudio Faria, UESC/DCET, Brasil, 73-634.2779, [EMAIL PROTECTED] [EMAIL PROTECTED]
RE: [R] Reading Data
The following response by B. Ripley to a similar request may help. Alex

On Tue, 21 Oct 2003, Ernie Adorio wrote:

Am using R on a Linux box and am currently writing an interactive R script.

1. How do I ask a user to press any key to continue? I used a system call to read, but this only works if the Enter key is pressed:

print("Press any key to continue")
system("read")

You seem over-fond of the system() command! Try

cat("Press enter key to continue\n")
foo <- readLines(stdin(), 1)

In general R does not have a character-by-character interface with the keyboard.

2. How do I get a string input from the user? I would like to see an R function, say askget():

delay = askget("Enter delay in seconds")
system(paste("sleep", delay))

cat("Enter delay in seconds:\n")
Delay <- scan(, numeric(1))
Sys.sleep(Delay)

will do it (there are other more elegant ways, such as

testit <- function() {
  cat("Enter delay in seconds: ")
  Delay <- as.numeric(readLines(stdin(), 1))
  if (is.na(Delay)) stop("non-numeric input")
  Sys.sleep(Delay)
}

)

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595

-----Original Message-----
From: Kissell, Robert [EQRE] [mailto:[EMAIL PROTECTED]
Sent: March 19, 2004 12:47 PM
To: [EMAIL PROTECTED]
Subject: [R] Reading Data

Hi, Quick question on reading data. I am running simulations but want to allow the user the option to define the number of simulations. How can I have R read in user data entered from the keyboard? Is there a difference for reading in numeric and character data? For example, I am trying to write the following in R:

"Enter Number of Iterations?"  # the user then enters a number, say Y
X <- Y                         # I then want to read that number into X

Thanks.
RE: [R] How to ascertain the number of clusters automatically?
Hi,

You may be interested in a clustering algorithm called OPTICS. It is both interactive and automatic and does not require many input parameters. It is described as creating "an augmented ordering of the data representing its density-based clustering structure", from which it automatically and efficiently extracts not only traditional clustering information but also the intrinsic clustering structure. Check out this site: http://www.dbs.informatik.uni-muenchen.de/Forschung/KDD/Clustering/

Alex

-----Original Message-----
From: Fucang Jia [mailto:[EMAIL PROTECTED]
Sent: March 9, 2004 11:30 AM
To: [EMAIL PROTECTED]
Subject: [R] How to ascertain the number of clusters automatically?

Hi, everyone. There are many small cells which can be classified into several big cells from the scanned image. K-means clustering does not work well in this condition. I have done hierarchical clustering on the cells successfully, using the shortest distance between classes. The number of clusters is about 3, 4, 5, 6 or 7 generally. One can ascertain the number of clusters visually, but because there are thousands of images to be clustered this is tedious. I want to know if there are any methods that can be used to ascertain the number of clusters automatically, especially in this case with only several clusters? Thank you very much!

Best, Fucang
RE: [R] Rank Simulations - Test statistic Help
Dear S,

Try rephrasing your question instead of altering the subject line. I can see why you haven't had any takers.

Alex

-----Original Message-----
From: S P [mailto:[EMAIL PROTECTED]
Sent: March 10, 2004 9:59 AM
To: [EMAIL PROTECTED]
Subject: [R] Rank Simulations - Test statistic Help

Hi all, I am a biostatistician and I have developed my own ranking system for clinical data. I would like to test its efficiency w.r.t. other ranking systems. I would like to simulate the data and, after assigning ranks to my observed scores (after neglecting dropouts), observe the type I error. If I want to do a Kruskal-Wallis type of test, what test statistic should I use to test for a difference of delta between the two groups (I know delta, since the data is generated with that difference)? The default K-W test statistics test for a NULL difference, and I want to see how frequently my ranking system rejects the correct null hypothesis of a delta difference. (So my data is now in 2 arrays of unequal lengths due to dropouts, which hold the ranks for drug and placebo.) Thank you in advance for your help.

~S
RE: [R] boot package
Hi,

The function anosim() in the vegan package or sample() in base may be of help to you.

Alex

-----Original Message-----
From: Rogério Rosa da Silva [mailto:[EMAIL PROTECTED]
Sent: March 4, 2004 4:57 AM
To: rhelp
Subject: [R] boot package

Dear all, As part of an ongoing study on the ecomorphology of ant communities, I have obtained a matrix with 156 rows (species) and 20 columns (several measurements of body shape) for 4 localities. For each community, I calculated a matrix of Euclidean distances between all pairs of species. From this matrix, I extracted two measures of community structure: i) I identified the distance from an individual to its nearest neighbour (NND) in the morphological space and then calculated the average of the NND (MNND); ii) the mean of the Euclidean distances (MED). NND and MED are of practical use in describing spatial relations between species. The results of the Euclidean distance studies in each of the four localities were compared with each other for evidence of repeating patterns. To determine whether or not the morphological arrangement of species in communities reflects internal organization, MNND and MED should be compared with randomly generated communities. Is it possible to write a function for the boot package to evaluate the values of average nearest-neighbour distance and of the mean of Euclidean distances in randomly generated communities? One set of 1000 random communities, each with 78, 90, 100 and 102 species, must be generated from the pool of species (156 species). The only restrictions on species composition are that no species can be placed in a community more than once, and only row values can be permuted. Sorry, I don't know how to define the statistic term in the boot() function from the boot library in R. I have studied the boot package, the argument lists of the functions and the function definitions. Can someone point me to the correct syntax?
I would very much appreciate any help. Best regards, Rogério
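A plain-R sketch of the sample() route Alex suggests: draw species without replacement from the pool and compute MNND and MED as Rogério defines them. The trait matrix here is invented, and only 100 replicates are drawn instead of 1000, for speed:

```r
set.seed(42)
pool <- matrix(rnorm(156 * 20), nrow = 156)  # species pool: 156 x 20 traits

null_stats <- function(pool, n_species) {
  # sample() without replacement => no species enters a community twice
  comm <- pool[sample(nrow(pool), n_species), , drop = FALSE]
  d <- as.matrix(dist(comm))                 # Euclidean distances
  diag(d) <- Inf                             # ignore self-distances
  c(MNND = mean(apply(d, 1, min)),           # mean nearest-neighbour distance
    MED  = mean(d[is.finite(d)]))            # mean of all pairwise distances
}

sim <- replicate(100, null_stats(pool, 78))  # null distribution, 78 species
rowMeans(sim)                                # average MNND and MED
```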
RE: [R] limit to complex plots?
try:

layout(matrix(c(0,0,0,0,0,1,0,2,0,0,0,0,0,3,0,4), nrow=4, byrow=TRUE))
plot(rnorm(10), rnorm(10))
plot(rnorm(20), rnorm(20))
plot(rnorm(30), rnorm(30))
plot(rnorm(40), rnorm(40))

-----Original Message-----
From: [EMAIL PROTECTED] [SMTP:[EMAIL PROTECTED]
Sent: Thursday, February 26, 2004 7:15 AM
To: [EMAIL PROTECTED]
Subject: [R] limit to complex plots?

Hello all. I am trying to create one figure, divided into 6 graphs/plots, each with an inset sub-figure. I can use the layout command to generate one figure with one inset sub-figure, but cannot seem to do it for multiple figures on one page. I've tried a test with the following code:

layout(matrix(c(1,2,3,4), nrow=2, byrow=TRUE))
plot(rnorm(10), rnorm(10))
plot(rnorm(10), rnorm(10))
plot(rnorm(30), rnorm(30))
plot(rnorm(40), rnorm(40))
layout.show(4)
# this works and gives me my one page with 4 figures on it

layout(matrix(c(0,0,0,0,0,1,0,2,0,0,0,0,0,3,0,4), nrow=4, byrow=TRUE))
par(new=TRUE)
plot(rnorm(10), rnorm(10))
par(new=TRUE)
plot(rnorm(20), rnorm(20))
par(new=TRUE)
plot(rnorm(30), rnorm(30))
par(new=TRUE)
plot(rnorm(40), rnorm(40))
# this is the part that doesn't.

I've tried only one 'par(new=TRUE)' command before ALL the plot commands, and as written above. The best I can get is 3 sub-figures: #2, 3 and 4, in positions 1, 2 and 3. Has anyone figured this out?

thanks, Suzanne Blatt
[R] strptime() behaviour
Is it normal behaviour for strptime("29-Jan-01", "%d-%b-%y")$mon to return a value of 0?

strptime("29-Jan-01", "%d-%b-%y")$year  # works ok: 101
strptime("29-Jan-01", "%d-%b-%y")$mday  # works ok: 29

Regards,
Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
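Yes, this is documented behaviour for POSIXlt components: $mon counts months from 0 (January = 0, December = 11), just as $year counts years since 1900 (hence the 101 above). A quick check:

```r
lt <- strptime("29-Jan-01", "%d-%b-%y")
lt$mon            # 0 -> January (months run 0..11)
lt$mon + 1        # 1 -> the familiar month number
lt$year + 1900    # 2001
format(lt, "%m")  # "01", if a month string is what's wanted
```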
RE:[R] reshape direction=wide
v.names = c("var1", "var2") creates a separate column for each combination of the variables in v.names and each level of the variable identified by timevar.

The original question was: I am reshaping a data.frame bids -- reshaped as shown below. I thought this should be possible with a single invocation of reshape, but the only way I came up with was reshaping subsets for each keyword and then joining them together. Does anyone have an idea how to solve this in a more elegant way? Efficiency is a concern, as the datasets are very large. Is there a way to specify multiple v.names?

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
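A toy sketch of passing multiple v.names in one reshape call (the column names and data are invented; the real "bids" data.frame is not shown in the post):

```r
# Long format: two value columns (var1, var2) per id x keyword cell
long <- data.frame(
  id      = rep(1:3, each = 2),
  keyword = rep(c("a", "b"), times = 3),
  var1    = 1:6,
  var2    = 101:106
)

# One call handles both value columns: you get var1.a, var2.a, var1.b, var2.b
wide <- reshape(long, direction = "wide",
                idvar = "id", timevar = "keyword",
                v.names = c("var1", "var2"))
wide
```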
[R] glm.poisson.disp versus glm.nb
In response to my own query (see below): the estimate theta from glm.nb is actually 1/phi (or 1/alpha in some texts), where phi is the dispersion parameter for the negative binomial distribution. However, the dispersion estimate from glm.poisson.disp does not equal 1/theta, because 1/theta is a maximum likelihood (ML) estimate whereas the dispersion estimate from glm.poisson.disp is based on the method of moments (MM). Correct me if I'm wrong!

The two estimates of dispersion are fairly different. Paul and Banerjee (1998) observe that the test statistic discussed below (TNBI) is more conservative when one uses the moment estimate of phi, and that phi is underestimated by ML and overestimated by MM. However, the overestimation by MM is slight compared to the underestimation by ML. Does that then mean MM estimates of phi are to be preferred?

Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9
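A sketch of the ML side of this comparison with MASS::glm.nb on the quine data, plus a moment-type dispersion estimate of my own from a Poisson fit's residuals. glm.poisson.disp (Luca Scrucca's code) is not on CRAN, so the MM line below is an analogue in its spirit, not a reproduction of it; the model formula is invented for illustration:

```r
library(MASS)

# ML: glm.nb estimates theta, so the NB2 dispersion is phi = 1/theta,
# giving Var(y) = mu * (1 + mu * phi)
fit    <- glm.nb(Days ~ Age + Sex, data = quine)
phi_ml <- 1 / fit$theta

# Moment-type analogue: under NB2, E[(y - mu)^2 - mu] = phi * mu^2,
# so average ((y - mu)^2 - mu) / mu^2 over the residual df of a Poisson fit
pfit   <- glm(Days ~ Age + Sex, family = poisson, data = quine)
mu     <- fitted(pfit)
phi_mm <- sum(((quine$Days - mu)^2 - mu) / mu^2) / df.residual(pfit)

c(ML = phi_ml, MM = phi_mm)
```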
[R] glm.poisson.disp versus glm.nb
Dear list,

This is a question about overdispersion and the ML estimates of the parameters returned by the glm.poisson.disp (L. Scrucca) and glm.nb (Venables and Ripley) functions. Both appear to assume a negative binomial distribution for the response variable. Paul and Banerjee (1998) developed C(alpha) tests for interaction and main effects, in an unbalanced two-way layout of counts involving two fixed factors, when data are Poisson distributed, and when data are extra-dispersed. In R I coded their C(alpha) test statistic (called TNBI) for interaction for the case where the counts are extra-dispersed, as well as their test for extra-dispersion (called T_a). Using the Quine data set (Quine, 1975), the authors collapse the original 4x2x2x2 study design into a 2x4 table and obtained estimates of TNBI=10.36 and T_a=90.81. Using the dispersion estimate from glm.poisson.disp and the estimates for mu, I got exactly the same results for TNBI and T_a. This made me happy.

Now I thought to try the ML estimates from glm.nb to see if the results would be the same, but I am having difficulty relating the dispersion phi from glm.poisson.disp to the theta estimated by glm.nb. According to the R help for glm.poisson.disp, Var(y_i) = mu_i*(1 + mu_i*phi). The help for glm.nb led me to a book by VR (1994), which indicates that Var(y) = mu + mu^2/theta. From this I gathered that phi = 1/theta, but the estimates do not relate to each other in this way unless one is in error. In a document by L.P. Ammann, he says a negative binomial model can be specified with mean mu and dispersion phi by taking theta = mu/(phi-1). I had a problem implementing this because, in my mind, mu is a vector whereas phi and theta are scalars. Consequently, I would like to know how to get phi from theta, so that I can compare the glm.poisson.disp and glm.nb methods for estimating dispersion.

Regards,
Alex

Alex Hanke Department of Fisheries and Oceans St. Andrews Biological Station 531 Brandy Cove Road St. Andrews, NB Canada E5B 2L9