Re: [R] scope argument in step function
> What I wondered is if I pass 'upper=~.', it seems step() thinks the full model is the current one, and does not add anything more. If this is the right answer, is there a better way than creating the fmla argument as above?

Yes, that is exactly what the help page for step says it means. (So why are you unsure and asking?)

upper = terms(Response ~ ., data = ds3)

would appear to be a simpler way to get your intention.

On Fri, 1 Jul 2005, Young Cho wrote:

Thanks a lot for help in advance. I am switching from Matlab to R and I guess I need some time to get rolling. I was wondering why this code:

fit.0 <- lm(Response ~ 1, data = ds3)
step(fit.0, scope = list(upper = ~., lower = ~1), data = ds3)

Start:  AIC= -32.66
Response ~ 1

Call:
lm(formula = Response ~ 1, data = ds3)

Coefficients:
(Intercept)
      1.301

is not working any differently from the following:

cnames <- dimnames(ds3)[[2]]
cnames <- cnames[-444]   # last col is Response
fmla <- as.formula(paste("~", paste(cnames, collapse = "+")))
fit.s <- step(fit.0, scope = list(upper = fmla, lower = ~1), data = ds3)

Start:  AIC= -32.66
Response ~ 1
...
Step:  AIC= -Inf
Response ~ ENTP9324 + CH1W0281

           Df Sum of Sq     RSS   AIC
<none>                           -Inf
- CH1W0281  3   0.00381 0.00381  -115
- ENTP9324  9         1       1   -34

The data frame ds3 is 17 by 444, and I understand it is not a smart thing to run stepwise regression on it. What I wondered is if I pass 'upper=~.', it seems step() thinks the full model is the current one, and does not add anything more. If this is the right answer, is there a better way than creating the fmla argument as above?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,     Tel: +44 1865 272861 (self)
1 South Parks Road,            +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax: +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
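Ripley's terms() suggestion can be tried on any data frame. A minimal sketch, using the built-in mtcars data in place of the poster's ds3 (so mpg stands in for Response):

```r
## Intercept-only start, as in the original post
fit.0 <- lm(mpg ~ 1, data = mtcars)

## 'upper = ~.' is read relative to the current (intercept-only) model,
## so step() would have nothing to add.  Building the upper scope with
## terms() on the data frame gives the intended full model instead:
upr <- terms(mpg ~ ., data = mtcars)
fit.s <- step(fit.0, scope = list(lower = ~1, upper = upr), trace = 0)

length(coef(fit.s))   # > 1: predictors were actually added
```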
[R] Generating correlated data from uniform distribution
While you are looking at weird distributions, here is one that we have used in experiments on noise masking to explore the bandwidth of visual mechanisms:

D'Zmura, M. & Knoblauch, K. (1998). Spectral bandwidths for the detection of color. Vision Research, 20, 3117-28.

Monaci, G., Menegaz, G., Susstrunk, S. & Knoblauch, K. (2004). Chromatic contrast detection in spatial chromatic noise. Visual Neuroscience, 21(3), 291-294.

N <- 1
x <- runif(N, -.5, .5)
y <- runif(N, -abs(x), abs(x))
plot(x, y)

y is not uniform marginally, but it is uniform conditional on x. The plot reveals why we called this "sectored" noise.

HTH,
ken

Jim Brennan (jfbrennan at rogers.com) writes:

> Yes, you are right, I guess this works only for normal data. Free advice sometimes comes with too little consideration :-) Worth every cent... Sorry about that, and thanks to Spencer for the correct way.

Hmm, but is it? Or rather, what is the relation between the correlation of the normals and that of the transformed variables? Looks nontrivial to me. Incidentally, here's a way that satisfies the criteria, but in a rather weird way:

N <- 1
rho <- .6
x <- runif(N, -.5, .5)
y <- x * sample(c(1, -1), N, replace = TRUE, prob = c((1 + rho)/2, (1 - rho)/2))

--
O__       ---- Peter Dalgaard          Øster Farimagsgade 5, Entr.B
c/ /'_    ---- Dept. of Biostatistics  PO Box 2099, 1014 Cph. K
(*) \(*)  ---- University of Copenhagen, Denmark
Ph: (+45) 35327918  (p.dalgaard at biostat.ku.dk)  FAX: (+45) 35327907

Ken Knoblauch
Inserm U371, Cerveau et Vision
Department of Cognitive Neurosciences
18 avenue du Doyen Lepine, 69500 Bron, France
tel: +33 (0)4 72 91 34 77  fax: +33 (0)4 72 91 34 61
portable: 06 84 10 64 10
http://www.lyon.inserm.fr/371/
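A runnable version of the sectored-noise snippet above; the sample size in the original post was garbled, so the N used here is just an illustrative choice:

```r
set.seed(1)
N <- 1e4
x <- runif(N, -0.5, 0.5)
y <- runif(N, -abs(x), abs(x))  # given x, y is uniform on (-|x|, |x|)
## plot(x, y)  # the bowtie/sector shape is why it is "sectored" noise
cor(x, y)      # near zero: y is symmetric about 0 for every x
```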
[R] how to call sas in R
Hello all, I would like to know how to call SAS code from R. I simulate data in R and need to use SAS code (GARCH-t, EGARCH and GJR) to estimate it. I need to simulate 500 times with 2000 observations each. How can I call that code from R? Also, how can I keep the parameters from each estimate?

j = 1:500
i = 1:2000
# sas code, keep parameters

Best,
Luck
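R has no built-in SAS interface; the usual workaround is to run SAS in batch mode with system() and exchange data through files. The following is only a sketch: the file names, the "sas" command-line invocation, and the assumption that the SAS program reads simdata.csv and writes estimates.csv are all hypothetical and must be adapted to the actual SAS setup:

```r
## Hypothetical batch-mode round trip: R writes the simulated series,
## SAS fits the GARCH model and exports the estimates, R reads them back.
run_sas_fit <- function(y) {
  write.csv(data.frame(y = y), "simdata.csv", row.names = FALSE)
  status <- system("sas -sysin garch_fit.sas -noterminal")  # batch run
  if (status != 0) stop("SAS batch run failed")
  read.csv("estimates.csv")   # parameters exported by the SAS program
}

## 500 replications of 2000 observations each (not run here):
## est <- lapply(1:500, function(i) run_sas_fit(rnorm(2000)))
```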
Re: [R] Generating correlated data from uniform distribution
Jim Brennan [EMAIL PROTECTED] writes:

> OK, now I am skeptical, especially when you say "in a weird way" :-) This may be OK, but look at plot(x,y) and I am suspicious. Is it still all right with this kind of relationship? ...

N <- 1
rho <- .6
x <- runif(N, -.5, .5)
y <- x * sample(c(1, -1), N, replace = TRUE, prob = c((1 + rho)/2, (1 - rho)/2))

Well, the covariance is (everything has mean zero, of course)

E(XY) = (1+rho)/2 * E(X^2) + (1-rho)/2 * E(X * (-X)) = rho * E(X^2)

The marginal distribution of Y is a mixture of two identical uniforms (X and -X), so it is uniform, and in particular has the same variance as X. In summary,

E(XY) / sqrt(E(X^2) E(Y^2)) == rho

So, as I said, it satisfies the formal requirements: X and Y are uniformly distributed and their correlation is rho. If for nothing else, I suppose that this example is good for demonstrating that independence and uncorrelatedness are not the same thing.

--
O__       ---- Peter Dalgaard          Øster Farimagsgade 5, Entr.B
c/ /'_    ---- Dept. of Biostatistics  PO Box 2099, 1014 Cph. K
(*) \(*)  ---- University of Copenhagen, Denmark
Ph: (+45) 35327918  ([EMAIL PROTECTED])  FAX: (+45) 35327907
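Dalgaard's argument is easy to check empirically; the N and seed here are illustrative choices (the value in the original post was garbled):

```r
set.seed(42)
N <- 1e5; rho <- 0.6
x <- runif(N, -0.5, 0.5)
s <- sample(c(1, -1), N, replace = TRUE,
            prob = c((1 + rho)/2, (1 - rho)/2))
y <- x * s      # flip the sign of x with probability (1 - rho)/2
cor(x, y)       # close to rho = 0.6
range(y)        # the marginal of y is still confined to (-0.5, 0.5)
```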
Re: [R] Is it possible to use glm() with 30 observations?
On 2 Jul 2005, at 06:01, Spencer Graves wrote:

> The issue is not 30 observations but whether it is possible to perfectly separate the two possible outcomes. Consider the following:
>
> tst.glm <- data.frame(x = 1:3, y = c(0, 1, 0))
> glm(y ~ x, family = binomial, data = tst.glm)
>
> tst2.glm <- data.frame(x = 1:1000, y = rep(0:1, each = 500))
> glm(y ~ x, family = binomial, data = tst2.glm)
>
> The algorithm fits y~x to tst.glm without complaining, but issues warnings for tst2.glm. This is called the Hauck-Donner effect, and RSiteSearch("Hauck-Donner") just now produced 8 hits. For more information, look for Hauck-Donner in the index of Venables, W. N. and Ripley, B. D. (2002) _Modern Applied Statistics with S._ New York: Springer.

Not exactly. The phenomenon that causes the warning for tst2.glm above is more commonly known as "complete separation". For some comments on its implications you might look at another work by B. D. Ripley, the 1996 book Pattern Recognition and Neural Networks. There are some further references in the help files of the brlr package on CRAN.

The problem noted by Hauck and Donner (1977, JASA) is slightly related, but not the same. See the aforementioned book by Venables and Ripley, for example. The glm function does not routinely warn us about the Hauck-Donner effect, afaik.

The original poster did not say what the purpose of the logistic regression was, so it is hard to advise. Depending on the purpose, the separation that was detected may or may not be a problem.

Regards,
David

> (If you don't already have this book, I recommend you give serious consideration to purchasing a copy. It is excellent on many issues relating to statistical analysis and R.)
>
> Spencer Graves
>
> Kerry Bush wrote:
>> I have a very simple problem. When using glm to fit a binary logistic regression model, sometimes I receive the following warning:
>>
>> Warning messages:
>> 1: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...
>> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...
>>
>> What does this output tell me? Since I only have 30 observations, I assume this is a small sample problem. Is it possible to fit this model in R with only 30 observations? Could any expert provide suggestions to avoid the warning?

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
[EMAIL PROTECTED]  www.pdf.com
Tel: 408-938-4420  Fax: 408-280-7915
Re: [R] Is it possible to use glm() with 30 observations?
On 02-Jul-05 Kerry Bush wrote:
> I have a very simple problem. When using glm to fit a binary logistic regression model, sometimes I receive the following warning:
>
> Warning messages:
> 1: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...
> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...
>
> What does this output tell me? Since I only have 30 observations, I assume this is a small sample problem.

It isn't. Spencer Graves has shown clearly with two examples that you can get a fit with no warnings with 3 observations, and a fit with warnings for 1000 observations. As he says, it arises when you get perfect separation with respect to the linear model.

However, it may be worth expanding Spencer's explanation. With a single explanatory variable x (as in Spencer's examples), perfect separation occurs when y = 0 for all x <= some x0, and y = 1 for all x > x0. One of the parameters in the linear model is the scale parameter (i.e. the reciprocal of the slope). If you express the model in the form

logit(P(Y=1; x)) = (x - mu)/sigma

then sigma is the scale parameter in question. As sigma -> 0, P(Y=1; x) -> 0 for x < mu, and -> 1 for x > mu. Therefore, for any value of mu between x0 (at and below which all y = 0 in your data) and x1 (the next larger value of x, at and above which all y = 1), letting sigma -> 0 gives a fit which perfectly predicts your y-values: it predicts P(Y=1) = 0, i.e. P(Y=0) = 1, for x < mu, and predicts P(Y=1) = 1 for x > mu; and this is exactly what you have observed in the data. So it's not a disaster -- in model-fitting terms, it is a resounding success! However, in real life one does not expect to be dealing with a situation where the outcomes are so perfectly predictable, and therefore one views such a result with due mistrust. One attributes the perfect separation not to perfect predictability, but to the possibility that, by chance, all the y = 0 occur at lower values of x, and all the y = 1 at higher values of x.

> Is it possible to fit this model in R with only 30 observations? Could any expert provide suggestions to avoid the warning?

Yes! As Spencer showed, it is possible with 3 -- but of course it depends on the outcomes y. As to a suggestion to avoid the warning -- what you really need to avoid is data where the x-values are so sparse in the neighbourhood of the P(Y=1; x) = 0.5 area that it becomes likely that you will get y-values showing perfect separation. What that means in practice is that, over the range of x-values such that P(Y=1; x) rises from (say) 0.2 to 0.8 (chosen for illustration), you should have several x values in your data. Then the phenomenon of perfect separation becomes unlikely.

But what that means in real life is that you need enough data, over the relevant range of x-values, to enable you to obtain (with high probability) a non-zero estimate of sigma (i.e. an estimate of slope 1/sigma which is not infinite) -- i.e. that you have enough data, and in the right places, to estimate the rate at which the probability increases from low to high values. (Theoretically, it is possible to get perfect separation even with well-distributed x-values and sigma > 0; but with well-distributed x-values the chance that this would occur is so small that it can be neglected.)

So, to come back to your particular case, the really meaningful suggestion for avoiding the warning is that you need better data. If your study is such that you have to take the x-values as they come (as opposed to a designed experiment where you can decide what they are to be), then this suggestion boils down to "get more data". What that would mean depends on having information about the smallest value of sigma (largest value of slope) that is *plausible* in your context. Your data are not particularly useful for this, since they positively encourage adopting sigma = 0; objective information about it could only come from independent knowledge.

However, as a rule of thumb, in such a situation I would try to get more data until I had, say, 10 x-values roughly evenly distributed between the largest x for which y = 0 and the smallest for which y = 1. If that didn't work the first time, I would repeat, using the extended data as the starting-point. Or simply sample more data until the phenomenon of perfect separation was well avoided, and the S.D. of the x-coefficient was distinctly smaller than the value of the x-coefficient.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 02-Jul-05  Time: 10:45:04
-- XFMail --
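Ted's sigma -> 0 story can be seen directly in a toy fit. A minimal sketch with made-up data in which the outcomes separate perfectly at x = 5:

```r
## All y = 0 for x <= 5 and all y = 1 for x > 5: complete separation.
d <- data.frame(x = 1:10, y = rep(0:1, each = 5))
fit <- suppressWarnings(glm(y ~ x, family = binomial, data = d))
coef(fit)["x"]      # huge positive slope: the 1/sigma estimate diverging
range(fitted(fit))  # fitted probabilities numerically 0 and 1
```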
Re: [R] Generating correlated data from uniform distribution
On 02-Jul-05 Peter Dalgaard wrote:
> Jim Brennan [EMAIL PROTECTED] writes:
>> OK, now I am skeptical, especially when you say "in a weird way" :-) This may be OK, but look at plot(x,y) and I am suspicious. Is it still all right with this kind of relationship? ...
>
> N <- 1
> rho <- .6
> x <- runif(N, -.5, .5)
> y <- x * sample(c(1, -1), N, replace = TRUE, prob = c((1 + rho)/2, (1 - rho)/2))
>
> Well, the covariance is (everything has mean zero, of course)
>
> E(XY) = (1+rho)/2 * E(X^2) + (1-rho)/2 * E(X * (-X)) = rho * E(X^2)
>
> The marginal distribution of Y is a mixture of two identical uniforms (X and -X), so it is uniform, and in particular has the same variance as X. In summary, E(XY)/sqrt(E(X^2) E(Y^2)) == rho. So, as I said, it satisfies the formal requirements: X and Y are uniformly distributed and their correlation is rho. If for nothing else, I suppose that this example is good for demonstrating that independence and uncorrelatedness are not the same thing.

That was a nice sneaky solution! I was toying with something similar, but less sneaky, until I saw Peter's, on the lines of

x <- runif(2*N, -0.5, 0.5); ix <- (N-k):(N+k); y <- x; y[ix] <- -y[ix]

(which makes the same point about independence and correlation). The larger k is as a fraction of N, the more you swing from rho = 1 to rho = -1; but you cannot achieve, as Peter did, an arbitrary correlation coefficient rho, since the value depends on k, which can only take discrete values.

Another approach, which leads to a less special joint distribution, is

x <- sort(runif(N, -0.5, 0.5)); y <- sort(runif(N, -0.5, 0.5))

followed by a rho-dependent permutation of y. I'm still pondering a way of choosing the permutation so as to get a desired rho. The extremes are the identity, which for a given sample will give as close as you can get to rho = +1, and reversal, which gives as close as you can get to rho = -1. However, the maximum theoretical rho which you can get (as opposed to what is possible for particular samples, which may get arbitrarily close to +1) depends on N. For instance, with N = 3, it looks as though the theoretical rho is about 0.9 with the identity permutation (for N = 1000, however, just about all samples give rho > 0.99).

I smell a source of interesting exam questions ... Over to you!

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 02-Jul-05  Time: 12:22:09
-- XFMail --
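Ted's block-negation idea, made concrete. N and k are illustrative choices; since x is unsorted, negating a block of 2k+1 of the 2N points fixes the correlation at roughly 1 - 2(2k+1)/(2N):

```r
set.seed(1)
N <- 5000; k <- 1000
x <- runif(2 * N, -0.5, 0.5)
ix <- (N - k):(N + k)   # a central block of 2k+1 indices
y <- x
y[ix] <- -y[ix]         # negate that block
cor(x, y)               # about 1 - 2*(2*k + 1)/(2*N), i.e. ~0.6 here
```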
[R] interrupted Y axis
I did not find an answer to my question after a quick search using the R search engine, so I thought I'd ask away: does anyone know if a function exists to create an interrupted Y axis? What I mean by "interrupted Y axis" is that part of the Y axis has been removed, or excised, to permit one to see parts of the data in more detail.

Perhaps an example will make this clear. Please go to http://www.jbc.org/cgi/reprint/274/41/28950 and open the PDF document located there. Go to page 4; figure 2c provides a crude example of what I mean by an interrupted Y axis. Part of the Y axis between 800 and 4500 has been removed to permit easy inspection of the upper end of the range of data. (This is not my work but simply an example of what I'm trying to describe.) One finds these interrupted Y axis graphs in newspapers or other periodicals, more often than not as a bar chart.

Does a function exist in R to permit one to do this easily to a graph? If not, would such a function be useful? If yes, would the grid package be the right tool for me to try and implement this?

~Nick
[R] Cluster of iris data set from Mahalanobis distances
Dear R list,

My question: I'm trying to cluster 'Species' from the iris data set using Mahalanobis distances, as below:

cat('\n')
cat('Cluster analysis of iris data\n')
cat('from Mahalanobis distances obtained in SAS\n')
cat('\n')

n <- 3
dat <- c(0,
         89.86419,  0,
         179.38471, 17.20107, 0)
D <- matrix(0, n, n)
nam <- c('Set', 'Ver', 'Vir')
rownames(D) <- nam
colnames(D) <- nam
k <- 0
for (i in 1:n) {
  for (j in 1:i) {
    k <- k + 1
    D[i, j] <- dat[k]
    D[j, i] <- dat[k]
  }
}
D <- sqrt(D)  # dat holds the squared distances (D2)

dendroS <- hclust(as.dist(D), method = 'single')
dendroC <- hclust(as.dist(D), method = 'complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main = 'Single', sub = '', xlab = '', ylab = '', col = 'blue')
screen(2)
plot(dendroC, main = 'Complete', sub = '', xlab = '', col = 'red')

I'm not finding how to do this in the CRAN documentation (archives; packages: mclust, cluster, fpc and mva). Could you help me?

Regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
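The SAS step can be skipped: the squared Mahalanobis distances between the species means can be computed in R from the pooled within-species covariance matrix. A sketch (the pooled-covariance definition is an assumption about what SAS produced; the results should be comparable with the 89.86 / 179.38 / 17.20 values quoted above):

```r
X <- as.matrix(iris[, 1:4])
g <- iris$Species
means <- rowsum(X, g) / as.vector(table(g))   # species mean vectors

## Pooled within-species covariance matrix
S <- Reduce(`+`, lapply(levels(g), function(l) {
  Xl <- scale(X[g == l, ], center = TRUE, scale = FALSE)
  crossprod(Xl)
})) / (nrow(X) - nlevels(g))
Sinv <- solve(S)

## D2[i, j] = (m_i - m_j)' S^-1 (m_i - m_j)
D2 <- outer(1:3, 1:3, Vectorize(function(i, j) {
  d <- means[i, ] - means[j, ]
  drop(t(d) %*% Sinv %*% d)
}))
dimnames(D2) <- list(levels(g), levels(g))
round(D2, 3)

plot(hclust(as.dist(sqrt(D2)), method = "complete"))
```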
Re: [R] interrupted Y axis
On 7/2/05, Nick Drew [EMAIL PROTECTED] wrote:
> Does anyone know if a function exists to create an interrupted Y axis? [...]

See axis.break in the plotrix package.
Re: [R] interrupted Y axis
On Sat, 2005-07-02 at 06:09 -0700, Nick Drew wrote:
> Does anyone know if a function exists to create an interrupted Y axis? [...]

Using:

RSiteSearch("axis break")

comes up with 78 hits. You might want to consider some of the approaches taken, including the axis.break() function in the 'plotrix' package on CRAN.

Note however that the examples you cite (i.e. newspapers) are called "Pop Charts" by Bill Cleveland in his book The Elements of Graphing Data, and are highly criticized. If you have such disparate ranges you might want to consider separate plots and/or log scaling. Be careful, however: expanding the vertical axis to isolate specific data can result in the visual perception of a significant difference where none exists. This notion is also explored by Cleveland and by Tufte (The Visual Display of Quantitative Information).

HTH,
Marc Schwartz
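For readers who want to stay in base graphics rather than use plotrix, a broken axis can be faked by transforming the data and drawing the two axis pieces by hand. A sketch with invented data mimicking the 0-800 / 4500-6000 ranges of the cited figure (the gap() helper is ad hoc):

```r
set.seed(1)
y <- c(runif(20, 0, 800), runif(5, 4500, 6000))
gap <- function(v) ifelse(v >= 4500, v - 3600, v)  # collapse 800..4500
pdf(tempfile(fileext = ".pdf"))                    # off-screen device
plot(seq_along(y), gap(y), yaxt = "n", xlab = "index", ylab = "y")
axis(2, at = gap(c(0, 400, 800)), labels = c(0, 400, 800))
axis(2, at = gap(c(4500, 5250, 6000)), labels = c(4500, 5250, 6000))
abline(h = 850, lty = 2)                           # mark the break
dev.off()
```

As Marc notes, this kind of display is easy to misread, so the break should be marked prominently.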
Re: [R] Is it possible to use glm() with 30 observations?
I agree with Ted: "in model-fitting terms, it is a resounding success!" With any data set having at least one point with a binomial yield of 0 or 100%, you can get this phenomenon by adding series of random numbers sequentially to a model. Eventually, you will add enough variables that some linear combination of the predictors provides perfect prediction for that case.

In such cases, the usual asymptotic normality of the parameter estimates breaks down, but you can still test using 2*log(likelihood ratio) being approximately chi-square, with anova(fit1, fit2), as explained in Venables and Ripley and many other sources, e.g. ?anova.glm:

tst2.glm <- data.frame(x = 1:1000,
                       y = rep(0:1, each = 500))
fit0 <- glm(y ~ 1, family = binomial, data = tst2.glm)
fit1 <- glm(y ~ x, family = binomial, data = tst2.glm)
Warning messages:
1: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...
2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, ...

anova(fit0, fit1, test = "Chisq")
Analysis of Deviance Table

Model 1: y ~ 1
Model 2: y ~ x
  Resid. Df Resid. Dev Df Deviance  P(>|Chi|)
1       999     1386.3
2       998  6.764e-05  1   1386.3 1.999e-303

This effect exposes a limit to the traditional advice for experimental design to test at only two levels for each experimental factor, placed as far apart as possible without jeopardizing linearity. Here, you want to estimate a priori the range over which the probability of success goes from, say, 20% to 80%, then double that range and make all sets of test conditions unique and spaced roughly evenly over that range. Designing experiments with a binary response is an extremely difficult problem; this crude procedure should get you something moderately close to an optimal design.

Best wishes,
spencer graves
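The chi-square test Spencer describes can also be assembled by hand from the two deviances, which makes the "2*log(likelihood ratio)" identity explicit:

```r
tst2.glm <- data.frame(x = 1:1000, y = rep(0:1, each = 500))
fit0 <- glm(y ~ 1, family = binomial, data = tst2.glm)
fit1 <- suppressWarnings(glm(y ~ x, family = binomial, data = tst2.glm))

lr <- deviance(fit0) - deviance(fit1)   # = 2*log(likelihood ratio)
lr                                      # ~1386.3, matching the anova table
pchisq(lr, df = 1, lower.tail = FALSE)  # the P(>|Chi|) column
```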
[R] probability-probability plot
Hi, there. Is there any function in R to plot a probability-probability plot (P-P plot)? Suppose I am testing some data against the normal. Thanks.

Yulei

Yulei He
1586 Murfin Ave. Apt 37
Ann Arbor, MI 48105-3135
[EMAIL PROTECTED]
734-647-0305 (H)
734-763-0421 (O)  734-763-0427 (O)
734-764-8263 (fax)
[R] plot question
Dear list: in the following plot:

plot(rnorm(10), rnorm(10), xlab = "year",
     ylab = expression(paste('M x' * 10^{3})), font.lab = 2)

despite font.lab = 2, xlab and ylab come out in different fonts. I want both labels typeset the same way. Help?

a.d.
Re: [R] probability-probability plot
On 07/02/05 15:41, Yulei He wrote:
> Is there any function in R to plot the probability-probability plot (P-P plot)? Suppose I am testing some data against normal.

qqnorm might be what you want, or lead to it.

--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/
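qqnorm gives a Q-Q plot; a P-P plot as such takes only a few lines of base R. A sketch, assuming the reference normal is fitted by the sample mean and SD:

```r
set.seed(1)
x <- rnorm(100)
p.emp <- ppoints(length(x))              # empirical probabilities
p.the <- pnorm(sort(x), mean(x), sd(x))  # fitted-normal probabilities
plot(p.the, p.emp, xlab = "Theoretical P", ylab = "Empirical P",
     main = "Normal P-P plot")
abline(0, 1)   # points near this line indicate agreement with normality
```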
Re: [R] Specifying the minimum and maximum of x and y-axis in plot
See ?plot.default. For example:

plot(1:10, 1:10, ylim = c(0, 10), xlim = c(0, 20))

----- Original Message -----
From: Tolga Uzuner [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Saturday, July 02, 2005 9:39 PM
Subject: [R] Specifying the minimum and maximum of x and y-axis in plot

> Hi, how do I specify the minimum and maximum of the x and y-axis when using plot? Thanks, Tolga
[R] how to set the position and size of the select.list window
Hello, I use select.list to obtain a window of select items, but how can I set the position and size of this window? Thank you!

shengzhe
[R] jagged array
Hi, I have a jagged array which looks like this:

res
...
[[996]]
[1] 1.375 3.375 4.125 4.625 4.875 4.875 6.625 7.125 8.875
[[997]]
[1] 1.875 5.125 6.875 7.875 9.875
[[998]]
[1] 1.875 5.375 6.625 6.875 8.125 9.375 9.625
[[999]]
[1] 1.875 6.875 9.875
[[1000]]
[1] 1.875 6.125 6.875 9.375 9.625

Now, I want to use the first row, so I say res[1]:

res[1]
[[1]]
[1] 3.125 4.375 4.625 5.125 5.375 6.375 6.875 7.875 9.125 9.125 9.375 9.375

How do I access an element within this? I try

res[1][1]
[[1]]
[1] 3.125 4.375 4.625 5.125 5.375 6.375 6.875 7.875 9.125 9.125 9.375 9.375

which returns the same vector... ??? How do I get to 3.125?

Thanks
Re: [R] jagged array
Please read "An Introduction to R" and the help file for "[", where this is explained. If you want to use R, you need to expend effort to learn it.

-- Bert Gunter

-----Original Message-----
From: Tolga Uzuner
Sent: Saturday, July 02, 2005 4:41 PM
Subject: [R] jagged array
> Hi, I have a jagged array ... How do I get to 3.125? [...]
Re: [R] jagged array
Berton Gunter wrote:
> Please read "An Introduction to R" and the help file for "[", where this is explained. If you want to use R, you need to expend effort to learn it. [...]

tx
Re: [R] plot question
Trying characters and expressions variously, it seems that font.lab applies to character-string labels but not to expressions, so if you want an expression in bold, apply bold() (or whatever face you want) explicitly within the expression. One gotcha: bold() will not work as one might expect on bare numbers, so they must be represented as character strings -- which is why "3" rather than 3 is used below.

plot(rnorm(10), rnorm(10),
     xlab = quote(bold(year)),
     ylab = quote(bold("M x" * 10^"3")))

On 7/2/05, alex diaz [EMAIL PROTECTED] wrote:
> dear list:
>
> in the following plot:
>
> plot(rnorm(10), rnorm(10), xlab = "year",
>      ylab = expression(paste('M x' * 10^{3},)),
>      font.lab = 2)
>
> font.lab = 2, but xlab and ylab come out different. I want both labels styled the same way. help?
>
> a.d.
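The fix described above can be sketched end to end; everything here is base R graphics (quote(bold(...)) is standard plotmath), and the null device is used only so the sketch runs without opening a window:

```r
## Sketch of the bold-label fix.  font.lab = 2 would bold a
## character-string label only; here bold() is applied inside each
## label expression instead.  The exponent is the string "3" because
## plotmath bold() leaves bare numbers in the regular face.
set.seed(1)
pdf(file = NULL)                       # null graphics device (no file)
plot(rnorm(10), rnorm(10),
     xlab = quote(bold(year)),         # bold applied within the expression
     ylab = quote(bold("M x" * 10^"3")))
invisible(dev.off())
```

Both labels then render in the same bold face, with no reliance on font.lab.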
Re: [R] jagged array
Access list items using "[[". res[1] is a list with only one element; res[1][1] is that same one-element list again. res[[1]] is the first element of the list itself, so res[[1]][1] in your example should be 3.125.

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915
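The "[" versus "[[" distinction can be seen with a small jagged list standing in for res (the numbers here are illustrative, not the poster's data):

```r
## "[" on a list returns a sub-list; "[[" extracts the element itself.
res <- list(c(3.125, 4.375, 4.625, 5.125),
            c(1.875, 5.125, 6.875),
            c(1.875, 6.875, 9.875))

res[1]         # a length-1 list holding the first vector
res[1][1]      # the same length-1 list again -- hence the confusion
res[[1]]       # the first vector itself
res[[1]][1]    # its first element: 3.125
```

To pull the first element of every component at once, sapply(res, `[`, 1) does the job.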
Re: [R] probability-probability plot
There are PP plots and QQ plots. I've tried PP plots and didn't get much from them. The normal QQ plot (qqnorm in R), however, I use for all kinds of things. In a data-mining situation I'll get, e.g., 500 p-values, all uniformly distributed under the null hypothesis. I'll transform that null distribution to N(0, 1) with qnorm(p.values) and plot qqnorm(qnorm(p.values), datax = TRUE). I may be chastised severely by Tukeyites for datax = TRUE, but the human eye decodes a 45-degree line more easily than any angle other than horizontal or vertical. If you have a couple of mild outliers, then with a typical aspect ratio the datax = TRUE option will make the line moderately close to 45 degrees.

spencer graves

Jonathan Baron wrote:
> On 07/02/05 15:41, Yulei He wrote:
>> Hi, there. Is there any function in R to plot the probability-probability plot (PP plot)? Suppose I am testing some data against normal.
>
> qqnorm might be what you want, or lead to it.
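The workflow described above can be sketched with simulated null p-values (the count of 500 matches the example; the seed is arbitrary):

```r
## Under the null, p-values are Uniform(0,1), so qnorm() maps them to
## N(0,1); departures from a straight qqnorm line then flag non-null
## signal among the p-values.
set.seed(42)
p.values <- runif(500)          # stand-in for 500 data-mining p-values
z <- qnorm(p.values)            # transform the null distribution to N(0, 1)

pdf(file = NULL)                # null device so the sketch runs anywhere
qqnorm(z, datax = TRUE)         # datax = TRUE: data on the horizontal axis
qqline(z, datax = TRUE)
invisible(dev.off())
```

With real p-values, points peeling away from the line in the tails are the cases worth a closer look.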