Re: [R] Score Test Function
On Jun 12, 2011, at 07:54 , bill.venab...@csiro.au bill.venab...@csiro.au wrote: The score test looks at the effect of adding extra columns to the model matrix. The function glm.scoretest takes the fitted model object as the first argument and the extra column, or columns, as the second argument. Your x2 argument has length only 3. Is this really what you want? I would have expected you need to specify a vector of length nrow(DF), [as in the help information for glm.scoretest itself]. glm.scoretest will only do single-df tests, so it's not going to help here. Notice that the test requested is a whole-model test, i.e. a comparison of the fitted model with an intercept-only model (AKA a null model). It is not a goodness of fit test (which is a good thing as those are often dubious with binary responses). In R-devel, we can do score tests for such model comparisons as follows: mod2-glm(reading.recommendation~1,family=binomial,data=DF) anova(mod1,mod2,test=Rao) Analysis of Deviance Table Model 1: reading.recommendation ~ reading.score + gender Model 2: reading.recommendation ~ 1 Resid. Df Resid. Dev Df Deviance Rao Pr(Chi) 1 186 224.64 2 188 234.67 -2 -10.033 -9.53 0.008523 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 This is pretty close to the cited SAS result. I cannot tell where the .01 discrepancy creeps in, but the GLM algorithm is not 1-step convergent for the null model, even though the solution can be written down explicitly. (I don't have SAS to hand, but if anyone does, it would be interesting to see if it still says 9.5177 with the same data). With the current R, the closest you get is the asymptotically equivalent LRT: anova(mod1,mod2,test=Chisq) Analysis of Deviance Table Model 1: reading.recommendation ~ reading.score + gender Model 2: reading.recommendation ~ 1 Resid. Df Resid. Dev Df Deviance Pr(Chi) 1 186 224.64 2 188 234.67 -2 -10.033 0.006626 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to compute the P-value for a mixture of chi^2 distri
On 12-Jun-11 01:36:00, Thomas Lumley wrote: On Sun, Jun 12, 2011 at 12:44 PM, Tiago Pereira tiago.pere...@mbe.bio.br wrote: The test I am working on has an asymptotic 0.5*chi2(1)+0.5*chi2(2) distribution, where numbers inside parenthesis stand for the degrees of freedom. Is is possible to compute quickly in R the cumulative distribution of that distribution? There appear to be pchibar() functions in both the ibdreg and ic.infer packages that should do want you want. Simulation is also fairly efficient. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland Is there anything wrong with the following argument: In Tiago's notation, let X have the mixed chi2 distribution: X ~ chi2(1) with probability p X ~ chi2(2) with probability q = (1-p) Then the cumulative distribution of X is Prob(X = x) = p*Prob(chi2(1) = x) + q*prob(chi2(2) = x) so the CDF is p*pchisq(x,1) + (1-p)*pchisq(x,2) Inverting this to find the value of x for a given value P of Prob(X = x) = P is another matter, but should be reliably solvable by using R's uniroot() function, with lower and upper endpoints taken from whichever of qchisq(P,1), qchisq(P,2) is smallest, and whichever is largest. Ted. E-Mail: (Ted Harding) ted.hard...@wlandres.net Fax-to-email: +44 (0)870 094 0861 Date: 12-Jun-11 Time: 09:37:46 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to compute the P-value for a mixture of chi^2 distributions
On Jun 12, 2011, at 03:36 , Thomas Lumley wrote: On Sun, Jun 12, 2011 at 12:44 PM, Tiago Pereira tiago.pere...@mbe.bio.br wrote: The test I am working on has an asymptotic 0.5*chi2(1)+0.5*chi2(2) distribution, where numbers inside parenthesis stand for the degrees of freedom. Is is possible to compute quickly in R the cumulative distribution of that distribution? There appear to be pchibar() functions in both the ibdreg and ic.infer packages that should do want you want. Simulation is also fairly efficient. Assuming that you mean a 50-50 mixture of the two, it should also work just to take the average of the two CDFs. The integral is a linear operator after all. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Test if data uniformly distributed (newbie)
On Fri, Jun 10, 2011 at 10:15:36PM +0200, Kairavi Bhakta wrote: Thanks for your answer. The reason I want the data to be uniform: It's the first step in a machine learning project I am working on. If I know the data isn't uniformly distributed, then this means there is probably something wrong and the following steps will be biased by the non-uniform input data. I'm not checking an assumption for another statistical test. Actually, the data has been normalized because it is supposed to represent a probability distribution. That's why it sums to 1. My assumption is that, for a vector of 5, the data at that point should look like 0.20 0.20 0.20 0.20 0.20, but of course there is variation, and I would like to test whether the data comes close enough or not. As others told you, this is not the right format for KS test. The words testing uniformity can mean different things and the meaning depends on which statistical model you assume. If we have a random variable with values in [0, 1], then testing uniformity means to test, to which extent its distribution is close to the uniform distribution on [0, 1]. The numbers, which concentrate around 0.2, will not satisfy this. If we have a discrete variable with k values, for which we have m independent observations, and the number of observations of value i is m_i, then it is possible to test, whether the variable has the uniform distribution on {1, ..., k} using Chi-squared test. Note that for this test, the original counts are needed, not their normalized values, which sum up to 1. For example, if we have 20 observations and the counts (m_1, ..., m_5) are (4, 3, 5, 2, 6), then this is quite consistent with the assumption of uniform distribution. On the other hand, if we have 200 observations and the counts are (40, 30, 50, 20, 60), then the null hypothesis of uniform distribution may be rejected (the uniform distribution is the default, see argument p in ?chisq.test) x - c(40, 30, 50, 20, 60) chisq.test(x) Chi-squared test for given probabilities data: x X-squared = 25, df = 4, p-value = 5.031e-05 It is not clear, whether this is suitable for your application. If you generate the values in a different way, then another test may be needed. Can you specify more detail on how the numbers are generated? Petr Savicky. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] logistic regression where the independant variable is a ratio
Dear Lister, I have collected data in 6 geographical areas on prevalence of a parasite in humans and in foxes. The results are expressed as a number of positive or negative cases in human and foxes in the following data.frame: Pvtab - structure(list(posHum = c(3, 5, 3, 17, 0, 4), negHum = c(32631, 16293, 27988, 231282, 53215, 51046), posFox = c(18, 23, 18, 191, 12, 55), negFox = c(14, 24, 62, 105, 55, 43)), .Names = c(posHum, negHum, posFox, negFox), row.names = c(zone 1, zone 2, zone 3, zone 4, zone 5, zone 6), class = data.frame) I want to check a possible link between prevalences in humans (the reponse variable) and prevalences in foxes (the independant variable). I though about a logistic regression of the form: pvFox-Pvtab$posFox/(Pvtab$posFox+Pvtab$negFox) # computes the prevalence in foxes for each area mod0-mod0-glm(cbind(Pvtab$posHum,Pvtab$negHum)~pvFox,family=binomial) But in this cas the number of foxes that have been used to compute the prevalence estimate in foxes (pvFox) is deliberatly not taken into account in the model. I can hardly figure out how to do it (weighing the model with the square root of the number of fox in each area ?). Any advise appreciated about how to model a prevalence as a response of another prevalence at best. Patrick __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rcspline.plot query
The actual dataframe that I have imported on R is Death adp 1 0 58.00 2 1 18.70 3 0 21.75 4 1 25.35 5 0 20.55 6 1 28.05 7 0 50.15 8 1 31.25 9 1 32.75 10 1 28.95 11 1 15.10 12 0 45.05 13 1 19.95 14 0 32.95 15 0 22.60 16 0 10.75 17 0 41.80 18 0 27.05 19 1 26.25 20 0 34.40 21 1 24.65 22 1 42.30 23 0 19.80 24 0 47.20 25 1 25.90 26 1 30.70 27 1 28.60 28 1 25.80 29 0 27.05 30 0 14.40 31 0 28.40 32 0 48.45 33 0 17.85 34 1 30.85 35 1 24.75 36 0 16.20 37 0 34.10 38 0 12.00 39 0 24.40 40 0 69.50 41 1 36.45 42 0 41.55 43 1 17.80 44 0 40.10 45 1 21.35 46 1 22.90 47 0 63.80 48 1 45.80 49 0 70.65 50 0 54.00 51 0 26.90 52 0 50.75 53 0 28.20 54 1 23.25 55 1 18.10 56 0 29.20 57 1 52.80 58 0 46.60 59 1 32.55 60 0 74.50 The R command that I actually use is rcspline.plot (adp, Death, model=logistic, nk=3, knots=NULL) and the rcspline.eval (adp, nk=3, inclx=FALSE, knots.only=FALSE, type=ordinary) For both the commands I receive the same message : fewer than 6 non missing observations. George -- View this message in context: http://r.789695.n4.nabble.com/rcspline-plot-query-tp3590233p3591679.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using categorical variable in multiple regression
Hello, I wanted to do the multiple regression on categorical predictor data there's variable x1,x2,x3 and x3 is categorical one. so i just used as.factor(x3) and then ran multiple regression is it a good way to do the multiple regression on categorical predictor data? and how can I interpret the estimates? also if using as.factor is a good way, is there any difference with doing dummy coding for categorical variable and then running regression? Thanks. -- View this message in context: http://r.789695.n4.nabble.com/using-categorical-variable-in-multiple-regression-tp3591517p3591517.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] partial correlation
how can I compute partial correlation there's four variables; income, age, educational level, race. income is indenpendent variable and race is nominal variable I want to calculate partial correlation or semi-partial correlation between income race help me out... thanks. -- View this message in context: http://r.789695.n4.nabble.com/partial-correlation-tp3591520p3591520.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] welch anova and post-hoc
dear r community... it loks like i won't be able to reach homogenity of variance for my dataset, so i end up with welch anova instead of regular anova. documentation on this test is rather scarce, so maybe someone here can enlighten me a bit: - do i understand that no two-way implementation of the welch anova has been developed yet? - is there a post-hoc test for welch anovas implemented in R? thanks a lot, lukas kohl __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] partial correlation
A good start might be to go to http://rseek.org and search for partial correlation to get information on the packages and functions that offer it. From there, if you have specific questions, describing your data, what you've tried, what results you've gotten, and what results you expect would be most helpful. Your question is rather too vague for a more detailed answer. Sarah On Sun, Jun 12, 2011 at 2:46 AM, setlist yespupp...@hanmail.net wrote: how can I compute partial correlation there's four variables; income, age, educational level, race. income is indenpendent variable and race is nominal variable I want to calculate partial correlation or semi-partial correlation between income race help me out... thanks. -- -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] welch anova and post-hoc
On Jun 12, 2011, at 12:50 , eldor ado wrote: dear r community... it loks like i won't be able to reach homogenity of variance for my dataset, so i end up with welch anova instead of regular anova. documentation on this test is rather scarce, so maybe someone here can enlighten me a bit: - do i understand that no two-way implementation of the welch anova has been developed yet? Has the THEORY been developed? For complete data, I suppose you can formulate it as a multivariate contrast test and employ Huynh-Feldt correction, but it is not quite the obvious generalization. Otherwise, formulate as a mixed model with error variance depending on treatment, but the degree-of-freedom adjustments are not implemented in lme. - is there a post-hoc test for welch anovas implemented in R? Again, what does it mean? pairwise.t.test(..., pool.sd = FALSE, var.eq = FALSE) will do various pairwise comparison and allow simple corrections of the p value (Bonferroni and close relatives). thanks a lot, lukas kohl __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] snow package
Hi I try parallelising some code using the snow package and the following lines: cl - makeSOCKcluster(8) pfunc - function (x) (if(x = (-th)) 1 else 0) ###correlation coefficient clusterExport(cl,c(pfunc,th)) cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) The parApply results in the error message: cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) Error in do.call(fun, lapply(args, enquote)) : could not find function fun. Any ideas? Best wishes Kristian Helmholtz Zentrum M?nchen Deutsches Forschungszentrum f?r Gesundheit und Umwelt (GmbH) Ingolst?dter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir?in B?rbel Brumme-Bothe Gesch?ftsf?hrer: Prof. Dr. G?nther Wess und Dr. Nikolaus Blum Registergericht: Amtsgericht M?nchen HRB 6466 USt-IdNr: DE 129521671 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rcspline.plot query
Please see my previous note and please read the documentation. Also note the difference between a character string literal in quotes and the name of a vector, which is usually not quoted. Frank Chalikias George wrote: The actual dataframe that I have imported on R is Death adp 1 0 58.00 2 1 18.70 3 0 21.75 4 1 25.35 5 0 20.55 6 1 28.05 7 0 50.15 8 1 31.25 9 1 32.75 10 1 28.95 11 1 15.10 12 0 45.05 13 1 19.95 14 0 32.95 15 0 22.60 16 0 10.75 17 0 41.80 18 0 27.05 19 1 26.25 20 0 34.40 21 1 24.65 22 1 42.30 23 0 19.80 24 0 47.20 25 1 25.90 26 1 30.70 27 1 28.60 28 1 25.80 29 0 27.05 30 0 14.40 31 0 28.40 32 0 48.45 33 0 17.85 34 1 30.85 35 1 24.75 36 0 16.20 37 0 34.10 38 0 12.00 39 0 24.40 40 0 69.50 41 1 36.45 42 0 41.55 43 1 17.80 44 0 40.10 45 1 21.35 46 1 22.90 47 0 63.80 48 1 45.80 49 0 70.65 50 0 54.00 51 0 26.90 52 0 50.75 53 0 28.20 54 1 23.25 55 1 18.10 56 0 29.20 57 1 52.80 58 0 46.60 59 1 32.55 60 0 74.50 The R command that I actually use is rcspline.plot (adp, Death, model=logistic, nk=3, knots=NULL) and the rcspline.eval (adp, nk=3, inclx=FALSE, knots.only=FALSE, type=ordinary) For both the commands I receive the same message : fewer than 6 non missing observations. George - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/rcspline-plot-query-tp3590233p3591951.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Divide matrix into multiple smaller matrices
Hi, I still haven't found a solution for this problem. Is there a way in which I can slice the objects in c based on the info in h? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/Divide-matrix-into-multiple-smaller-matrices-tp3552399p3591868.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using categorical variable in multiple regression
for categorical independent variables, regression models in R will generate dummy indicators based on your setting of contrasts (default contr.treatment). Use model.matrix(your model) to see how R does this internally. Weidong Gu On Sun, Jun 12, 2011 at 2:38 AM, setlist yespupp...@hanmail.net wrote: Hello, I wanted to do the multiple regression on categorical predictor data there's variable x1,x2,x3 and x3 is categorical one. so i just used as.factor(x3) and then ran multiple regression is it a good way to do the multiple regression on categorical predictor data? and how can I interpret the estimates? also if using as.factor is a good way, is there any difference with doing dummy coding for categorical variable and then running regression? Thanks. -- View this message in context: http://r.789695.n4.nabble.com/using-categorical-variable-in-multiple-regression-tp3591517p3591517.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] snow package
On Sun, 12 Jun 2011, Unger, Kristian, Dr. wrote: Hi I try parallelising some code using the snow package and the following lines: cl - makeSOCKcluster(8) pfunc - function (x) (if(x = (-th)) 1 else 0) ###correlation coefficient clusterExport(cl,c(pfunc,th)) cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) The parApply results in the error message: cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) Error in do.call(fun, lapply(args, enquote)) : could not find function fun. Any ideas? See the footer of this message: that example is not reproducible. What is 'th'? What is 'tms'? With some plausible guesses for those this works for me. The error message appears to be from snow's function docall(), but without even a traceback(), it is impossible to guess which call to docall() is involved. This is why we ask for a reproducible example. You are also missing the 'at a minimum' information requested in the posting guide. Best wishes Kristian Helmholtz Zentrum M?nchen Deutsches Forschungszentrum f?r Gesundheit und Umwelt (GmbH) Ingolst?dter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir?in B?rbel Brumme-Bothe Gesch?ftsf?hrer: Prof. Dr. G?nther Wess und Dr. Nikolaus Blum Registergericht: Amtsgericht M?nchen HRB 6466 USt-IdNr: DE 129521671 [[alternative HTML version deleted]] No HTML: seee the posting guide. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reshape:cast; error using ... in formula expression.
Yes, the basic problem is that you forgot to melt the data before trying to cast it. Hadley On Thursday, June 9, 2011, misterbray misterb...@gmail.com wrote: Dennis, doing some more research, and it seems you actually can include the ... term directly in the formula: cf. page 8 of http://www.had.co.nz/reshape/introduction.pdf (that article also explains why you might want to do so). It seems including the ... term only works, however, when your value column actually has the name value (e.g. using the value=my.val option yields the error). This was the bug that was catching me up yesterday. Thank you again Dennis, Yours, Rob -- View this message in context: http://r.789695.n4.nabble.com/Reshape-cast-error-using-in-formula-expression-tp3584721p3586597.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Linear model - coefficients
Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] snow package
Prof Ripley Thanks for your message - you are absolutely right. I should have provided more information. Meanwhile, I found out that loading an R workspace prevented parApply from working properly. The workspace contains many objects and I could not figure out which one was responsible for causing error. Obviously I had given an object a name that is already used for a function required by parApply. Many thanks for your help - and apologies for not following the R posting guidelines. Best wishes Kristian Unger Dr. Kristian Unger Arbeitsgruppenleiter Integrative Biologie / Head of Integrative Biology Group Abteilung für Strahlenzytogenetik / Research Unit of Radiation Cytogenetics Tel.: +49-89-3187-3515 Mob.: +49-160-90641879 Am 12.06.11 18:05 schrieb Prof Brian Ripley unter rip...@stats.ox.ac.uk: On Sun, 12 Jun 2011, Unger, Kristian, Dr. wrote: Hi I try parallelising some code using the snow package and the following lines: cl - makeSOCKcluster(8) pfunc - function (x) (if(x = (-th)) 1 else 0) ###correlation coefficient clusterExport(cl,c(pfunc,th)) cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) The parApply results in the error message: cor.c.f - parApply(cl,tms,c(1,2),FUN=pfunc) Error in do.call(fun, lapply(args, enquote)) : could not find function fun. Any ideas? See the footer of this message: that example is not reproducible. What is 'th'? What is 'tms'? With some plausible guesses for those this works for me. The error message appears to be from snow's function docall(), but without even a traceback(), it is impossible to guess which call to docall() is involved. This is why we ask for a reproducible example. You are also missing the 'at a minimum' information requested in the posting guide. Best wishes Kristian Helmholtz Zentrum M?nchen Deutsches Forschungszentrum f?r Gesundheit und Umwelt (GmbH) Ingolst?dter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir?in B?rbel Brumme-Bothe Gesch?ftsf?hrer: Prof. Dr. G?nther Wess und Dr. Nikolaus Blum Registergericht: Amtsgericht M?nchen HRB 6466 USt-IdNr: DE 129521671 [[alternative HTML version deleted]] No HTML: seee the posting guide. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH) Ingolstädter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir´in Bärbel Brumme-Bothe Geschäftsführer: Prof. Dr. Günther Wess und Dr. Nikolaus Blum Registergericht: Amtsgericht München HRB 6466 USt-IdNr: DE 129521671 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] add a curve that fits the highest values on the plot.
Hi everyone, I am given two vectors - for example - x= c(0:20) x [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 and y [1] 19.4 17.9 8.1 11.3 7.8 8.0 5.0 1.7 3.9 5.4 7.5 5.4 4.7 5.0 4.9 [16] 3.5 2.9 2.4 1.4 1.7 plot(x,y,xlim=range(x),ylim=range(y)) How can I draw a curve that fits the peaks on the plot? Thank you :) Best, Nanami [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot probabilities of occurrence of values in a vector using connected points
Hi all, I have been struggling with some plotting issues (maybe not that difficult but I am stuck). If I have 2 vectors : 1st with average values of the columns of a data frame, which, looks similar to : v: (1.4,3,3,3,4.5,0,0,0,0,0,0,0,25.5,6,6,4.2) 2nd with the probability of occurrence of each element of the first vector calculated like this : prob[i]=v[i]*100/sum(v). I would like to plot points representing, on the x axis the values of v, and on the y axis the probabilities: a(prob[i],v[i]). Afterwards I want to connect them by a line and calculate the confidence intervals for this distribution. I tried to plot using the *points* function, but, somehow it shows a perfect linear distribution - and it shouldn't. It should look more like a density function. Any suggestions? Best, Andra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] statistical model with censored independent variable
hello: Does anyone know any R function which handles statisitcal model when the independent variable is censored? I know survival package does the analysis for censored dependnent variable. thanks -- View this message in context: http://r.789695.n4.nabble.com/statistical-model-with-censored-independent-variable-tp3592462p3592462.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] automatically generate the output name of my for loops
Hello R users, I am new to R and am having difficulty with the output name of my for loops. here is the problem: for (i in c(1:100)) { the name of the groups - which(k1$cluster==i) } how can it automatically generate the name for 100 cluster? what should i put in the bold letter place. really thank you for helping me Daniel -- View this message in context: http://r.789695.n4.nabble.com/automatically-generate-the-output-name-of-my-for-loops-tp3592160p3592160.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NLS fit for exponential distribution
Hello there, I am trying to fit an exponential fit using Least squares to some data. #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) sub - data.frame(x,y) #If model is y = a*exp(-x) + b then fit - nls(y ~ a*exp(-x) + b, data = sub, start = list(a = 0, b = 0), trace = TRUE) This works well. However, if I want to fit the model : y = a*exp(-mx)+c then I try - fit - nls(y ~ a*exp(-m*x) + b, data = sub, start = list(a = 0, b = 0, m= 0), trace = TRUE) It fails and I get the following error - Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates Any suggestions how I can fix this? Also next I want to try to fit a sum of 2 exponentials to this data. So the new model would be y = a*exp[(-m1+ m2)*x]+c . Any suggestion how I can do this... Any help would be most appreciated. Thanks in advance. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R graphs differ from exported one
Hello everybody! This is my first mail so I'll write a couple of lines of self-introduction. My name is Massimiliano, I'm from Italy and I'm studying Mathematical Engineering. I started using R in my Statistics course and have to use it to make a project which I'll discuss at the end of the course. The problem I'd like to ask you about follows. Suppose I have imported a datafile with the classic command dat - read.table('file', header=T) and wanted to see if my data are Normal-like or not. I can accomplish this with the command qqnorm (col) where 'col' is the column in the datafile 'file'. Now, the graph that appears is very nice: indeed it has a title, two axes with their labels and all the rest; but when I give commands postscript(file=plot.eps, onefile=FALSE) qqnorm (col) to save the graph to a file plot.eps to include it in a TeX, the file created has nothing to do with the former one: it only has the graph part, i.e. no title, no labels, no axes I searched in the documentation but found nothing; the same on the forum. What should I do? I'm running R 2.12.1 on Ubuntu Linux 10.04 LL 64-bit Thanks for help to everybody :) Massimiliano [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Likelihood ratio test
Hello there, I want to perform a likelihood ratio test to check if a single exponential or a sum of 2 exponentials provides the best fit to my data. I am new to R programming and I am not sure if there is a direct function for doing this and whats the best way to go about it? #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) data - data.frame(x,y) Specifically, I would like to test if the model1 or model2 provides the best fit to the data- model 1: y = a*exp(-m*x) + c model 2: y = a*exp(-(m1+m2)*x) + c Likelihood ratio test = L(data| model1)/ L(data | model2) Any help would be most appreciated. Thanks in advance. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotnetwork {spaa} : how to get an absolute interval (i.e. not based on the range of input data) ?
I'm using plotnetwork {spaa} in order to get a correlation network plot of my data (e.g.: http://www.oga-lab.net/RGM2/func.php?rd_id=spaa:plotnetwork). By default, 'interval' argument indicate the number of intervals by which the range of input data is partitioned (the number of partitions between the lowest and the highest value, unless I am mistaken). I would like to get intervals that are not sensitive to the range of the data but rather arbitrarily defined by myself (in order to get exactly the same intervals accros multiple samples, in reference to p critic values of Pearson table). May you know how to do it ? I hope this is not a stupid question !! (I'm a beginner in R). Thanks a lot. maxTC [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in NLS example in the documentation
Hello there, I am trying to use R function NLS to analyze my data and one of the examples in the documentation is - ## the nls() internal cheap guess for starting values can be sufficient: x - -(1:100)/10 y - 100 + 10 * exp(x / 2) + rnorm(x)/10 nlmod - nls(y ~ Const + A * exp(B * x), trace=TRUE) plot(x,y, main = nls(*), data, true function and fit, n=100) curve(100 + 10 * exp(x / 2), col=4, add = TRUE) lines(x, predict(nlmod), col=2) However, when I try to execute this - I get the following error- Error in nls(y ~ Const + A * exp(B * x), trace = TRUE) : step factor 0.000488281 reduced below 'minFactor' of 0.000976562 In addition: Warning message: In nls(y ~ Const + A * exp(B * x), trace = TRUE) : No starting values specified for some parameters Can someone propose how I can fix this error? Thanks. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Running a GMM Estimation on dynamic Panel Model using plm-Package
Hello, although I searched for a solution related to my problem I didn´t find one, yet. My skills in R aren´t very large, however. For my Diploma thesis I need to run a GMM estimation on a dynamic panel model using the pgmm - function in the plm-Package. The model I want to estimate is: Y(t) = Y(t-1) + X1(t) + X2(t) + X3(t) . There are no normal instruments in this model. There just should be the gmm-instruments I need for the model. In order to estimate it, I tried the following code: library(plm) test - pgmm(Y ~ lag(Y, 1) + X1 + X2 + X3 | lag(Y, 1), data=Model, effect=individual, model=onestep) I tried Model as Modelp - pdata.frame(... and as Model - read.table(... but in both cases there´s an error-massage: Error in solve.default(Reduce(+, A2)) : System ist für den Rechner singulär: reziproke Konditionszahl = 4.08048e-22 Error in solve.default(Reduce(+, A2)) : System is singulary for the computer: reciprocal number of conditions = 4.08048e-22 I really can´t help myself since the standard plm-estimations within or first difference are working well. I hope you understood what I´m trying to do and my description is adequate. Thank you very much! Kind regards. bStudent -- View this message in context: http://r.789695.n4.nabble.com/Running-a-GMM-Estimation-on-dynamic-Panel-Model-using-plm-Package-tp3592466p3592466.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood ratio test
Hi Diviya, Take a look at the lrtest function in the lmtest package: install.packages('lmtest) require(lmtest) ?lrtest HTH, Jorge On Sun, Jun 12, 2011 at 1:16 PM, Diviya Smith wrote: Hello there, I want to perform a likelihood ratio test to check if a single exponential or a sum of 2 exponentials provides the best fit to my data. I am new to R programming and I am not sure if there is a direct function for doing this and whats the best way to go about it? #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) data - data.frame(x,y) Specifically, I would like to test if the model1 or model2 provides the best fit to the data- model 1: y = a*exp(-m*x) + c model 2: y = a*exp(-(m1+m2)*x) + c Likelihood ratio test = L(data| model1)/ L(data | model2) Any help would be most appreciated. Thanks in advance. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Side by side scatter plots with specified regression lines
I am new and self taught in R, so please bear with me. I want to create two scatter plots side by side. The data set includes measurements from two different countries with 7 treatments over a timeline (x-axis). Problem 1 I want to have each plot to include the data from one of the countries with 7 regression lines of the treatments, but I do no know how to divide the data between them. This is how I created one plot with all the data. plot(YEAR,YIELD,col=red,xlab=Year,ylab=Yield,xlim=c(1,4),ylim=c(1,150)) Problem 2 The models I've found to describe the regression lines of the treatments seems to be different than the default ablines that R creates. I have the values of the exact values of intercepts and slopes, but does not know how to add them to the graph. This is what I got so far. abline(lm(YIELD[TREATMENT==A]~YEAR[TREATMENT==A]),lty=2,col=1) I hope this is enough to give me some pointers, otherwise I will try to elaborate. Thank you for your help. -- View this message in context: http://r.789695.n4.nabble.com/Side-by-side-scatter-plots-with-specified-regression-lines-tp3592473p3592473.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
Prof. Ripley, thank you very much for the answer but wanted to get something else. There is an example and an explanation: options(contrasts=c(contr.sum,contr.poly)) # contr.sum uses ‘sum to zero contrasts’ Y - c(6,3,5,2,3,1,1,6,6,6,7,4,1,6,6,6,6,1) X - structure(list(x1 = c(2L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 3L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L), x2 = c(3L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L), x3 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L ), x4 = c(1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), x5 = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 1L, 2L, 3L)), .Names = c(x1, x2, x3, x4, x5), row.names = c(NA, 18L), class = data.frame) reg - lm( Y ~ factor(X$x1) + factor(X$x2) + factor(X$x3) + factor(X$x4) + factor(X$x5) ) coef(reg) and e.g. I get two coefficients for variable x1 (3-levels variable) but I would like to get the third. Of course I can calculate a3= -(a1+a2) where a1 and a2 are coefficients of the variable x1. I hope that I manage to explain my problem. Robert 2011/6/12 Prof Brian Ripley rip...@stats.ox.ac.uk: ?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] add a curve that fits the highest values on the plot.
Hi Nanami, I am not sure exactly what you mean by fits the peaks...are any of these plots what you want? x - 1:20 y - c(19.4, 17.9, 8.1, 11.3, 7.8, 8, 5, 1.7, 3.9, 5.4, 7.5, 5.4, 4.7, 5, 4.9, 3.5, 2.9, 2.4, 1.4, 1.7) ## Find peaks index - which(c(NA, diff(sign(diff(y == -2) dev.new() layout(matrix(c(1, 3, 2, 3), 2)) ## Upper left plot plot(x, y, xlim = range(x), ylim = range(y)) ## Upper right plot plot(x, y, type = l) ## bottom plot (points) plot(x, y) ## add the lines connecting the peaks lines(x[index], y[index]) HTH, Josh On Sun, Jun 12, 2011 at 12:11 PM, ads pit deconstructed.morn...@gmail.com wrote: Hi everyone, I am given two vectors - for example - x= c(0:20) x [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 and y [1] 19.4 17.9 8.1 11.3 7.8 8.0 5.0 1.7 3.9 5.4 7.5 5.4 4.7 5.0 4.9 [16] 3.5 2.9 2.4 1.4 1.7 plot(x,y,xlim=range(x),ylim=range(y)) How can I draw a curve that fits the peaks on the plot? Thank you :) Best, Nanami [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
Hi Robert, Try this: reg2 - lm( Y ~ factor(x1) + factor(x2) + factor(x3) + factor(x4) + factor(x5) - 1, data = X ) cof(ref2) HTH, Jorge On Sun, Jun 12, 2011 at 4:40 PM, Robert Ruser wrote: Prof. Ripley, thank you very much for the answer but wanted to get something else. There is an example and an explanation: options(contrasts=c(contr.sum,contr.poly)) # contr.sum uses sum to zero contrasts Y - c(6,3,5,2,3,1,1,6,6,6,7,4,1,6,6,6,6,1) X - structure(list(x1 = c(2L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 3L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L), x2 = c(3L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L), x3 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L ), x4 = c(1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), x5 = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 1L, 2L, 3L)), .Names = c(x1, x2, x3, x4, x5), row.names = c(NA, 18L), class = data.frame) reg - lm( Y ~ factor(X$x1) + factor(X$x2) + factor(X$x3) + factor(X$x4) + factor(X$x5) ) coef(reg) and e.g. I get two coefficients for variable x1 (3-levels variable) but I would like to get the third. Of course I can calculate a3= -(a1+a2) where a1 and a2 are coefficients of the variable x1. I hope that I manage to explain my problem. Robert 2011/6/12 Prof Brian Ripley : ?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R graphs differ from exported one
Hmm, did you shut the device down afterward (i.e., call dev.off() )? I do not have any logic why that would induce the behavior you say you are getting, but this works just fine for me: postscript(tmp.eps, onefile = TRUE) qqnorm(rnorm(20)) dev.off() and creates the attached file (possibly not attached for the list, but you should get it). Josh On Sun, Jun 12, 2011 at 10:00 AM, Massimiliano massi.ly...@gmail.com wrote: Hello everybody! This is my first mail so I'll write a couple of lines of self-introduction. My name is Massimiliano, I'm from Italy and I'm studying Mathematical Engineering. I started using R in my Statistics course and have to use it to make a project which I'll discuss at the end of the course. The problem I'd like to ask you about follows. Suppose I have imported a datafile with the classic command dat - read.table('file', header=T) and wanted to see if my data are Normal-like or not. I can accomplish this with the command qqnorm (col) where 'col' is the column in the datafile 'file'. Now, the graph that appears is very nice: indeed it has a title, two axes with their labels and all the rest; but when I give commands postscript(file=plot.eps, onefile=FALSE) qqnorm (col) to save the graph to a file plot.eps to include it in a TeX, the file created has nothing to do with the former one: it only has the graph part, i.e. no title, no labels, no axes I searched in the documentation but found nothing; the same on the forum. What should I do? I'm running R 2.12.1 on Ubuntu Linux 10.04 LL 64-bit Thanks for help to everybody :) Massimiliano [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ tmp.eps Description: PostScript document __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
Hi, but I want to get the coefficients for every variables from x1 to x5. (x1 was an example) Robert 2011/6/12 Jorge Ivan Velez jorgeivanve...@gmail.com: Hi Robert, Try this: reg2 - lm( Y ~ factor(x1) + factor(x2) + factor(x3) + factor(x4) + factor(x5) - 1, data = X ) cof(ref2) HTH, Jorge On Sun, Jun 12, 2011 at 4:40 PM, Robert Ruser wrote: Prof. Ripley, thank you very much for the answer but wanted to get something else. There is an example and an explanation: options(contrasts=c(contr.sum,contr.poly)) # contr.sum uses ‘sum to zero contrasts’ Y - c(6,3,5,2,3,1,1,6,6,6,7,4,1,6,6,6,6,1) X - structure(list(x1 = c(2L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 3L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L), x2 = c(3L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L), x3 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L ), x4 = c(1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), x5 = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 1L, 2L, 3L)), .Names = c(x1, x2, x3, x4, x5), row.names = c(NA, 18L), class = data.frame) reg - lm( Y ~ factor(X$x1) + factor(X$x2) + factor(X$x3) + factor(X$x4) + factor(X$x5) ) coef(reg) and e.g. I get two coefficients for variable x1 (3-levels variable) but I would like to get the third. Of course I can calculate a3= -(a1+a2) where a1 and a2 are coefficients of the variable x1. I hope that I manage to explain my problem. Robert 2011/6/12 Prof Brian Ripley : ?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood ratio test
On Sun, 12 Jun 2011, Jorge Ivan Velez wrote: Hi Diviya, Take a look at the lrtest function in the lmtest package: install.packages('lmtest) require(lmtest) ?lrtest Yes, when you have to nls() fits, say m1 and m2, you can do lrtest(m1, m2) However, I don't think that both m1 and m2 can be identified in y = a * exp(-(m1+m2) * x) + c Unless I'm missing something only the sum (m1+m2) is identified anyway. Best, Z HTH, Jorge On Sun, Jun 12, 2011 at 1:16 PM, Diviya Smith wrote: Hello there, I want to perform a likelihood ratio test to check if a single exponential or a sum of 2 exponentials provides the best fit to my data. I am new to R programming and I am not sure if there is a direct function for doing this and whats the best way to go about it? #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) data - data.frame(x,y) Specifically, I would like to test if the model1 or model2 provides the best fit to the data- model 1: y = a*exp(-m*x) + c model 2: y = a*exp(-(m1+m2)*x) + c Likelihood ratio test = L(data| model1)/ L(data | model2) Any help would be most appreciated. Thanks in advance. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in NLS example in the documentation
On Jun 12, 2011, at 20:25 , Diviya Smith wrote: Hello there, I am trying to use R function NLS to analyze my data and one of the examples in the documentation is - ## the nls() internal cheap guess for starting values can be sufficient: x - -(1:100)/10 y - 100 + 10 * exp(x / 2) + rnorm(x)/10 nlmod - nls(y ~ Const + A * exp(B * x), trace=TRUE) plot(x,y, main = nls(*), data, true function and fit, n=100) curve(100 + 10 * exp(x / 2), col=4, add = TRUE) lines(x, predict(nlmod), col=2) However, when I try to execute this - I get the following error- Error in nls(y ~ Const + A * exp(B * x), trace = TRUE) : step factor 0.000488281 reduced below 'minFactor' of 0.000976562 In addition: Warning message: In nls(y ~ Const + A * exp(B * x), trace = TRUE) : No starting values specified for some parameters Can someone propose how I can fix this error? Thanks. Works for me. However, if you predefine some of the parameters, all bets are off, e.g. B - 1000 example(nls) Try again with a clean workspace. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NLS fit for exponential distribution
On Jun 12, 2011, at 18:57 , Diviya Smith wrote: Hello there, I am trying to fit an exponential fit using Least squares to some data. #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) sub - data.frame(x,y) #If model is y = a*exp(-x) + b then fit - nls(y ~ a*exp(-x) + b, data = sub, start = list(a = 0, b = 0), trace = TRUE) This works well. However, if I want to fit the model : y = a*exp(-mx)+c then I try - fit - nls(y ~ a*exp(-m*x) + b, data = sub, start = list(a = 0, b = 0, m= 0), trace = TRUE) It fails and I get the following error - Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates If a==0, then a*exp(-m*x) does not depend on m. So don't use a=0 as initial value. Any suggestions how I can fix this? Also next I want to try to fit a sum of 2 exponentials to this data. So the new model would be y = a*exp[(-m1+ m2)*x]+c . That's not a sum of exponentials. Did you mean a*(exp(-m1*x) + exp(-m2*x)) + c? Anyways, same procedure with more parameters. Just beware the fundamental exchangeability of m1 and m2, so don't initialize them to the same value. Any suggestion how I can do this... Any help would be most appreciated. Thanks in advance. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NLS fit for exponential distribution
On Sun, 12 Jun 2011, peter dalgaard wrote: On Jun 12, 2011, at 18:57 , Diviya Smith wrote: Hello there, I am trying to fit an exponential fit using Least squares to some data. #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) sub - data.frame(x,y) #If model is y = a*exp(-x) + b then fit - nls(y ~ a*exp(-x) + b, data = sub, start = list(a = 0, b = 0), trace = TRUE) This works well. However, if I want to fit the model : y = a*exp(-mx)+c then I try - fit - nls(y ~ a*exp(-m*x) + b, data = sub, start = list(a = 0, b = 0, m= 0), trace = TRUE) It fails and I get the following error - Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates If a==0, then a*exp(-m*x) does not depend on m. So don't use a=0 as initial value. Any suggestions how I can fix this? Also next I want to try to fit a sum of 2 exponentials to this data. So the new model would be y = a*exp[(-m1+ m2)*x]+c . That's not a sum of exponentials. Did you mean a*(exp(-m1*x) + exp(-m2*x)) + c? OK, that makes more sense. Also, scaling of the variables may help. Something like this could work: ## scaled data d - data.frame(x = c(1, 1:10 * 10), y = 100 * c( 0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626)) ## model fits n1 - nls(y ~ a*exp(-m * x) + b, data = d, start = list(a = 1, b = 0, m = 0.1)) n2 - nls(y ~ a * (exp(-m1 * x) + exp(-m2 * x)) + b, data = d, start = list(a = 1, b = 0, m1 = 0.1, m2 = 0.5)) ## visualization plot(y ~ x, data = d) lines(d$x, fitted(n1), col = 2) lines(d$x, fitted(n2), col = 4) ## ANOVA anova(n1, n2) ## LR test library(lmtest) lrtest(n1, n2) which both seem to indicate that n1 is sufficient. hth, Z Anyways, same procedure with more parameters. Just beware the fundamental exchangeability of m1 and m2, so don't initialize them to the same value. Any suggestion how I can do this... Any help would be most appreciated. Thanks in advance. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Pointers in R
Hello Everyone, I am new to R and would like to create a quad tree in R. However the problem is that I don't think R has pointers. Is there any way to create a tree in R? Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Running a GMM Estimation on dynamic Panel Model using plm-Package
Hi! Am 12.06.2011 21:43, schrieb bstudent: Error in solve.default(Reduce(+, A2)) : System ist für den Rechner singulär: reziproke Konditionszahl = 4.08048e-22 Error in solve.default(Reduce(+, A2)) : System is singulary for the computer: reciprocal number of conditions = 4.08048e-22 Just for the record: I had the same error with my data and finaly gave up and used stata. Kind regards and good luck! Jan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
this may work. X-data.frame(sapply(X,function(x) as.factor(x))) reg3=lm(Y~.,data=X) dummy.coef(reg3) Weidong Gu On Sun, Jun 12, 2011 at 4:55 PM, Robert Ruser robert.ru...@gmail.com wrote: Hi, but I want to get the coefficients for every variables from x1 to x5. (x1 was an example) Robert 2011/6/12 Jorge Ivan Velez jorgeivanve...@gmail.com: Hi Robert, Try this: reg2 - lm( Y ~ factor(x1) + factor(x2) + factor(x3) + factor(x4) + factor(x5) - 1, data = X ) cof(ref2) HTH, Jorge On Sun, Jun 12, 2011 at 4:40 PM, Robert Ruser wrote: Prof. Ripley, thank you very much for the answer but wanted to get something else. There is an example and an explanation: options(contrasts=c(contr.sum,contr.poly)) # contr.sum uses ‘sum to zero contrasts’ Y - c(6,3,5,2,3,1,1,6,6,6,7,4,1,6,6,6,6,1) X - structure(list(x1 = c(2L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 3L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L), x2 = c(3L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L), x3 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L ), x4 = c(1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), x5 = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 1L, 2L, 3L)), .Names = c(x1, x2, x3, x4, x5), row.names = c(NA, 18L), class = data.frame) reg - lm( Y ~ factor(X$x1) + factor(X$x2) + factor(X$x3) + factor(X$x4) + factor(X$x5) ) coef(reg) and e.g. I get two coefficients for variable x1 (3-levels variable) but I would like to get the third. Of course I can calculate a3= -(a1+a2) where a1 and a2 are coefficients of the variable x1. I hope that I manage to explain my problem. Robert 2011/6/12 Prof Brian Ripley : ?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NLS fit for exponential distribution
Hi: If you use RStudio, then you can use its manipulate package to figure out starting values for the model visually through sliders. Walmes Zeviani posted the template of how to do this last week; I've just adapted it for the models under consideration. It's interesting to play with this because it shows how strongly some of the parameters are correlated with one another...and it's fun :) Thanks to Walmes for the excellent tutorial example. Firstly, x and y (or the inputs in general) need to be in the global environment as vectors of the same length. [RStudio is a GUI, so I'm assuming that you run things from its command line.] Load the manipulate package (which comes with RStudio) and use the manipulate() function to create a plot of the data and fit a curve to it. The sliders adjust the parameter values and you play with them until the curve fits closely to the data. The range of the sliders should match the range of plausible values you think they might take. The output of the manipulate() function is a vector of starting values that you can pass into nls(), which is done by closing the window containing the sliders. # Model 1: x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) plot(x, y) # Initial plot of the data start - list() # Initialize an empty list for the starting values library(manipulate) manipulate( { plot(x, y) k - kk; b0 - b00; b1 - b10 curve(k*exp(-b1*x) + b0, add=TRUE) start - list(k=k, b0=b0, b1=b1) }, kk=slider(0.01, 0.2, step = 0.01, initial = 0.1), b10=slider(0.01, 0.4, step = 0.01, initial = 0.3), b00=slider(-0.01, -0.001, step=0.001,initial= -0.005)) # When done, start() is a list of named parameters in # the global environment # Model fit: fit1 - nls(y ~ k*exp(-b1*x) + b0, start = start) summary(fit1) ### Model 2: [following Peter Dalgaard's suggested model] ### Use the estimates from fit1 and shrink b1 to anticipate ### potential effect of b2; the initial estimate in the slider for ### b1 will be too big, so it needs to be shrunk (a lot :) start - list() manipulate( { plot(x, y) k - kk; b0 - b00; b1 - b10; b2 - b20 curve(k*(exp(-b1*x) + exp(-b2*x)) + b0, add=TRUE) start - list(k=k, b0=b0, b1=b1, b2 = b2) }, kk=slider(0.01, 0.1, step = 0.01, initial = 0.04), b10=slider(0.01, 0.4, step = 0.01, initial = 0.3), b20 = slider(0.01, 0.1, step = 0.005, initial = 0.05), b00=slider(-0.01, -0.001, step=0.001,initial= -0.004)) fit2 - nls(y ~ k*(exp(-b1*x) + exp(-b2*x)) + b0, start = start) summary(fit2) anova(fit1, fit2) HTH, Dennis On Sun, Jun 12, 2011 at 9:57 AM, Diviya Smith diviya.sm...@gmail.com wrote: Hello there, I am trying to fit an exponential fit using Least squares to some data. #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) sub - data.frame(x,y) #If model is y = a*exp(-x) + b then fit - nls(y ~ a*exp(-x) + b, data = sub, start = list(a = 0, b = 0), trace = TRUE) This works well. However, if I want to fit the model : y = a*exp(-mx)+c then I try - fit - nls(y ~ a*exp(-m*x) + b, data = sub, start = list(a = 0, b = 0, m= 0), trace = TRUE) It fails and I get the following error - Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates Any suggestions how I can fix this? Also next I want to try to fit a sum of 2 exponentials to this data. So the new model would be y = a*exp[(-m1+ m2)*x]+c . Any suggestion how I can do this... Any help would be most appreciated. Thanks in advance. Diviya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sorting Dataframes
Thanks very much. I did it using the do.call and compared it to my loop method and the results were exactly the same. In future i think i will use this function - a lot less typing and time consuming! -- View this message in context: http://r.789695.n4.nabble.com/Sorting-Dataframes-tp3580075p3592774.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pointers in R
Lists are recursive and heterogenous in R. Just assign the values to elements in lists and assign those lists to elements in other lists to build your tree. --- Jeff Newmiller The . . Go Live... DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Jaimin Dave davejaim...@gmail.com wrote: Hello Everyone, I am new to R and would like to create a quad tree in R. However the problem is that I don't think R has pointers. Is there any way to create a tree in R? Thanks [[alternative HTML version deleted]] _ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NLS fit for exponential distribution
Thank you for very informative answers. This is great help. Peter, thanks for correcting the model. That was exactly what I meant - apologies for the typo. I was able to run the analysis on some simulated data and the subset of data that I had posted earlier. However, when I apply the analysis to full dataset, I usually get the following errors- fm2 - nls(y ~ b + a * (exp(m1 * -x) + exp(m2 * -x)), start = list(a = 1, b = 0, m1 = 0.1, m2 = 0.5), trace = TRUE) 84.42045 : 1.0 0.0 0.1 0.5 0.1593364 : 0.072399953 -0.000226868 0.101135423 0.571436093 0.09032837 : 7.897930e-02 7.905359e-06 1.271435e-01 1.872206e+00 0.06834495 : 0.0808842841 0.0005815865 0.1725302473 4.9048478439 Error in numericDeriv(form[[3L]], names(ind), env) : Missing value or an infinity produced when evaluating the model OR Error in nls(y ~ b + a * (exp(m1 * -x) + exp(m2 * -x)), start = list(a = 1, : number of iterations exceeded maximum of 50 Is there a way to optimize the start values? I was unable to try the RStudio solution as RStudio keeps crashing on my computer. Also is there any way to set the start value in NLS - I mean if I want to fit the model to only a subset of data starting at x = 10, how can I specify that? Thanks, Diviya On Sun, Jun 12, 2011 at 7:19 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: If you use RStudio, then you can use its manipulate package to figure out starting values for the model visually through sliders. Walmes Zeviani posted the template of how to do this last week; I've just adapted it for the models under consideration. It's interesting to play with this because it shows how strongly some of the parameters are correlated with one another...and it's fun :) Thanks to Walmes for the excellent tutorial example. Firstly, x and y (or the inputs in general) need to be in the global environment as vectors of the same length. [RStudio is a GUI, so I'm assuming that you run things from its command line.] Load the manipulate package (which comes with RStudio) and use the manipulate() function to create a plot of the data and fit a curve to it. The sliders adjust the parameter values and you play with them until the curve fits closely to the data. The range of the sliders should match the range of plausible values you think they might take. The output of the manipulate() function is a vector of starting values that you can pass into nls(), which is done by closing the window containing the sliders. # Model 1: x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) plot(x, y) # Initial plot of the data start - list() # Initialize an empty list for the starting values library(manipulate) manipulate( { plot(x, y) k - kk; b0 - b00; b1 - b10 curve(k*exp(-b1*x) + b0, add=TRUE) start - list(k=k, b0=b0, b1=b1) }, kk=slider(0.01, 0.2, step = 0.01, initial = 0.1), b10=slider(0.01, 0.4, step = 0.01, initial = 0.3), b00=slider(-0.01, -0.001, step=0.001,initial= -0.005)) # When done, start() is a list of named parameters in # the global environment # Model fit: fit1 - nls(y ~ k*exp(-b1*x) + b0, start = start) summary(fit1) ### Model 2: [following Peter Dalgaard's suggested model] ### Use the estimates from fit1 and shrink b1 to anticipate ### potential effect of b2; the initial estimate in the slider for ### b1 will be too big, so it needs to be shrunk (a lot :) start - list() manipulate( { plot(x, y) k - kk; b0 - b00; b1 - b10; b2 - b20 curve(k*(exp(-b1*x) + exp(-b2*x)) + b0, add=TRUE) start - list(k=k, b0=b0, b1=b1, b2 = b2) }, kk=slider(0.01, 0.1, step = 0.01, initial = 0.04), b10=slider(0.01, 0.4, step = 0.01, initial = 0.3), b20 = slider(0.01, 0.1, step = 0.005, initial = 0.05), b00=slider(-0.01, -0.001, step=0.001,initial= -0.004)) fit2 - nls(y ~ k*(exp(-b1*x) + exp(-b2*x)) + b0, start = start) summary(fit2) anova(fit1, fit2) HTH, Dennis On Sun, Jun 12, 2011 at 9:57 AM, Diviya Smith diviya.sm...@gmail.com wrote: Hello there, I am trying to fit an exponential fit using Least squares to some data. #data x - c(1 ,10, 20, 30, 40, 50, 60, 70, 80, 90, 100) y - c(0.033823, 0.014779, 0.004698, 0.001584, -0.002017, -0.003436, -0.06, -0.004626, -0.004626, -0.004626, -0.004626) sub - data.frame(x,y) #If model is y = a*exp(-x) + b then fit - nls(y ~ a*exp(-x) + b, data = sub, start = list(a = 0, b = 0), trace = TRUE) This works well. However, if I want to fit the model : y = a*exp(-mx)+c then I try - fit - nls(y ~ a*exp(-m*x) + b, data = sub, start = list(a = 0, b = 0, m= 0), trace = TRUE) It fails and I get the following error - Error in
Re: [R] Side by side scatter plots with specified regression lines
Dear Sigrid, At 12:46 PM -0700 6/12/11, Sigrid wrote: I am new and self taught in R, so please bear with me. I want to create two scatter plots side by side. The data set includes measurements from two different countries with 7 treatments over a timeline (x-axis). Problem 1 I want to have each plot to include the data from one of the countries with 7 regression lines of the treatments, but I do no know how to divide the data between them. This is how I created one plot with all the data. plot(YEAR,YIELD,col=red,xlab=Year,ylab=Yield,xlim=c(1,4),ylim=c(1,150)) Problem 2 The models I've found to describe the regression lines of the treatments seems to be different than the default ablines that R creates. I have the values of the exact values of intercepts and slopes, but does not know how to add them to the graph. This is what I got so far. abline(lm(YIELD[TREATMENT==A]~YEAR[TREATMENT==A]),lty=2,col=1) I hope this is enough to give me some pointers, otherwise I will try to elaborate. Thank you for your help. Here is an example that might help: library(psych) #in order to get the sat.act data set my.data - sat.act with(my.data,plot(SATV~SATQ, col=c(blue,red)[gender])) by(my.data,my.data$education, function(x) abline (lm(SATV~SATQ,data=x), lty=c(solid, dashed, dotted, dotdash, longdash, twodash)[(x$education+1)])) #to make two scatter plots side by side, use op - par(mfrow=c(1,2)) I hope this helps. Bill -- View this message in context: http://r.789695.n4.nabble.com/Side-by-side-scatter-plots-with-specified-regression-lines-tp3592473p3592473.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Somers Dyx
Hello R Community, I'm continuing to work through logistic regression (thanks for all the help on score test) and have come up against a new opposition. I'm trying to compute Somers Dyx as some suggest this is the preferred method to Somers Dxy (Demaris, 1992). I have searchered the [R] archieves to no avail for a function or code to compute Dyx (not Dxy). The overview of Hmisc has mention of Dyx for the rcorr.cens function but this appears to be a misprint because the manual states the function finds Dxy. Peng and So (1998) state that the Dyx is easily calculated in SAS (which tells me the same is possible for [R]). Yang, K., Miller, G. J., Miller, G. state that: (Tau-b)^2=Somers Dxy * Somers Dyx so maybe an approach would be to write a function that is: Somers Dyx-(Tau-b)^2/Somers Dxy I just don't want to waste time if this is incorrect logic and/or there's an easier way to calculate this thing; perhaps theres a golden function already created in an [R] package that I'm overlooking. Thanks in advance, Tyler [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] computer name
Is there an r function that will be able to identify the computer the code is running on? I have some common code that I run on several computers and each has a database with a different server name - although the content is identical. I need to set thisServer depending on which machine the code is running on... something like... if(pcname = pc1) thisServer = 'SERVER1' if(pcname = pc2) thisServer = 'SERVER2' conn - odbcDriverConnect(driver=SQL Server;database=x;server=thisServer;) ...rest of code will now run OK. I know I could set the DSN names the same and use... conn - odbcConnect(commonDSNname) but I was wondering if there was another way -- View this message in context: http://r.789695.n4.nabble.com/computer-name-tp3593120p3593120.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] automatically generate the output name of my for loops
?paste something like... paste (group, i, sep=_) jiliguala wrote: Hello R users, I am new to R and am having difficulty with the output name of my for loops. here is the problem: for (i in c(1:100)) { the name of the groups - which(k1$cluster==i) } how can it automatically generate the name for 100 cluster(just like group_1, group_2...)? what should i put in the bold letter place? really thank you for helping me Daniel -- View this message in context: http://r.789695.n4.nabble.com/automatically-generate-the-output-name-of-my-for-loops-tp3592160p3593124.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] automatically generate the output name of my for loops
?paste something like... paste (group, i, sep=_) jiliguala wrote: Hello R users, I am new to R and am having difficulty with the output name of my for loops. here is the problem: for (i in c(1:100)) { the name of the groups - which(k1$cluster==i) } how can it automatically generate the name for 100 cluster(just like group_1, group_2...)? what should i put in the bold letter place? really thank you for helping me Daniel -- View this message in context: http://r.789695.n4.nabble.com/automatically-generate-the-output-name-of-my-for-loops-tp3592160p3593123.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R graphs differ from exported one
Raptorista wrote: Now, the graph that appears is very nice: indeed it has a title, two axes with their labels and all the rest; but when I give commands postscript(file=plot.eps, onefile=FALSE) qqnorm (col) to save the graph to a file plot.eps to include it in a TeX, the file created has nothing to do with the former one: it only has the graph part, i.e. no title, no labels, no axes I use R under Windows, and I've seen the same sort of thing. I usually save graphs as PDF or PNG files, which works fine, but on the rare occasions I've tried to save graphs as Postscript, some of the graphs end up saving with bits missing. -- View this message in context: http://r.789695.n4.nabble.com/R-graphs-differ-from-exported-one-tp3592553p3592915.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] computer name
Not exactly R, but how about pcname - system('uname -n',intern=T) Tom On Sun, Jun 12, 2011 at 11:19 PM, pdb ph...@philbrierley.com wrote: Is there an r function that will be able to identify the computer the code is running on? I have some common code that I run on several computers and each has a database with a different server name - although the content is identical. I need to set thisServer depending on which machine the code is running on... something like... if(pcname = pc1) thisServer = 'SERVER1' if(pcname = pc2) thisServer = 'SERVER2' conn - odbcDriverConnect(driver=SQL Server;database=x;server=thisServer;) ...rest of code will now run OK. I know I could set the DSN names the same and use... conn - odbcConnect(commonDSNname) but I was wondering if there was another way -- View this message in context: http://r.789695.n4.nabble.com/computer-name-tp3593120p3593120.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear model - coefficients
Hi Weidong, thank you very much. It really works fine. Robert 2011/6/12 Weidong Gu anopheles...@gmail.com: this may work. X-data.frame(sapply(X,function(x) as.factor(x))) reg3=lm(Y~.,data=X) dummy.coef(reg3) Weidong Gu On Sun, Jun 12, 2011 at 4:55 PM, Robert Ruser robert.ru...@gmail.com wrote: Hi, but I want to get the coefficients for every variables from x1 to x5. (x1 was an example) Robert 2011/6/12 Jorge Ivan Velez jorgeivanve...@gmail.com: Hi Robert, Try this: reg2 - lm( Y ~ factor(x1) + factor(x2) + factor(x3) + factor(x4) + factor(x5) - 1, data = X ) cof(ref2) HTH, Jorge On Sun, Jun 12, 2011 at 4:40 PM, Robert Ruser wrote: Prof. Ripley, thank you very much for the answer but wanted to get something else. There is an example and an explanation: options(contrasts=c(contr.sum,contr.poly)) # contr.sum uses ‘sum to zero contrasts’ Y - c(6,3,5,2,3,1,1,6,6,6,7,4,1,6,6,6,6,1) X - structure(list(x1 = c(2L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 3L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L), x2 = c(3L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L), x3 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L ), x4 = c(1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), x5 = c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 1L, 1L, 1L, 2L, 3L)), .Names = c(x1, x2, x3, x4, x5), row.names = c(NA, 18L), class = data.frame) reg - lm( Y ~ factor(X$x1) + factor(X$x2) + factor(X$x3) + factor(X$x4) + factor(X$x5) ) coef(reg) and e.g. I get two coefficients for variable x1 (3-levels variable) but I would like to get the third. Of course I can calculate a3= -(a1+a2) where a1 and a2 are coefficients of the variable x1. I hope that I manage to explain my problem. Robert 2011/6/12 Prof Brian Ripley : ?dummy.coef (NB: 'R' does as you tell it, and if you ask for the default contrasts you get coefficients a2 and a3, not a1 and a2. So perhaps you did something else and failed to tell us? And see the comment in ?dummy.coef about treatment contrasts.) On Sun, 12 Jun 2011, Robert Ruser wrote: Dear R Users, Using lm() function with categorical variable R use contrasts. Let assume that I have one X independent variable with 3-levels. Because R estimate only 2 parameters ( e.g. a1, a2) the coef function returns only 2 estimators. Is there any function or trick to get another a3 values. I know that using contrast sum (?contr.sum) I could compute a3 = -(a1+a2). But I have many independent categorical variables and I'm looking for a fast solution. Robert __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] computer name
On 13/06/11 15:19, pdb wrote: Is there an r function that will be able to identify the computer the code is running on? I have some common code that I run on several computers and each has a database with a different server name - although the content is identical. I need to set thisServer depending on which machine the code is running on... something like... if(pcname = pc1) thisServer = 'SERVER1' if(pcname = pc2) thisServer = 'SERVER2' conn- odbcDriverConnect(driver=SQL Server;database=x;server=thisServer;) ...rest of code will now run OK. I know I could set the DSN names the same and use... conn- odbcConnect(commonDSNname) but I was wondering if there was another way -- View this message in context: http://r.789695.n4.nabble.com/computer-name-tp3593120p3593120.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Does Sys.info()[nodename] give you what you want? David Scott -- _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R graphs differ from exported one
Usually this happens when you forget to run dev.off(), as in that example. But we don't have the commented, minimal, self-contained, reproducible code. the posting guide and the footer of every R message asks for. On Sun, 12 Jun 2011, Mark Seeto wrote: Raptorista wrote: Now, the graph that appears is very nice: indeed it has a title, two axes with their labels and all the rest; but when I give commands postscript(file=plot.eps, onefile=FALSE) qqnorm (col) to save the graph to a file plot.eps to include it in a TeX, the file created has nothing to do with the former one: it only has the graph part, i.e. no title, no labels, no axes I use R under Windows, and I've seen the same sort of thing. I usually save graphs as PDF or PNG files, which works fine, but on the rare occasions I've tried to save graphs as Postscript, some of the graphs end up saving with bits missing. -- View this message in context: http://r.789695.n4.nabble.com/R-graphs-differ-from-exported-one-tp3592553p3592915.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.