[R] Significance test, Cronbach's Alpha
Hello: I am reviewing a paper at the moment where the author reports a Cronbach's Alpha and adds a significance test of the value. I was not aware that such a test exists. Does it? If so, how is it done in R?

Tom

--
Tom Backer Johnsen, Psychometrics Unit, Faculty of Psychology
University of Bergen, Christies gt. 12, N-5015 Bergen, NORWAY
Tel: +47-5558-9185 | Fax: +47-5558-9879
Email: [EMAIL PROTECTED] | URL: http://www.galton.uib.no/

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Functions starting with underscores
Use backticks:

    attr(`_foo`, "bar") <- "pow"

On 6/5/06, Joseph Wang [EMAIL PROTECTED] wrote:
> I'm having problems with functions starting with underscores.
>
>     `_foo` <- function(x) {1}
>
> seems to work, but I can't assign an attribute to this function:
>
>     attr("_foo", "bar") <- "pow"
>
> Any way of doing this? This is for a C++-to-R wrapping system, so I'd like to keep the C++ names, which start with underscores. (Please cc: responses to me personally.)
Re: [R] Functions starting with underscores
R does not allow a variable name to *start* with an underscore, so you must use backticks to refer to this non-syntactic name:

    > `_foo` <- function(x) 1
    > attr("_foo", "bar") <- "pow"
    Error: target of assignment expands to non-language object
    > attr(`_foo`, "bar") <- "pow"
    > "_foo"
    [1] "_foo"
    > `_foo`
    function(x) 1
    attr(,"bar")
    [1] "pow"

Best,
Philippe Grosjean

Joseph Wang wrote:
> I'm having problems with functions starting with underscores.
>
>     `_foo` <- function(x) {1}
>
> seems to work, but I can't assign an attribute to this function:
>
>     attr("_foo", "bar") <- "pow"
>
> Any way of doing this? This is for a C++-to-R wrapping system, so I'd like to keep the C++ names, which start with underscores. (Please cc: responses to me personally.)
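For completeness, the backtick approach generalizes to any non-syntactic name. A minimal sketch (the extra name `_baz` and the helper calls are illustrative, not from the thread):

```r
## Two equivalent ways to handle non-syntactic names such as `_foo`:
## backtick quoting, and assign()/get() with the name as a string.
`_foo` <- function(x) 1          # backticks quote the non-syntactic name
attr(`_foo`, "bar") <- "pow"     # works; attr("_foo", ...) targets a string

assign("_baz", function(x) 2 * x)  # string-based alternative
f <- get("_baz")
f(10)                              # -> 20
attributes(`_foo`)                 # shows the "bar" attribute
```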
Re: [R] understanding the verbose output in nlme
Thanks for the reply... will give some thought to your suggestion about stepping through the function. I have read the Pinheiro and Bates book; in fact it's my primary reference for getting into nonlinear mixed models with R. Lastly, w.r.t. the bit under subjects 1-6: I had thought about it being an estimated random effect, but in this model there are 2 random effects, so I'm not sure that holds... thanks again...

----- Original Message -----
From: Spencer Graves [EMAIL PROTECTED]
To: Greg Distiller [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Sunday, June 04, 2006 8:49 PM
Subject: Re: [R] understanding the verbose output in nlme

> I don't know, but if it were my question, I think I could find out by making local copies of the functions involved and stepping through the algorithm line by line using debug (see, e.g., http://finzi.psych.upenn.edu/R/Rhelp02a/archive/68215.html). Have you read Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus? If not, I encourage you to do so. Over the past 4 years or so, I've probably spent more time with this book and referred more people to it than any other. Doug Bates is a leading original contributor in this area, and I believe you will find this book well worth your money and your time.
>
> Regarding the numbers under subjectno1-6, I'm guessing that these may be the current estimates of the random effects for the first 6 of the 103 subjects. The purpose of verbose is NOT to dump everything, but only enough to help you evaluate whether the algorithm seems to be converging.
>
> Hope this helps.
> Spencer Graves
>
> Greg Distiller wrote:
>> Hi. I have found some postings referring to the fact that one can try to understand why a particular model is failing to solve/converge from the verbose output one can generate when fitting a nonlinear mixed model.
>> I am trying to understand this output and have not been able to find out much:
>>
>>     **Iteration 1
>>     LME step: Loglik: -237.4517, nlm iterations: 22
>>     reStruct parameters:
>>      subjectno1  subjectno2  subjectno3  subjectno4  subjectno5  subjectno6
>>     -0.87239181  2.75772772 -0.72892919 -10.36636391  0.55290322  0.09878685
>>     PNLS step: RSS = 60.50164
>>     fixed effects: 2.59129  0.00741764  0.57155
>>     iterations: 7
>>     Convergence: fixed     reStruct
>>                  5.740688  2.159285
>>
>> I know that the Loglik must refer to the value of the log-likelihood function, that the values after "fixed effects" are the parameter estimates, and that the bit after "Convergence" obviously has something to do with the convergence criteria for the fixed effects and the random-effects structure. I did manage to find a posting where somebody said that the reStruct parameter is the log of the relative precision of the random effects? The one thing that is a bit confusing to me is that it appears as if the fixed-effects convergence must be zero (or close to it), as one would expect, but in one of my converged models the output showed a reStruct value of 0.72? Then I have no idea what the numbers under subjectno1-6 are, especially as I have 103 subjects in the data! Can anyone help shed some light on this output and how it can be used to diagnose issues with a model?
>> Many thanks
>> Greg
Re: [R] Significance test, Cronbach's Alpha
This test can be meaningful, but only in rare circumstances (are the items (or raters) more consistent than what could be expected by chance?). I would do it with R using a bootstrap procedure (Efron & Tibshirani, An Introduction to the Bootstrap, p. 156).

Best regards,
Bruno

PS: I am not really a specialist...

Bruno Falissard
INSERM U669, PSIGIAM
Paris Sud Innovation Group in Adolescent Mental Health
Maison de Solenn, 97 Boulevard de Port Royal, 75679 Paris cedex 14, France
tel: (+33) 6 81 82 70 76 | fax: (+33) 1 45 59 34 18
web site: http://perso.wanadoo.fr/bruno.falissard/

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Tom Backer Johnsen
Sent: Monday, June 5, 2006 08:05
To: r-help@stat.math.ethz.ch
Subject: [R] Significance test, Cronbach's Alpha

> Hello: I am reviewing a paper at the moment where the author reports a Cronbach's Alpha and adds a significance test of the value. I was not aware that such a test exists. Does it? If so, how is it done in R?
> Tom
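A minimal sketch of such a bootstrap, assuming a subjects-by-items score matrix. The data here are simulated stand-ins and the replicate count is arbitrary:

```r
## Percentile-bootstrap confidence interval for Cronbach's alpha.
## `items` is fake illustrative data; in practice use your own score matrix.
cronbach_alpha <- function(X) {
  X <- as.matrix(X)
  k <- ncol(X)
  (k / (k - 1)) * (1 - sum(apply(X, 2, var)) / var(rowSums(X)))
}

set.seed(1)
items <- matrix(sample(1:5, 100 * 4, replace = TRUE), ncol = 4)

B <- 1000
boot_alpha <- replicate(B, {
  idx <- sample(nrow(items), replace = TRUE)  # resample subjects (rows)
  cronbach_alpha(items[idx, ])
})
quantile(boot_alpha, c(0.025, 0.975))  # rough 95% percentile interval
```

A null-hypothesis version (the "more consistent than chance" question) would instead permute each item column independently to break inter-item consistency before recomputing alpha.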
Re: [R] Problem with mle
Hi Ben

Thanks a lot for your help - it solved my problem and I understand a little bit more. Just one more question along this line: if I want to use another likelihood, e.g. negative binomial, I guess I have to use

    -sum(dnbinom(SumSeeds, size = ???, prob = ???, log = TRUE))

instead of

    -sum(dpois(SumSeeds, lambda = est, log = TRUE))

But what do I use for size and prob?

Rainer

--
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
Department of Conservation Ecology and Entomology, University of Stellenbosch
Matieland 7602, South Africa
Tel: +27 - (0)72 808 2975 (w) | Fax: +27 - (0)21 808 3304 | Cell: +27 - (0)83 9479 042
email: [EMAIL PROTECTED]
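Not an answer from the thread, but one common choice is dnbinom()'s alternative (size, mu) parameterisation, with mu the mean, which avoids thinking in terms of prob. A sketch under those assumptions (data simulated, variable names illustrative):

```r
## Fit a negative binomial by maximum likelihood with stats4::mle, using
## the (size, mu) parameterisation of dnbinom(). Parameters are optimised
## on the log scale to keep them positive.
library(stats4)

set.seed(42)
SumSeeds <- rnbinom(200, size = 2, mu = 10)  # fake count data

nll <- function(logsize, logmu) {
  -sum(dnbinom(SumSeeds, size = exp(logsize), mu = exp(logmu), log = TRUE))
}

fit <- mle(nll, start = list(logsize = 0, logmu = log(mean(SumSeeds))))
exp(coef(fit))  # size and mu back on the natural scale
```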
[R] Calculation of AIC BIC from mle
R 2.3.0, all packages up to date
Linux, SuSE 10.0

Hi

I want to calculate AIC or BIC from several results from mle calculation. I found the AIC function, but it does not seem to work with objects of class mle - if I execute the following:

    ml1 <- mle(...)
    AIC(ml1)

I get the following error message:

    Error in logLik(object) : no applicable method for "logLik"

Therefore I am using the following to calculate the AIC:

    # AICmle: calculate AIC from an mle object
    AICmle <- function(m, k = 2) {
      lL <- logLik(m)
      edf <- attr(lL, "df")
      LL <- lL[1]
      - 2 * LL + k * edf
    }

1) Why is AIC not working with objects of class mle - am I doing something wrong, is it a bug, or is it by design?
2) Just for confirmation - is my calculation of AIC correct?

Thanks

Rainer
Re: [R] How to call a value labels attribute?
>>>>> "HeinzT" == Heinz Tuechler [EMAIL PROTECTED] on Sun, 04 Jun 2006 13:07:22 +0100 writes:

HeinzT> At 14:12 03.06.2006 +0200, Martin Maechler wrote:
>> >>>>> "Heinz" == Heinz Tuechler [EMAIL PROTECTED] on Tue, 23 May 2006 01:17:21 +0100 writes:
>> Heinz> Dear All, after searching on CRAN I got the impression that there is no standard way in R to label values of a numerical variable.
>> Hmm, there's names(.) and names(.) <- .. Why are those not sufficient?
>>     x <- 1:3
>>     names(x) <- c("apple", "banana", NA)

HeinzT> Martin, I will consider this. For now I am using an attribute "value.labels" and a corresponding class to preserve this and other attributes after inclusion in a data.frame and indexing/subsetting, but using names should do as well. My idea was more like defining a set of value labels for a variable and applying it to the whole variable, as e.g. in the following _pseudocode_:

HeinzT>     ### not run -- pseudocode
HeinzT>     x <- c(1, 2, 3, 3, 2, 3, 1)
HeinzT>     value.labels(x) <- c(apple = 1, banana = 2, "NA" = 3)
HeinzT>     x
HeinzT>     ### desired result
HeinzT>      apple banana     NA     NA banana     NA  apple
HeinzT>          1      2      3      3      2      3      1
HeinzT>     value.labels(x) <- c(Apfel = 1, Banane = 2, Birne = 3)  # redefine labels
HeinzT>     x
HeinzT>     ### desired result
HeinzT>      Apfel Banane  Birne  Birne Banane  Birne  Apfel
HeinzT>          1      2      3      3      2      3      1
HeinzT>     value.labels(x)  # inspect labels
HeinzT>     ### desired result
HeinzT>     Apfel Banane Birne
HeinzT>         1      2     3

HeinzT> These value.labels should persist even after inclusion in a data.frame and after indexing/subsetting.

As far as I can see, factor()s and their levels/labels provide all that.

HeinzT> I did not yet try your idea concerning these aspects, but I will. My final goal is to do all the data handling on numerically coded variables and to transform to factors on the fly when needed for statistical procedures. Given the presence of value.labels, a factor function could use them for the conversion.
HeinzT> I described my motivation for all this in a previous post, titled:
HeinzT> "How to represent a metric categorical variable?"
HeinzT> There was no response at all and I wonder if this is such a rare problem.

Probably. I don't see why this can't be done by (ordered) factors. But I'm really not particularly expert nor interested here; I just wanted to make sure you didn't overlook the obvious.

Martin

HeinzT> Thanks, Heinz

Heinz> Since this would be useful for me I intend to create such an attribute, at the moment for my personal use. Still I would like to choose a name which does not conflict with names of commonly used attributes. Would "value.labels" or "vallabs" create conflicts? The attribute should be structured as a data.frame with two columns, "levels" (numeric) and "labels" (character). These could then also be used to transform from numeric to factor. If the attribute is copied to the factor variable it could also serve to retransform the factor to the original numerical variable.
Heinz> Comments? Ideas?
Heinz> Thanks, Heinz Tüchler
[R] [R-pkgs] new package QCAGUI
Dear list members,

I'm pleased to let you know there's a new package on CRAN called QCAGUI, a graphical user interface for the QCA package. This is a stripped-down version of John Fox's Rcmdr package, plus a couple of menus for QCA. My thanks to John Fox for his encouragement and advice.

Regards,
Adrian

--
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd, 050025 Bucharest sector 5, Romania
Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int. 101
Re: [R] Significance test, Cronbach's Alpha
Tom Backer Johnsen wrote:
> Hello: I am reviewing a paper at the moment where the author reports a Cronbach's Alpha and adds a significance test of the value. I was not aware that such a test exists. Does it? If so, how is it done in R?

Hi Tom,

This may be due to the fact that some interpret Cronbach's alpha as a correlation between items, thus encouraging the unwary to assume that the probability of a numerically equivalent correlation can be used as a test of significance.

Jim
Re: [R] How to call a value labels attribute?
Dimitris, thank you. I have to consider all the responses to this question, and then your functions may prove to be useful.

Heinz

At 17:01 04.06.2006 +0200, Dimitrios Rizopoulos wrote:
> maybe you could consider something like the following:
>
>     varlabs <- function(x) {
>       if (is.null(names(x))) NULL else x[!duplicated(x)]
>     }
>     "varlabs<-" <- function(x, value) {
>       names(x) <- names(value[x])
>       x
>     }
>     ### Example
>     x <- c(1, 2, 3, 3, 2, 3, 1)
>     x
>     varlabs(x)
>     varlabs(x) <- c(apple = 1, banana = 2, "NA" = 3)
>     x
>     varlabs(x)
>     varlabs(x) <- c(Apfel = 1, Banane = 2, Birne = 3)
>     x
>     varlabs(x)
>
> I hope it helps.
>
> Best,
> Dimitris
>
> Dimitris Rizopoulos, Ph.D. Student
> Biostatistical Centre, School of Public Health, Catholic University of Leuven
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/(0)16/336899 | Fax: +32/(0)16/337015
> Web: http://med.kuleuven.be/biostat/
>      http://www.student.kuleuven.be/~m0390867/dimitris.htm
>
> [quoted earlier messages in this thread snipped]
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Re: [R] How to call a value labels attribute?
Richard, Martin,

the example is not ideal, I see. I was starting from the question of how to represent a "metric categorical variable". By "metric categorical variable" I mean a variable which has only few distinct values and an inherent metric. An example would be a risk score which classifies patients into several groups like low, intermediate, high, extreme, with corresponding risk estimates of 0, 1, 2, 5.5. Other examples could be items in a questionnaire. These items often have numerical values that may range from -5 to 5. In some cases, like tables and box-plots, these scores/items should be treated like a factor (with labelled values); in other cases, like Cox regression or when forming an overall score, they should be treated like numeric variables. I was asking for a convenient way to represent a variable like this, but there was no response. The crucial point is that the variables should retain both their numerical values and their value labels. Without value labels they could be defined as a factor and used either directly or via as.numeric(), because the levels still represent the numerical values, but as soon as labels are used, the original numerical values get lost.

Thanks,
Heinz Tüchler

At 12:12 04.06.2006 -0400, Richard M. Heiberger wrote:
> How is what you are doing any different from factors?
>     > x <- factor(c(1, 2, 3, 3, 2, 3, 1), labels = c("apple", "banana", "other"))
>     > x
>     [1] apple  banana other  other  banana other  apple
>     Levels: apple banana other
>     > as.numeric(x)
>     [1] 1 2 3 3 2 3 1
>     > levels(x)[3] <- "birne"
>     > x
>     [1] apple  banana birne  birne  banana birne  apple
>     Levels: apple banana birne
>
> [quoted pseudocode from the original message snipped]
Re: [R] nested design
Jorn,

For your model,

    model <- lme(Biomass ~ Age, random = ~ 1 | Age/Stand)

think about nesting Age in Stand (doesn't that make more sense, anyway?). If you're lucky the NaNs will disappear. So, do

    model <- lme(Biomass ~ Age, random = ~ 1 | Stand/Age)

I've had a similar problem with unbalanced data when I tried to nest Season %in% Site (which is what I wanted), with Site as a fixed effect as well. Then, like you, I got NaNs for my Site(s). Changing the nesting to Site %in% Season fixed the problem. However, the random effects are now being specified somewhat differently. Hopefully a statistician will come in and tell you and me quite what the difference is, because I get lost in the logic of it. Note that lmer() copes with the problematic Season %in% Site nesting without problems, using the same dataset.

Regards,
Mark

Mark Difford, Ph.D. candidate, Botany Department, Nelson Mandela Metropolitan University, Port Elizabeth, SA.
Re: [R] strange (to me) ncolumns default value in 'write'
>>>>> "AntonioF" == Antonio, Fabio Di Narzo [EMAIL PROTECTED] on Sun, 4 Jun 2006 18:42:35 +0200 writes:

AntonioF> Hi all. In the 'write' documentation, one can read:
AntonioF>     write(x, file = "data",
AntonioF>           ncolumns = if(is.character(x)) 1 else 5,
AntonioF>           append = FALSE, sep = " ")
AntonioF> Now my question is: why the default value of 1 column for character vectors and the special value '5' for non-character vectors?

Well, most of us have five fingers per hand, so many consider '5' to be a 'nice' number :-)
For R, the main reason for using this default was that S already had it. And S has had it not only for the above reason, but I guess also because, nicely, 5 * 16 = 80, and 80 used to be the customary terminal width in the 70s and 80s, i.e., the good old times when S was created.

AntonioF> Nice Sunday to all, and Pentecost Monday, for those with an extra free day like me.

Martin Maechler, ETH Zurich

AntonioF> Antonio, Fabio Di Narzo.
AntonioF> Note: this is just curiosity. Don't read it as "WHY? WHY!?!?", but just as "why?"
Re: [R] slanted ends of horizontal lines for certain line widths
>>>>> "Charles" == Charles Annis, P.E. [EMAIL PROTECTED] on Sun, 4 Jun 2006 11:43:23 -0400 writes:

Charles> I think you want to change par()$lend
Charles> Type
Charles>     par()
Charles> to see the defaults.

Actually, that's one place where using str(.) is particularly beneficial. Use

    str(par())

Charles> [...]
Re: [R] Calculation of AIC BIC from mle
>>>>> "Rainer" == Rainer M KRug [EMAIL PROTECTED] on Mon, 05 Jun 2006 10:22:02 +0200 writes:

Rainer> R 2.3.0, all packages up to date; Linux, SuSE 10.0
Rainer> I want to calculate AIC or BIC from several results from mle calculation. I found the AIC function, but it does not seem to work with objects of class mle - if I execute the following:
Rainer>     ml1 <- mle(...)
Rainer>     AIC(ml1)
Rainer> I get the following error message:
Rainer>     Error in logLik(object) : no applicable method for "logLik"

which is really not helpful as an error message, since it is *wrong*: there is a logLik(.) method for mle objects, and it even works. There's some embarrassing bug here, since if you quickly look at the 'stats4' package source, it's very clear that it was designed to have AIC(), BIC(), and logLik() all working, but only the last one does.

Rainer> Therefore I am using the following to calculate the AIC:
Rainer>     # AICmle: calculate AIC from an mle object
Rainer>     AICmle <- function(m, k = 2) {
Rainer>       lL <- logLik(m)
Rainer>       edf <- attr(lL, "df")
Rainer>       LL <- lL[1]
Rainer>       - 2 * LL + k * edf
Rainer>     }
Rainer> 1) Why is AIC not working with objects of class mle - am I doing something wrong, is it a bug, or is it by design?

A bug.

Rainer> 2) Just for confirmation - is my calculation of AIC correct?

It looks so, but I didn't check.

Martin
[R] random sample from a file
Dear All,

Thanks to all who responded to my earlier query. I have a file with two columns: an ID field and a Data field. I wish to extract multiple random samples of 20 numbers from this file in such a way that the ID field is extracted with the data as well. At the moment I am using the 'sample' function, but this only extracts the data without the IDs. The file is a comma-separated text file and I read it in using read.csv. Could you please tell me if there is a way of doing this such that every random sample contains both the ID and the data?

Thanks
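The usual fix is to sample row *indices* and subset the data frame, so each draw keeps the ID column with its data. A sketch with made-up data (the file and column names are assumptions):

```r
## Sample whole rows, not just the data column, so IDs travel along.
df <- data.frame(ID = sprintf("id%03d", 1:100), Data = rnorm(100))
## in practice: df <- read.csv("myfile.csv")

one_sample <- df[sample(nrow(df), 20), ]  # one random sample of 20 rows

## several independent samples, kept in a list
samples <- replicate(5, df[sample(nrow(df), 20), ], simplify = FALSE)
```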
Re: [R] Significance test, Cronbach's Alpha
Jim Lemon wrote:
> Tom Backer Johnsen wrote:
>> Hello: I am reviewing a paper at the moment where the author reports a Cronbach's Alpha and adds a significance test of the value. I was not aware that such a test exists. Does it? If so, how is it done in R?
>
> Hi Tom,
> This may be due to the fact that some interpret Cronbach's alpha as a correlation between items, thus encouraging the unwary to assume that the probability of a numerically equivalent correlation can be used as a test of significance.

That is exactly what I suspect. Actually, it is not too far off, but definitely not the same.

Tom
Re: [R] Calculation of AIC BIC from mle
>>>>> "MM" == Martin Maechler [EMAIL PROTECTED] on Mon, 5 Jun 2006 12:04:32 +0200 writes:

MM> [quoted question snipped]
MM> which is really not helpful as an error message, since it is *wrong*: there is a logLik(.) method for mle objects, and it even works.

Hence -- as I forgot to mention in my last e-mail -- for now, you can just use

    AIC(logLik(ml1))                 ## AIC
    AIC(logLik(ml1), k = log(nobs))  ## BIC

where you have to fill in nobs yourself.

Martin

MM> There's some embarrassing bug here, since if you quickly look at the 'stats4' package source, it's very clear that it was designed to have AIC(), BIC(), and logLik() all working, but only the last one does.
Re: [R] Calculation of AIC BIC from mle
You are mixing S3 and S4 classes and generics here. AIC(logLik(ml1)) will work. This is a namespace issue: stats4 contains a version of logLik, but the default method for AIC is from stats, and so dispatches on stats::logLik and not stats4::logLik. There is something fundamentally unsatisfactory about converting S3 generics to S4 generics where namespaces are involved. I have put a workaround in R-devel.

On Mon, 5 Jun 2006, Rainer M KRug wrote:
> [setup and AICmle function snipped]
> 2) Just for confirmation - is my calculation of AIC correct?

Not quite. The correct function is

    > stats:::AIC.logLik
    function (object, ..., k = 2)
    -2 * c(object) + k * attr(object, "df")
    <environment: namespace:stats>

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA)
1 South Parks Road, Oxford OX1 3TG, UK; Fax: +44 1865 272595
Re: [R] plot with different color
array chip wrote:
> Hi, how can I plot a series of numbers as a line, but with lines above a threshold in one color and lines below the threshold in another color? For example, a numeric vector rnorm(100): plot these numbers in the original order, but with lines above the horizontal line at 0 in red and lines below in blue?

Hi array (or is it, Hi chip?),

Is this what you want?

    x <- rnorm(100)
    x11(width = 7, height = 7)
    oldpar <- par(no.readonly = TRUE)
    xmai <- par("mai")
    par(yaxs = "i")
    plot(x, type = "n", ylim = c(-3, 3), xlab = "Observation", ylab = "Value")
    par(mai = c(3.5, xmai[2:4]), yaxs = "i", new = TRUE)
    plot(x, type = "l", col = "blue", ylim = c(0, 3), axes = FALSE, xlab = "", ylab = "")
    par(mai = c(xmai[1:2], 3.5, xmai[4]), new = TRUE)
    plot(x, type = "l", col = "red", ylim = c(-3, 0), axes = FALSE, xlab = "", ylab = "")
    par(oldpar)

This _is_ an ugly kludge.

Jim
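An alternative sketch, not from the thread: draw each piece of the line with segments() and colour it by the sign of its endpoints. Segments that cross zero get the "below" colour here; splitting them exactly at the crossing would need interpolating the zero crossing.

```r
## Colour each line segment by whether both endpoints lie above the threshold.
set.seed(1)
x <- rnorm(100)
n <- length(x)
plot(x, type = "n", xlab = "Observation", ylab = "Value")
abline(h = 0, lty = 2)                         # the threshold
cols <- ifelse(x[-n] > 0 & x[-1] > 0, "red", "blue")
segments(1:(n - 1), x[-n], 2:n, x[-1], col = cols)
```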
Re: [R] significance test and cronbach
Erin Hodgess wrote, replying to Tom Backer Johnsen's question ([R] Significance test, Cronbach's Alpha):

> Hi Tom! There is a package in R called psy. One of the functions is cronbach.

Yes, and that function computes Cronbach's Alpha. I don't know about psychology, so I don't really know how it is all constructed. That is a standard procedure, and rather easy to write in R. My problem is different and has to do with a significance test of the value.

Tom
Re: [R] FW: How to create a new package?
Hi,

Thanks very much for your help. I have already created a test package with the package.skeleton() function and the procedures in the article "R Help Desk: Make `R CMD' Work under Windows" - R News 5 (2), 27-28. Now, my question is: how can I install this package on other PCs without installing Perl and the remaining components there?

Thanks,
Rita Sousa
DME - ME: Departamento de Metodologia Estatística - Métodos Estatísticos; INE - DRP: Instituto Nacional de Estatística - Delegação Regional do Porto. Tel.: 22 6072016 (Extensão: 4116)

-----Original Message----- From: Uwe Ligges; Sent: Thursday, 1 June 2006 12:43; To: michael watson (IAH-C); Cc: Gabor Grothendieck; Rita Sousa; r-help@stat.math.ethz.ch; Subject: Re: [R] FW: How to create a new package?

michael watson (IAH-C) wrote:
> ?package.skeleton

Folks, please! Rita Sousa told you she already has a DESCRIPTION file. Obviously, Rita forgot to build and *install* the package using R CMD build and R CMD INSTALL (the Meta directory is created during the installation procedure). Please note that using the Windows GUI you can only install binary packages, but so far you have a source package. Hence you need to install from the Windows command shell using R CMD INSTALL. Please see the "R Installation and Administration" manual on how to install packages. For some examples of what is mentioned in that manual (how to set things up under Windows), you might additionally want to take a look at the article "R Help Desk: Make `R CMD' Work under Windows - an Example" in R News 5 (2), 27-28.

Best, Uwe Ligges

-----Original Message----- From: Gabor Grothendieck; Sent: 01 June 2006 12:20; To: Rita Sousa; Cc: r-help@stat.math.ethz.ch; Subject: Re: [R] FW: How to create a new package?

The minimum is to create a DESCRIPTION file, plus R and man directories containing R code and .Rd files respectively.
It might help to run Rcmd CHECK mypkg before installation and fix any problems it finds. Googling for "creating R package" will locate some tutorials.

On 6/1/06, Rita Sousa wrote:
> Hi, I have a group of functions and I would like to create a package to load into R. I have created a directory named INE, and a directory below that named R for the files of R functions. I have created the files DESCRIPTION and INDEX in the INE directory. The installation from local zip files in R 2.3.0 succeeds, but when loading the package I get an error like: 'INE' is not a valid package -- installed < 2.0.0? I think it is necessary to create a Meta directory with a package.rds file, but I don't know how to make it! I have read the manual 'Writing R Extensions - 1. Creating R packages', but I don't understand the procedure... Can I create it automatically? Could you help me with this?
>
> Thanks,
> Rita Sousa
Re: [R] Significance test, Cronbach's Alpha
falissard wrote:
> This test can be meaningful, but only in rare circumstances (are the items (or raters) more consistent than what could be expected by chance?).

Do you have a reference to such a test?

> I would do it with R using a bootstrap procedure (Efron & Tibshirani, An Introduction to the Bootstrap, p. 156).

I agree. That should be possible, actually quite simple: run through the data a large number of times, randomize the sequence of the observations for each case, and compute alpha for each run.

Tom
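A bootstrap interval along those lines could be sketched as follows; the psy and boot packages are assumed to be installed, and the item data frame is invented toy data:

```r
# Sketch: percentile bootstrap interval for Cronbach's alpha,
# resampling cases (rows) with replacement.
library(psy)    # for cronbach()
library(boot)

alpha.fun <- function(data, idx) cronbach(data[idx, ])$alpha

set.seed(42)
items <- as.data.frame(matrix(rnorm(100 * 5), ncol = 5))  # toy items
b <- boot(items, alpha.fun, R = 999)
boot.ci(b, type = "perc")   # percentile interval for alpha
```

Note this gives a confidence interval for alpha rather than a null-hypothesis test; a permutation scheme as described above would address the "consistent beyond chance" question directly.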
Re: [R] Significance test, Cronbach's Alpha
On 06/05/06 19:19, Jim Lemon wrote: Tom Backer Johnsen wrote: Hello: I am reviewing a paper at the moment where the author reports a Cronbach's Alpha and adds a significance test of the value. I was not aware that such a test exists. Does it? If so, how is it done in R? Hi Tom, This may be due to the fact that some interpret Cronbach's alpha as a correlation between items, thus encouraging the unwary to assume that the probability of a numerically equivalent correlation can be used as a test of significance. SPSS has a test for alpha. I don't know what it does. It also seems to me that, if the assumptions of alpha are met, then the assumptions of analysis of variance are also met. In particular, alpha equals the reliability if the measurements (items) are parallel (Lord and Novick, Statistical theories of mental test scores, 1968, pp. 47 and 90 in particular). That means (among other things) that they have equal true variances. If this can be assumed, then you can do an ANOVA using items and subjects, and look for a significant effect of subjects. If the measures are not parallel, then alpha is a lower bound on the reliability of the test, so an ANOVA might be conservative, but I have not thought this through. It is rare to see anyone report a test for alpha because it is usually used descriptively. If it isn't .7 or higher, people get upset, yet even .5 would be wildly significant in most cases. Jon -- Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Editor: Judgment and Decision Making (http://journal.sjdm.org) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] FW: How to create a new package?
R CMD INSTALL --build produces a binary package in a zip file. See the `R Installation and Administration' manual.

On Mon, 5 Jun 2006, Rita Sousa wrote:
> Hi, Thanks very much for your help. I have already created a test package with the package.skeleton() function and the procedures in the article "R Help Desk: Make R CMD work under Windows" - R News 5 (2), 27-28. Now, my question is: How can I install this package on other PCs without installing Perl and the remaining components there? Thanks, Rita Sousa

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595; 1 South Parks Road, Oxford OX1 3TG, UK
[R] making matrix monotonous
Dear all,

I'm currently working on a correlation matrix. The function image() is quite useful for visualization. Before using image(), I'd like to sort the matrix rows/columns in order to make the matrix as monotonous/smooth as possible. Before reinventing the wheel: is anybody here aware of whether such a function already exists?

Thanks.
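One common approach, sketched here with base R only and invented data (dedicated seriation packages may also exist on CRAN), is to order the variables by hierarchical clustering on a correlation-based dissimilarity:

```r
# Sketch: reorder rows/columns of a correlation matrix so that highly
# correlated variables end up adjacent, then image() the result.
set.seed(1)
X <- matrix(rnorm(400), ncol = 8)     # toy data, 8 variables
R <- cor(X)
ord <- hclust(as.dist(1 - R))$order   # cluster on dissimilarity 1 - r
image(R[ord, ord])                    # smoother than image(R)
```

The dissimilarity 1 - r puts strongly positively correlated variables close together; 1 - abs(r) is an alternative if the sign of the correlation should not matter.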
Re: [R] random sample from a file
Hi,

Sample a vector from 1 to the number of rows of your file, and subset the file with the sampled values:

  > iris[sample(1:nrow(iris), 20), ]
      Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  99           5.1         2.5          3.0         1.1 versicolor
  53           6.9         3.1          4.9         1.5 versicolor
  40           5.1         3.4          1.5         0.2     setosa
  146          6.7         3.0          5.2         2.3  virginica
  93           5.8         2.6          4.0         1.2 versicolor

HTH, Petr

On 5 Jun 2006 at 11:09, Tibi Codilean wrote:
> Dear All, Thanks to all that responded to my earlier query. I have a file with two columns: an ID field and a Data field. I wish to extract multiple random samples of 20 numbers from this file, in such a way that the ID field is extracted with the data as well. At the moment I am using the 'sample' function, but this only extracts the data without the IDs. The file is a comma-separated text file and I read it in using read.csv. Could you please tell me if there is a way of doing this such that every random sample contains both the ID and the data. Thanks

Petr Pikal [EMAIL PROTECTED]
Re: [R] Calculation of AIC BIC from mle
Thanks a lot for your and Martin's help - I understand now why it is not working, and I am using AIC(logLik(ml1)).

Thanks, Rainer

Prof Brian Ripley wrote:
> You are mixing S3 and S4 classes and generics here. AIC(logLik(ml1)) will work. This is a namespace issue: stats4 contains a version of logLik, but the default method for AIC is from stats, and so dispatches on stats::logLik and not stats4::logLik. There is something fundamentally unsatisfactory about converting S3 generics to S4 generics where namespaces are involved. I have put a workaround in R-devel.

-- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT), Department of Conservation Ecology and Entomology, University of Stellenbosch, Matieland 7602, South Africa. Tel: +27 - (0)72 808 2975 (w); Fax: +27 - (0)21 808 3304; Cell: +27 - (0)83 9479 042; email: [EMAIL PROTECTED]
[R] Status of data.table package
List: This package was described in R News of May 2006. However, I cannot find it on CRAN mirrors, and use of RSiteSearch is unsatisfying. I am likewise unable to find an email address for the maintainer of the package. Thank you in advance for assistance. -- Eric Rexstad Research Unit for Wildlife Population Assessment Centre for Research into Ecological and Environmental Modelling University of St. Andrews St. Andrews Scotland KY16 9LZ +44 (0)1334 461833 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Tcltk Default Background Color Question
On Sun, Jun 04, 2006 at 09:55:20PM -0700, Crimson Turquoise wrote:
> Hello, I am new to tcltk and would like to identify the default background color for widgets. I have found things like "systemBackground" or "tk_setPalette", but am not able to get them to work in R. Is there a name for this color, such as "lightgray", etc.?

This works for me:

  tkcmd("tk_setPalette", "gray93")  # set an overall background color

Dirk

-- "Hell, there are no rules here - we're trying to accomplish something." -- Thomas A. Edison
[R] evir: generalized pareto dist
Hi,

I'm fitting a generalized Pareto distribution to POT exceedances of a data set. The practical stuff works OK, but I have a question regarding theory. Is there an equation relating the parameters of a GPD tail to its (first) moments? According to theory, for certain parameters either the first moment does not exist or the distribution has an upper bound, but I haven't found the mentioned relation about moments in the literature.

Thanks a lot. wt
[R] negative binomial: expected number of events?
Hi,

I'm fitting Poisson and negative binomial distributions to event data. I'm interested in the expected number of events occurring in a time period. For the Poisson this is determined by the parameter lambda only. For the negative binomial, is the expected number of events determined by the parameter mu only, or does the parameter size influence the first moment as well?

Thank you, wt
Re: [R] Status of data.table package
It was announced on the R-devel list: http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html

I don't see it on CRAN, either, nor could I find mention of it in the R News you cited.

Andy

From: Eric Rexstad
> List: This package was described in R News of May 2006. However, I cannot find it on CRAN mirrors, and use of RSiteSearch is unsatisfying. I am likewise unable to find an email address for the maintainer of the package. Thank you in advance for assistance.
[R] case insensitive regular expression
Can anybody fix the following, to match a string disregarding case? I tried:

  with(my.data.frame,
       regexpr(pattern = '(?i)J\\. Test', text = TARGET_COLUMN), perl = T) != -1

R yields: invalid regular expression

  with(my.data.frame,
       regexpr(pattern = 'J\\. Test', text = TARGET_COLUMN), perl = T) != -1

works, but it does not match "J. TEST". It does match "J. Test" correctly. I have the feeling I'm not reading the documentation well enough, but cannot seem to figure this out, sorry.

Thanks in advance, JeeBee.
Re: [R] case insensitive regular expression
Found the problem already. Without the 'with' it works, so I know where to look for the problem now.

Thanks anyway, JeeBee.

On Mon, 05 Jun 2006 15:16:34 +0200, JeeBee wrote:
> Can anybody fix the following, to match a string disregarding case? [...]
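For the record, case-insensitive matching can also be requested via the ignore.case argument rather than an embedded (?i). A sketch (the column name TARGET_COLUMN is taken from the question; the data are invented):

```r
# Sketch: case-insensitive matching with ignore.case = TRUE,
# so no perl-style inline flag is needed.
my.data.frame <- data.frame(
  TARGET_COLUMN = c("J. TEST", "J. Test", "X. Other"),
  stringsAsFactors = FALSE)

regexpr("J\\. Test", my.data.frame$TARGET_COLUMN, ignore.case = TRUE) != -1
# TRUE TRUE FALSE
```

Here the misplaced perl argument that caused the original error never arises, because regexpr is called directly on the extracted column.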
Re: [R] Status of data.table package
On Mon, 5 Jun 2006, Liaw, Andy wrote:
> It was announced on the R-devel list: http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html I don't see it on CRAN, either, nor could I find mention of it in the R News you cited.

In the archive: http://cran.r-project.org/src/contrib/Archive/D/data.table_1.0.tar.gz

Roger

-- Roger Bivand, Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax: +47 55 95 95 43; e-mail: [EMAIL PROTECTED]
Re: [R] negative binomial: expected number of events?
Hi,

This page should answer your questions: http://mathworld.wolfram.com/NegativeBinomialDistribution.html

Ritwik Sinha http://darwin.cwru.edu

On 6/5/06, Werner Tell wrote:
> Hi, I'm fitting Poisson and negative binomial distributions to event data. I'm interested in the expected number of events occurring in a time period. [...]
Re: [R] Status of data.table package
Liaw, Andy wrote: I don't see it on CRAN, either, nor could I find mention of it in the R News you cited. p. 66 -- Bjørn-Helge Mevik __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] negative binomial: expected number of events?
Hi,

The Wikipedia page http://en.wikipedia.org/wiki/Negative_binomial also does a great job of explaining the negative binomial distribution.

-- Ritwik Sinha, Graduate Student, Epidemiology and Biostatistics, Case Western Reserve University, http://darwin.cwru.edu/~rsinha
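To the original question: in R's mu/size parameterization of dnbinom, the first moment is mu alone; size only enters the variance (mu + mu^2/size). A quick numerical sketch:

```r
# Sketch: the negative binomial mean is mu regardless of size.
# Summing x * P(X = x) over a long grid approximates E[X].
mu <- 4
for (size in c(0.5, 2, 100)) {
  x <- 0:10000
  m <- sum(x * dnbinom(x, mu = mu, size = size))
  cat("size =", size, " mean =", round(m, 4), "\n")  # approx. 4 each time
}
```

So for the expected number of events, estimating mu is enough; size matters only for overdispersion.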
Re: [R] How to call a value labels attribute?
Aha! Thank you for the more detailed example. My solution for that situation is an attribute "position" and a function as.position(). I use this in my book "Statistical Analysis and Data Display", Richard M. Heiberger and Burt Holland. The online files for the book are available at http://springeronline.com/0-387-40270-5 For this example, you need the function as.position() included in this email.

### example
x <- ordered(c(1,2,3,2,4,3,1,2,4,3,2,1,3),
             labels=c("small", "medium", "large", "very.large"))
x
attr(x, "position") <- c(1,2,4,8)
x
as.position(x)
y <- rnorm(length(x))
y
xyplot(y ~ x)
source("~/h2/library/code/as.position.s")
xyplot(y ~ as.position(x))
xyplot(y ~ as.position(x),
       scale=list(x=list(at=attr(x,"position"), labels=levels(x))))
xyplot(y ~ as.position(x),
       scale=list(x=list(at=attr(x,"position"), labels=levels(x))), xlab="x")
### end example

### as.position.s #
as.position <- function(x) {
  if (is.numeric(x)) x
  else {
    if (!is.factor(x)) stop("x must be either numeric or factor.")
    if (!is.null(attr(x, "position")))
      x <- attr(x, "position")[x]
    else {
      lev.x <- levels(x)
      if (inherits(x, "ordered")) {
        on.exit(options(old.warn))
        old.warn <- options(warn=-1)
        if (!any(is.na(as.numeric(lev.x))))
          x <- as.numeric(lev.x)[x]
        else
          x <- as.numeric(ordered(lev.x, lev.x))[x]
      }
      else x <- as.numeric(x)
    }
  }
  x
}
## tmp <- ordered(c("c","b","f","f","c","b"), c("c","b","f"))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- factor(c("c","b","f","f","c","b"))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- factor(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- ordered(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
##
## tmp <- c(1,3,5,3,5,1)
## as.numeric(tmp)
## as.position(tmp)
### end as.position.s #
[R] How to compute Commutation Matrix With R
Hi,

Is there a specific function to compute a commutation matrix?

Sincerely!
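No dedicated base-R function is assumed here; a direct construction as a permutation matrix is short enough to sketch. The commutation matrix K_{m,n} satisfies K %*% vec(A) == vec(t(A)) for any m x n matrix A:

```r
# Sketch: build the commutation matrix K_{m,n} entry by entry.
# Element a_ij sits at position (j-1)*m + i in vec(A) and at
# position (i-1)*n + j in vec(t(A)).
commutation <- function(m, n) {
  K <- matrix(0, m * n, m * n)
  for (i in 1:m)
    for (j in 1:n)
      K[(i - 1) * n + j, (j - 1) * m + i] <- 1
  K
}

A <- matrix(1:6, nrow = 2)                    # a 2 x 3 example
all(commutation(2, 3) %*% c(A) == c(t(A)))    # TRUE
```

Since K is a permutation matrix, its inverse is its transpose, and K_{n,m} = t(K_{m,n}).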
[R] Problem with Sweave
I was trying to use the Sweave function. Do you know why the following code in my .Rnw file:

  <<results=tex, echo=FALSE>>=
  print(xtable(stat.table(list("Couverture du livret" = couv),
                          list(n = count(), "\\%" = percent(couv)))),
        floating = F)
  @
  \\ NA = \Sexpr{table(is.na(couv))[2]}

gives me this result in my .tex file:

  % latex table generated in R 2.3.1 by xtable 1.3-2 package
  % Mon Jun 5 13:24:15 2006
  \begin{tabular}{rrr}
  \hline
   & n & \% \\
  \hline
  Plutot negatif & 8.00 & 21.62 \\
  Plutot positif & 29.00 & 78.38 \\
  \hline
  \end{tabular}\\ NA = \Sexpr{table(is.na(couv))[2]}

and not:

  NA = 26

(where 26 is the result of table(is.na(couv))[2])? I don't understand!

-- David
Re: [R] merge function in R
(See insertions below)

At 8:49 PM -0300 6/2/06, Juan Santiago Ramseyer wrote:
> On Thu, 2006-06-01 at 05:59 -0700, Ahamarshan jn wrote:
>> hi list, this question must be very basic but I am just 3 days old to R. I am trying to find a function to merge two tables of data in two different files as one. Does the merge function only fill in columns between two tables where data is missing, or is there a way that merge can be used to merge data between two matrices with common dimensions? Say I have columns v1 v2 v3 v4 for rows h1 ... h5, and another table with columns x1 x2 x3 x4 for the same rows h1 ... h5. Can I merge the data as the interleaved columns v1 x1 v2 x2 v3 x3 v4 x4 for rows h1 ... h5? Thanks
>
> hi, testing the following sequence
>
>   # matrix examples
>   x <- seq(from=-1, to=-20, by=-1)
>   v <- seq(1:20)
>   dim(x) <- c(5,4)
>   dim(v) <- c(5,4)
>   vx <- rep(NA, times=40)
>   dim(vx) <- c(5,8)
>   # answer init
>   for (i in 1:4) {
>     vx[, 2*i-1] <- v[, i]
>     vx[, 2*i]   <- x[, i]
>   }
>   # answer end

The loop is more complicated than necessary.

  vx2 <- matrix(numeric(40), ncol=8)
  vx2[, c(1,3,5,7)] <- v
  vx2[, c(2,4,6,8)] <- x

  ## or more generally
  vx2[, seq(1, by=2, len=4)] <- v
  vx2[, seq(2, by=2, len=4)] <- x

  ## if the order of columns did not matter, then the simplest is
  vx3 <- cbind(v, x)

-Don

> Juan Santiago Ramseyer, Eng. Recursos Hídricos

-- Don MacQueen, Environmental Protection Department, Lawrence Livermore National Laboratory, Livermore, CA, USA
[R] anova for the objects of list
Dear R experts,

I have two list objects from running an lme model for each subject: my.x is the full model with all the terms I am interested in, and my.x1 is the null model with only an intercept. I am trying to compare the two models at each subject level. (Doing anova(my.x[[1]], my.x1[[1]]) produces the statistics for the first subject.) My goal is to obtain statistics for an entire set with hundreds of subjects. I did something like this: lapply(my.Tem, function(x) anova(x)), but my.Tem is actually not a list after combining, so it did not work the way I expected. I would be very thankful if anyone here could point this out.

Kevin

5 subjects extracted as my pilot data:

  > class(my.x)
  [1] "list"
  > length(my.x)
  [1] 5
  > class(my.x1)
  [1] "list"
  > length(my.x1)
  [1] 5
  > my.Tem <- cbind(my.x, my.x1)
  > class(my.Tem)
  [1] "matrix"
  > my.Tem
     my.x    my.x1
  s1 List,15 List,15
  s2 List,15 List,15
  s3 List,15 List,15
  s4 List,15 List,15
  s5 List,15 List,15
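One way around the problem (cbind() on two lists yields a list-matrix, which is why lapply() no longer behaved like a plain list) is to keep the fits in two parallel lists and walk them together with mapply(). A sketch using lm() as a stand-in for the lme() fits, with the cyl groups of mtcars playing the role of subjects:

```r
# Sketch: pairwise full-vs-null model comparison across subjects.
full <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))
null <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ 1,  data = d))

comparisons <- mapply(function(f, n) anova(n, f),
                      full, null, SIMPLIFY = FALSE)
comparisons[[1]]   # anova table for the first "subject"
```

The same pattern with anova(null, full) on lme objects should give one comparison table per subject, indexed by the list names.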
[R] Survey - twophase
Dear WizaRds,

I am struggling with the use of twophase in package survey. My goal is to compute a simple example in two-phase sampling:

phase 1: I sample n1=1000 circuit boards and find 80 non-functional.
phase 2: Given the n1=1000 sample, I sample n2=100 and find 15 non-functional.

Let's say phase 2 shows this result together with phase 1:

                     phase 1
                     ok   defunct   sum
  phase 2  ok        85      0       85
           defunct    5     10       15
           sum       90     10      100

That is, in R:

  fail <- data.frame(id=1:1000,
                     x=c(rep(0,920), rep(1,80)),
                     y=c(rep(0,985), rep(1,15)),
                     n1=rep(1000,1000), n2=rep(100,1000), N=rep(5000,1000))
  des.fail <- twophase(id=list(~id,~id), data=fail, subset=~I(x==1))
  # fpc=list(~n1,~n2)
  svymean(~y, des.fail)

gives mean y 0.1875, SE 0.0196, but theoretically we have x.bar1 (phase 1) = 0.08 and y.bar2 (phase 2) = 0.15 defective boards. Two-phase sampling assumes some relation between the easily/fast received x-information and the elaborate/time-consuming y-information, say a ratio r = sum y (phase 2) / sum x (phase 2) = 15/10 = 1.5 (out of the above table). Ergo, the y.ratio estimator = r * x.bar(phase 1) = 1.5 * 0.08 = 0.12, with variance = (n1-n2)/n1 * s_regression^2/n2 + s_y^2/n1 = 900/1000 * 0.0765/100 + 0.129/1000 = 0.00081, SE 0.02846, with

  yk <- c(rep(0,85), rep(1,15)); xk <- c(rep(0,90), rep(1,10))
  1/98 * sum((yk - 1.5*xk)^2)   # s_regression^2

and

  1/99 * sum((yk - 0.15)^2)     # s_yk^2 = 0.1287879

I am sorry to bother you with my false calculations, but I just don't know how to receive the correct results. Please help. My example is taken from Kauermann/Kuechenhoff 2006, p. 111f.

Thank you so much, yours always, mark
Re: [R] Survey - twophase
On Mon, 5 Jun 2006, Mark Hempelmann wrote:

> Dear WizaRds, I am struggling with the use of twophase in the survey package. My goal is to compute a simple example in two-phase sampling: phase 1: I sample n1=1000 circuit boards and find 80 non-functional. phase 2: Given the n1=1000 sample, I sample n2=100 and find 15 non-functional. Let's say phase 2 shows this result together with phase 1:
>
>                  phase1
>                  ok   defunct   sum
> phase2  ok       85     0        85
>         defunct   5    10        15
>         sum      90    10       100
>
> That is, in R:
>
> fail <- data.frame(id=1:1000, x=c(rep(0,920), rep(1,80)),
>                    y=c(rep(0,985), rep(1,15)), n1=rep(1000,1000),
>                    n2=rep(100,1000), N=rep(5000,1000))
> des.fail <- twophase(id=list(~id,~id), data=fail, subset=~I(x==1)) # fpc=list(~n1,~n2)

The second-phase sample is described by subset=~I(x==1), so you have sampled only 80 in phase two, not 100.

> svymean(~y, des.fail)
>
> gives mean y 0.1875, SE 0.0196, but theoretically we have x.bar1 (phase 1) = 0.08 and y.bar2 (phase 2) = 0.15 defective boards.

15/80 = 0.1875

> Two-phase sampling assumes some relation between the easily/fast obtained x-information and the elaborate/time-consuming y-information, say a ratio r = sum y (phase 2) / sum x (phase 2) = 15/10 = 1.5 (out of the above table)

Not quite. Two-phase sampling is *useful* only where there is a relationship. No relationship is *assumed*. There are two ways you can take advantage of a relationship. The first is to stratify the phase-two sampling based on phase-one information. In this case you need a strata= argument to twophase(). The second way to use a relationship is to calibrate phase two to phase one, using the calibrate() function. This is analogous to the regression estimator you describe. A good example to look at is in vignette("epi"). This describes a two-phase sample where about 4000 people are in the first stage (a cancer clinical trial) and then the second phase is sampled based on relapse and on disease type (histology) determined at the local hospital. Disease type is determined more accurately at a central lab for everyone who relapses, everyone whose locally-determined disease type is bad, and 20% of the rest. There is also an example of calibration, post-stratifying the second phase to the first phase on disease stage, for the same data. Finally, note that twophase() does not use the unbiased estimator of variance. It uses a modification that is easier to compute for cluster samples, as described in vignette("phase1"). There is no difference if the first phase is sampled from an infinite population (or with replacement), which is the case in vignette("epi").

-thomas

Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
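To make the strata= suggestion concrete, here is a rough sketch (my own reconstruction, not code from the thread) of a phase-two design stratified on the phase-one variable x, using invented data that match the question's 2x2 table; the column name in2 is made up:

```r
library(survey)

## Hypothetical data matching the question's table: in phase two, 90 of the
## 920 x==0 boards and 10 of the 80 x==1 boards are inspected for y.
fail <- data.frame(id = 1:1000,
                   x  = rep(c(0, 1), c(920, 80)),
                   y  = NA)
fail$in2 <- c(rep(TRUE, 90), rep(FALSE, 830),   # x == 0 stratum
              rep(TRUE, 10), rep(FALSE, 70))    # x == 1 stratum
fail$y[fail$in2] <- rep(c(0, 1, 1), c(85, 5, 10))

## phase two stratified on the phase-one variable x
des <- twophase(id = list(~id, ~id), strata = list(NULL, ~x),
                subset = ~in2, data = fail)
svymean(~y, des)
```

The stratified estimate here should equal (920/1000) * (5/90) + (80/1000) * (10/10), the classical stratified two-phase mean.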
Re: [R] Multivariate skew-t cdf
You want to evaluate skewed t probabilities in how many dimensions? If 27 is your maximum, the problem won't be as difficult as if 27 is your minimum. Also, do you want to compute the multivariate cumulative probability function for arbitrary points, location, covariance, shape and degrees of freedom? Or are you really only interested in certain special case(s)? If you have a simpler special case, it might be easier to get a solution. I was able to replicate your result, even when I reduced the dimensionality down to 20; with 19 dimensions, the function seemed to return a sensible answer. If it were my problem, I might first make a local copy of the pmst function and modify it to use the mvtnorm package in place of mnormt. That might get you answers with somewhat higher dimensionality, though it might not be adequate -- and I wouldn't trust the numbers I got without serious independent testing. I'd try to think how I could modify the skewness so I could check the accuracy that way. Have you studied the reference mentioned in the dmst help page, and reviewed some of their sources? Computing multivariate probabilities like this is still a research project, I believe. In this regard, I found the following two books helpful:

* Evans and Swartz (2000) Approximating Integrals via Monte Carlo and Deterministic Methods (Oxford)
* Kythe and Schaeferkotter (2005) Handbook of Computational Methods for Integration (Chapman and Hall)

Also, have you asked about this directly to the maintainers of the sn, mnormt and mvtnorm packages? They might have other suggestions. Hope this helps. Spencer Graves

p.s. Thanks for the self-contained example. There seems to be a typo in your example: Omega = diag(0, 27) is the matrix of all zeros, which produces a point mass at the center of the distribution. I got your answers after changing it to diag(1, 27). Making the dimension a variable, I found a sharp transition between k = 19 and 20:

> k <- 19
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p19 <- pmst(x, xi, Omega, alpha, df = 5))
[1] 1.907349e-06
attr(,"error")
[1] 2.413537e-20
attr(,"status")
[1] "normal completion"
> k <- 20
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p20 <- pmst(x, xi, Omega, alpha, df = 5))
[1] 0
attr(,"error")
[1] 1
attr(,"status")
[1] "oversize"

Konrad Banachewicz wrote:
> Dear All, I am using the pmst function from the sn package (version 0.4-0). After running the example from the help page, I get non-trivial answers, so everything is fine. However, when I try to extend it to a higher dimension:
>
> xi <- alpha <- x <- rep(0,27)
> Omega <- diag(0,27)
> p1 <- pmst(x, xi, Omega, alpha, df = 5)
>
> I get the following result:
>
> p1
> [1] 0
> attr(,"error")
> [1] 1
> attr(,"status")
> [1] "oversize"
>
> So it seems like the dimension is the problem here (and not the syntax or a type mismatch, as I initially thought -- the function is evaluated), although I found no warning about it in the help page. Can anyone give me a hint as to how to work around this problem and evaluate the skew-t cdf in a large-dimensional space? It's pretty crucial to my current research. Thanks in advance, Konrad
Re: [R] Multivariate skew-t cdf
On Mon, Jun 05, 2006 at 09:43:24AM -0700, Spencer Graves wrote:

Thanks to Spencer Graves for providing this clarification about pmst. As the author of pmst, I was really the one expected to answer the query, but I was travelling in the last two weeks and did not read it. As a complement to what has already been explained, the reason for the sharp change from k=19 to k=20 is as follows: pmst (package sn) calls pmt (package mnormt) with k increased by 1, and pmt calls a Fortran routine (written by Alan Genz), which issues an error if k > 20 or k < 1. Hence, the effective maximum number of dimensions allowed by pmst is 19.

best wishes, Adelchi Azzalini

> You want to evaluate skewed t probabilities in how many dimensions? If 27 is your maximum, the problem won't be as difficult as if 27 is your minimum. Also, do you want to compute the multivariate cumulative probability function for arbitrary points, location, covariance, shape and degrees of freedom? Or are you really only interested in certain special case(s)? If you have a simpler special case, it might be easier to get a solution. I was able to replicate your result, even when I reduced the dimensionality down to 20; with 19 dimensions, the function seemed to return a sensible answer. If it were my problem, I might first make a local copy of the pmst function and modify it to use the mvtnorm package in place of mnormt. That might get you answers with somewhat higher dimensionality, though it might not be adequate -- and I wouldn't trust the numbers I got without serious independent testing. I'd try to think how I could modify the skewness so I could check the accuracy that way. Have you studied the reference mentioned in the dmst help page, and reviewed some of their sources? Computing multivariate probabilities like this is still a research project, I believe. In this regard, I found the following two books helpful: Evans and Swartz (2000) Approximating Integrals via Monte Carlo and Deterministic Methods (Oxford); Kythe and Schaeferkotter (2005) Handbook of Computational Methods for Integration (Chapman and Hall). Also, have you asked about this directly to the maintainers of the sn, mnormt and mvtnorm packages? They might have other suggestions. Hope this helps. Spencer Graves
>
> p.s. Thanks for the self-contained example. There seems to be a typo in your example: Omega = diag(0, 27) is the matrix of all zeros, which produces a point mass at the center of the distribution. I got your answers after changing it to diag(1, 27). Making the dimension a variable, I found a sharp transition between k = 19 and 20:
>
> k <- 19
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p19 <- pmst(x, xi, Omega, alpha, df = 5))
> [1] 1.907349e-06
> attr(,"error")
> [1] 2.413537e-20
> attr(,"status")
> [1] "normal completion"
>
> k <- 20
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p20 <- pmst(x, xi, Omega, alpha, df = 5))
> [1] 0
> attr(,"error")
> [1] 1
> attr(,"status")
> [1] "oversize"
>
> Konrad Banachewicz wrote:
>> Dear All, I am using the pmst function from the sn package (version 0.4-0). After running the example from the help page, I get non-trivial answers, so everything is fine. However, when I try to extend it to a higher dimension:
>>
>> xi <- alpha <- x <- rep(0,27)
>> Omega <- diag(0,27)
>> p1 <- pmst(x, xi, Omega, alpha, df = 5)
>>
>> I get the following result:
>>
>> p1
>> [1] 0
>> attr(,"error")
>> [1] 1
>> attr(,"status")
>> [1] "oversize"
>>
>> So it seems like the dimension is the problem here (and not the syntax or a type mismatch, as I initially thought -- the function is evaluated), although I found no warning about it in the help page. Can anyone give me a hint as to how to work around this problem and evaluate the skew-t cdf in a large-dimensional space? It's pretty crucial to my current research. Thanks in advance, Konrad

-- Adelchi Azzalini [EMAIL PROTECTED] Dipart. Scienze Statistiche, Università di Padova, Italia tel. +39 049 8274147, http://azzalini.stat.unipd.it/
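A small base-R sanity check of the numbers in this thread (my own observation, not from the posts): with xi = 0, alpha = 0 and Omega = diag(k), the skew-t reduces to a central multivariate t, and at x = 0 the shared chi-square factor in the scale-mixture representation X = Z/sqrt(W/df) cancels, so the orthant probability is exactly 0.5^k:

```r
k <- 19
0.5^k        # 1.907349e-06 -- the value pmst() returned for k = 19 above

## Monte Carlo confirmation at a smaller k (X_i <= 0 exactly when Z_i <= 0)
set.seed(1)
k <- 5; df <- 5; n <- 2e5
W <- rchisq(n, df)
Z <- matrix(rnorm(n * k), n, k)
X <- Z / sqrt(W / df)                 # central multivariate t draws
mc <- mean(rowSums(X <= 0) == k)      # close to 0.5^5 = 0.03125
```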
Re: [R] anova for the objects of list
Dear R experts, Sorry for the previous email. I just figured it out myself. I would welcome other solutions. Thanks. Kevin

my.anova <- sapply(names(my.x), function(x) {
  anova(my.x[[x]], my.x1[[x]])
})

--- Hai Lin [EMAIL PROTECTED] wrote:

> Dear R experts, I have two list objects from running an lme model for each subject: my.x holds the full model with all the terms I am interested in, and my.x1 the null model with only an intercept. I am trying to compare the two models at the subject level (anova(my.x[[1]], my.x1[[1]]) produces the statistics for the first subject). My goal is to obtain the statistics for an entire set with hundreds of subjects. I tried something like lapply(my.Tem, function(x) anova(x)), but my.Tem is not actually a list after combining, so it did not work the way I expected. I would be very thankful if anyone here could point me in the right direction. Kevin
>
> 5 subjects extracted as my pilot data:
>
> class(my.x)
> [1] "list"
> length(my.x)
> [1] 5
> class(my.x1)
> [1] "list"
> length(my.x1)
> [1] 5
> my.Tem <- cbind(my.x, my.x1)
> class(my.Tem)
> [1] "matrix"
> my.Tem
>    my.x    my.x1
> s1 List,15 List,15
> s2 List,15 List,15
> s3 List,15 List,15
> s4 List,15 List,15
> s5 List,15 List,15
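For the archive, the same pattern in a self-contained toy form, with lm() standing in for lme() (models and data invented purely for illustration); mapply() walks the two lists in parallel without the cbind() detour:

```r
set.seed(1)
subj <- lapply(1:3, function(i) {
  x <- rnorm(20)
  data.frame(x = x, y = 2 * x + rnorm(20))
})
full <- lapply(subj, function(d) lm(y ~ x, data = d))   # full model per subject
null <- lapply(subj, function(d) lm(y ~ 1, data = d))   # intercept-only model
cmp  <- mapply(anova, full, null, SIMPLIFY = FALSE)     # one table per subject
cmp[[1]]   # anova comparison for the first subject
```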
Re: [R] Significance test, Cronbach's Alpha
Jonathan Baron sent the following at 05/06/2006 12:05: ... some snipped ...

> It is rare to see anyone report a test for alpha because it is usually used descriptively. If it isn't .7 or higher, people get upset, yet even .5 would be wildly significant in most cases.

Feldt did a lot of good work on ANOVA models of alpha, well summarised in his paper with Salih. Here are some functions. Please don't ridicule my R style: I'm a psychotherapist first, researcher second, and R enthusiast third. Amused advice on how to write better code warmly received. I'm sure that jackknifing or bootstrapping would give much more distributionally robust CIs and p values, but, as Jon's point above makes so simply, the real problem is that people don't think through what they're looking for from an alpha. I find there are situations in which I'm genuinely interested in the null (is there some evidence of covariance here?) and other situations where I want a high alpha because I'm postulating that we have a useful measure. In the latter case, all these totemistic values for what is "acceptable", "excellent" etc. are often misleading, and a CI around the observed alpha, plus some exploration of the factor structure (EFA or CFA) or IRT model explorations, will be far more important than exactly what alpha you got. Oops /FLAME -- not quite sure where I should have put the starter on that! I'll go back to enjoying the fact that I think this is the first time I've posted something that might be useful to someone! Very best all: Chris

feldt1.return <- function(obs.a, n, k, ci = 0.95, null.a = 0) {
  if (obs.a < null.a) f <- (1 - obs.a)/(1 - null.a)
  else f <- (1 - null.a)/(1 - obs.a)   # allows for testing against a higher null
  n.den <- (n - 1) * (k - 1)
  n.num <- n - 1
  null.p <- pf(f, n.num, n.den)
  # set the upper and lower p values for the desired C.I.
  p1 <- (1 - ci)/2
  p2 <- ci + p1
  # corresponding F values
  f1 <- qf(p1, n.num, n.den)
  f2 <- qf(p2, n.num, n.den)
  # confidence interval
  lwr <- 1 - (1 - obs.a) * f2
  upr <- 1 - (1 - obs.a) * f1
  cat(round(lwr, 2), "to", round(upr, 2), "\n")
  interval <- list(lwr, upr)
  return(interval)
}

feldt1.lwr <- function(obs.a, n, k, ci = 0.95, null.a = 0) {
  if (obs.a < null.a) f <- (1 - obs.a)/(1 - null.a)
  else f <- (1 - null.a)/(1 - obs.a)   # allows for testing against a higher null
  n.den <- (n - 1) * (k - 1)
  n.num <- n - 1
  null.p <- pf(f, n.num, n.den)
  # set the upper and lower p values for the desired C.I.
  p1 <- (1 - ci)/2
  p2 <- ci + p1
  # corresponding F values
  f1 <- qf(p1, n.num, n.den)
  f2 <- qf(p2, n.num, n.den)
  # confidence interval
  lwr <- 1 - (1 - obs.a) * f2
  return(lwr)
}

feldt1.upr <- function(obs.a, n, k, ci = 0.95, null.a = 0) {
  if (obs.a < null.a) f <- (1 - obs.a)/(1 - null.a)
  else f <- (1 - null.a)/(1 - obs.a)   # allows for testing against a higher null
  n.den <- (n - 1) * (k - 1)
  n.num <- n - 1
  null.p <- pf(f, n.num, n.den)
  # set the upper and lower p values for the desired C.I.
  p1 <- (1 - ci)/2
  p2 <- ci + p1
  # corresponding F values
  f1 <- qf(p1, n.num, n.den)
  f2 <- qf(p2, n.num, n.den)
  # confidence interval
  upr <- 1 - (1 - obs.a) * f1
  return(upr)
}

-- Chris Evans [EMAIL PROTECTED] Hon. Professor of Psychotherapy, Nottingham University; Consultant Psychiatrist in Psychotherapy, Rampton Hospital; Research Programmes Director, Nottinghamshire NHS Trust; Hon. SL Institute of Psychiatry, Hon. Con., Tavistock & Portman Trust
**If I am writing from one of those roles, it will be clear. Otherwise**
**my views are my own and not representative of those institutions**
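Since the three functions above repeat identical computations, a consolidated version (my own refactoring of the same formulas; the name feldt1 is invented) can return the interval and the p value together:

```r
feldt1 <- function(obs.a, n, k, ci = 0.95, null.a = 0) {
  # F ratio against the null alpha; allows testing against a higher null
  f <- if (obs.a < null.a) (1 - obs.a) / (1 - null.a)
       else                (1 - null.a) / (1 - obs.a)
  n.num <- n - 1
  n.den <- (n - 1) * (k - 1)
  p1 <- (1 - ci) / 2
  p2 <- ci + p1
  list(lwr    = 1 - (1 - obs.a) * qf(p2, n.num, n.den),
       upr    = 1 - (1 - obs.a) * qf(p1, n.num, n.den),
       null.p = pf(f, n.num, n.den))
}
feldt1(0.79, n = 50, k = 10)   # alpha of .79 from 50 subjects, 10 items
```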
[R] Editor/environment problem
I have encountered an odd problem in editing a function. I found no mention of this in the archives but may have missed something. I'm using Version 2.3.0 (2006-04-24). The problem is that after creating a function with fix(), making some typos and correcting them, my function loses contact with everything in the search path except (apparently) package:base. Here is an example:

> fix(tmp)
> tmp
function (x) { var(x) }
> tmp(1:5)
[1] 2.5
> fix(tmp)
# Typos (,/ at the end of var(x)) were intentionally introduced; the line is now: var(x),/
Error in edit(name, file, title, editor) : an error occurred on line 3
 use a command like x <- edit() to recover
> tmp <- edit()
# Typos corrected but new ones introduced; the line is now: var(x)
Error in edit(name, file, title, editor) : an error occurred on line 3
 use a command like x <- edit() to recover
> tmp <- edit()
# Typos fixed. List the function. Note the <environment: base> at the end -- not there before.
> tmp
function (x) { var(x) }
<environment: base>
> tmp(1:5)
Error in tmp(1:5) : could not find function "var"

TIA, Richard
[R] Fastest way to do HWE.exact test on 100K SNP data?
Hi everyone, I'm using the function 'HWE.exact' of the 'genetics' package to compute p-values of the HWE test. My data set consists of ~600 subjects (cases and controls) typed at ~10K SNP markers; the test is applied separately to cases and controls. The genotypes are stored in a list of 'genotype' objects, all.geno, and p-values are calculated inside a loop over all SNP markers. I wish to repeat this procedure many times (~1000), permuting the cases and controls (affection status). It seems straightforward to implement it like this:

for (iter in 1:1000) {
  set.seed(iter)
  # get the permuted affection status
  permut <- sample(affSt)
  for (j in 1:nSNPs) {
    test <- tapply(all.geno[[j]], permut, HWE.exact)
    pvalControls[j] <- test$"1"$p.value
    pvalCases[j]    <- test$"2"$p.value
  }
}

The problem is that it takes ~1 min/iteration (on an AMD Opteron 252 processor running Linux). Is there a faster/more efficient way to do this? Thanks, -- Anna Pluzhnikov, PhD Section of Genetic Medicine Department of Medicine The University of Chicago
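One direction worth trying (my own sketch, not from the genetics package; hwe.chisq is an invented name): the statistic depends only on the three genotype counts, so counts can be tabulated once per marker per permutation, and if a chi-square approximation is acceptable as a first screen, the exact test can be skipped for clearly non-significant markers:

```r
hwe.chisq <- function(nAA, nAa, naa) {
  n <- nAA + nAa + naa
  p <- (2 * nAA + nAa) / (2 * n)                  # frequency of allele A
  e <- n * c(p^2, 2 * p * (1 - p), (1 - p)^2)     # expected genotype counts
  x2 <- sum((c(nAA, nAa, naa) - e)^2 / e)
  pchisq(x2, df = 1, lower.tail = FALSE)
}
hwe.chisq(25, 50, 25)   # exact HWE proportions, so p = 1
```

The approximation breaks down for rare genotypes, so markers passing the screen would still need HWE.exact; caching p-values by distinct count triple also avoids recomputing identical tables across permutations.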
Re: [R] Editor/environment problem
This has been fixed in 2.3.1. See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76657.html or http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76953.html

Duncan Murdoch

On 6/5/2006 3:19 PM, Richard Valliant wrote:
> I have encountered an odd problem in editing a function. I found no mention of this in the archives but may have missed something. I'm using Version 2.3.0 (2006-04-24). The problem is that after creating a function with fix(), making some typos and correcting them, my function loses contact with everything in the search path except (apparently) package:base. Here is an example:
>
> > fix(tmp)
> > tmp
> function (x) { var(x) }
> > tmp(1:5)
> [1] 2.5
> > fix(tmp)
> # Typos (,/ at the end of var(x)) were intentionally introduced; the line is now: var(x),/
> Error in edit(name, file, title, editor) : an error occurred on line 3
>  use a command like x <- edit() to recover
> > tmp <- edit()
> # Typos corrected but new ones introduced; the line is now: var(x)
> Error in edit(name, file, title, editor) : an error occurred on line 3
>  use a command like x <- edit() to recover
> > tmp <- edit()
> # Typos fixed. List the function. Note the <environment: base> at the end -- not there before.
> > tmp
> function (x) { var(x) }
> <environment: base>
> > tmp(1:5)
> Error in tmp(1:5) : could not find function "var"
>
> TIA, Richard
[R] Constrained regression
Hi, I would like to run a constrained OLS with the following constraints: 1. the sum of the coefficients must be 1; 2. all the coefficients have to be positive. Is there an easy way to specify that in R? Thank you, Daniel Bogod [EMAIL PROTECTED]
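One way to get a quick answer in base R (a sketch under my own parametrization, not an established routine; packages such as quadprog or nnls solve this class of problem properly): write the coefficients as a softmax of free parameters, which enforces positivity and the sum-to-one constraint by construction, and minimize the residual sum of squares with optim():

```r
fit.simplex <- function(y, X) {
  rss <- function(theta) {
    w <- exp(theta) / sum(exp(theta))   # positive, sums to 1
    sum((y - X %*% w)^2)
  }
  opt <- optim(rep(0, ncol(X)), rss, method = "BFGS")
  w <- exp(opt$par) / sum(exp(opt$par))
  list(coef = w, rss = opt$value)
}

## invented example where the true weights are (0.3, 0.7)
X <- cbind(x1 = 1:10, x2 = (1:10)^2)
y <- X %*% c(0.3, 0.7)
fit.simplex(y, X)
```

Note that the softmax can never place a coefficient exactly at zero, so for solutions on the boundary of the simplex a quadratic-programming solver is the better tool.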
Re: [R] Status of data.table package
Roger Bivand wrote:
> On Mon, 5 Jun 2006, Liaw, Andy wrote:
>> It was announced on the R-devel list: http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html
> I don't see it on CRAN, either, nor could I find mention of it in the R News you cited. In the archive: http://cran.r-project.org/src/contrib/Archive/D/data.table_1.0.tar.gz
> Roger

AFAIR, the author (whom I am CCing) decided to remove it from CRAN's main repository because newer versions of R themselves address the issues the package was written to solve, so the package is no longer that beneficial.

Uwe Ligges

> From: Eric Rexstad
> List: This package was described in R News of May 2006. However, I cannot find it on CRAN mirrors, and use of RSiteSearch is unsatisfying. I am likewise unable to find an email address for the maintainer of the package. Thank you in advance for assistance. -- Eric Rexstad Research Unit for Wildlife Population Assessment Centre for Research into Ecological and Environmental Modelling University of St. Andrews St. Andrews Scotland KY16 9LZ +44 (0)1334 461833
[R] simple POSIXlt question
How can I access, either to get or set, the sec, min, and hour fields of a POSIXlt value, as in this example?

> t <- as.POSIXlt("2004-11-01")
> t
[1] "2004-11-01"
> unclass(t)
$sec
[1] 0

$min
[1] 0

$hour
[1] 0

$mday
[1] 1

$mon
[1] 10

$year
[1] 104

$wday
[1] 1

$yday
[1] 305

$isdst
[1] -1
Re: [R] Constrained regression
If you haven't already done so, please make use of R's search capabilities before posting:

help.search('constrained regression')
RSiteSearch('constrained regression')  ## also available through CRAN's search functionality

-- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Bogod Sent: Monday, June 05, 2006 1:01 PM To: r-help@stat.math.ethz.ch Subject: [R] Constrained regression

> Hi, I would like to run a constrained OLS with the following constraints: 1. the sum of the coefficients must be 1; 2. all the coefficients have to be positive. Is there an easy way to specify that in R? Thank you, Daniel Bogod [EMAIL PROTECTED]
Re: [R] simple POSIXlt question
On Mon, 5 Jun 2006, Omar Lakkis wrote:
> How can I access, either to get or set, the sec, min, and hour fields of a POSIXlt value as in this example?

Just as for any other list:

> t <- as.POSIXlt("2004-11-01")
> t$sec
[1] 0
> t$sec <- 30
> t
[1] "2004-11-01 00:00:30"

and via [["sec"]] if you prefer.

[ ... ]

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA) 1 South Parks Road, Oxford OX1 3TG, UK Fax: +44 1865 272595
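A slightly fuller worked example along the same lines (timestamp invented):

```r
x <- as.POSIXlt("2004-11-01 07:30:15", tz = "UTC")
x$hour                   # 7
x$hour <- x$hour + 1     # the fields are ordinary list elements
format(x, "%H:%M:%S")    # "08:30:15"
unlist(unclass(x)[c("sec", "min", "hour")])   # several fields at once
```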
[R] CCA Plot
I am a beginner with R and have managed to use vegan to run a CCA on my data. I am evaluating the habitat relationships of select species in riparian areas around creeks before and after logging, across three sampling intervals (pre-logging, two years post-logging, and 10 years post-logging). I sampled from 18 sites of 3 treatments (n = 6 sites for each treatment). When I plot my pre-treatment data, I am plotting data from all 18 sites; I have selected 12 species and boiled my habitat variables down to 10. The plot shows all 18 sites, all 12 species, and all 10 habitat variables; this is good. However, when I do the CCA on a subset of my data (e.g., control sites only [n = 6]), the plot displays all 6 sites and all 12 species, but only 5 of the 10 habitat variables. Any ideas why the tri-plot of the subset of my data will not show all of the habitat variables? Am I encountering a minimum sample size problem? How can I plot all environmental variables? Thanks in advance, Virgil Hawkes
[R] Selective Survival Statistics with R
Hello friends and fellow R users, I have a problem to which I have been unable to find a solution: I am gathering survival data on patients undergoing treatment with a new kind of stent. I want to generate survival data and plot survival curves for these patients based (among other things) on the treating physician. My data set has been tabulated in the following manner:

Date (the date the stent was implanted)
Description (diameter of the stent)
Patient (name)
MRN (ID number)
Age (age in years of the patient)
Physician (last name of the implanting physician)
status (0 or 1)
days (days alive since the stenting procedure)
Cr
INR
BR
MELD

The data set has over ten physicians. Following the examples in Introductory Statistics with R, I have been able to draw cumulative survival curves and even the survival curves by physician, but there are so many curves on the graph when I do that it is uninterpretable. I would just like to plot the survival curves of, say, the top two or three. That would be much more useful. I have been able to create a Surv object out of my own survival data and that of a colleague in the following way, using indexing:

> Surv(days[viatorr[6]=="Pierce" | viatorr[6]=="Ed"], status[viatorr[6]=="Pierce" | viatorr[6]=="Ed"]==1)
 [1] 558+ 474+ 472+ 446+ 443+ 429+ 401  395+ 390  390+ 362+ 354  344+ 342  326+
[16] 300+ 284+ 280+ 246+ 237+ 233+ 233  230+ 230+ 225+ 215  199+ 191+ 191  184+
[31] 161+ 153  150  129+  69+  52+  38+

I can get even further:

> surv.ee.gp <- Surv(days[viatorr[6]=="Pierce" | viatorr[6]=="Ed"], status[viatorr[6]=="Pierce" | viatorr[6]=="Ed"]==1)
> survfit(surv.ee.gp)
Call: survfit(formula = surv.ee.gp)
      n  events  median 0.95LCL 0.95UCL
     37       9     Inf     390     Inf

But now if I want to plot the data using the following command:

> plot(survfit(surv.ee.gp)~Physician)

I receive an error:

Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid variable type

I have tried indexing Physician, but that fails as well:

> plot(survfit(surv.ee.gp)~Physician[viatorr[6]=="Pierce" | viatorr[6]=="Ed"]==1)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : invalid variable type

I apologize for this long-winded explanation, but I am new to this program and even newer to statistics, and I wanted to make sure the context and problem were clear (hopefully). I would appreciate any guidance on a way forward. Thank you, Greg
Re: [R] Selective Survival Statistics with R
On Mon, 2006-06-05 at 18:54 -0400, Barker, Chris [SCIUS] wrote:
> It's probably easiest/fastest for you either to subset your dataset first, or else simply use the subset option in survfit(), e.g. survfit(Surv( , ) ~ physician, subset = physician == 'Jones')

Chris, Thank you very much for your kind reply. Using subset worked. I had to modify the syntax a little from what you posted:

survfit(Surv(days, status==1) ~ Physician, subset(viatorr, viatorr[6]=="Pierce"))

where viatorr is the name of my data set. Applying plot to the function above generated a survival curve for my patients. I was also able to plot survival curves for other physicians as well. This is great!
Re: [R] Selective Survival Statistics with R
On Mon, 5 Jun 2006, Gregory Pierce wrote:
> Thank you very much for your kind reply. Using subset worked. I had to modify the syntax a little from what you posted: survfit(Surv(days, status==1) ~ Physician, subset(viatorr, viatorr[6]=="Pierce")) where viatorr is the name of my data set.

An example of another approach is given in the example on the help page for survfit. You could do

curves <- survfit(Surv(days, status==1) ~ Physician, data=viatorr)
plot(curves[1])
plot(curves[2])

etc.

-thomas

Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] Selective Survival Statistics with R
On Mon, 2006-06-05 at 17:04 -0700, Thomas Lumley wrote:

On Mon, 5 Jun 2006, Gregory Pierce wrote:

On Mon, 2006-06-05 at 18:54 -0400, Barker, Chris [SCIUS] wrote:

It's probably easiest/fastest for you either to subset your dataset first, or else simply use the subset option in survfit(); e.g. survfit( Surv( , ) ~ physician, subset = physician=='Jones')

Chris, Thank you very much for your kind reply. Using subset worked. I had to modify the syntax a little from what you posted:

survfit(Surv(days, status==1) ~ Physician, subset(viatorr, viatorr[6]=="Pierce"))

where viatorr is the name of my data set.

An example of another approach is given in the example on the help page for survfit. You could do

curves <- survfit(Surv(days, status==1) ~ Physician, data=viatorr)
plot(curves[1])
plot(curves[2])

etc.

-thomas

That also worked great! Thanks. This makes me very, very happy! I can now go to our statistician without looking like a complete idiot, and I can get to work collecting survival data on our patients who had the old-fashioned bare-metal Wallstent. Looking back in this way provides a very interesting and humbling perspective.

Greg
Re: [R] time series clustering
I know of no software for time series clustering in R. Google produced some interesting hits for "time series clustering". If you find an algorithm you like, the author might have software. Alternatively, the algorithm might be a modification of something already available in R; if that's the case, you wouldn't need to start from scratch, since R is open source.

Hope this helps.
Spencer Graves

Weiwei Shi wrote: Dear Listers: I happen to have a problem requiring time-series clustering, since the clusters change with time (data that is too old needs to be removed while new data comes in). I am wondering if there is some paper or reference on this topic, and whether there is some kind of implementation in R? Thanks, Weiwei
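As one starting point along the "modify something already in R" line, here is a hedged sketch (not from Spencer's reply) that clusters whole series with base R's dist() and hclust(), treating each row of a matrix as one time series. The toy data and the choice of Euclidean distance are illustrative assumptions; a real application would likely need a distance measure suited to time series:

```r
set.seed(1)
# Ten toy "series" of length 10: five centred at 0, five centred at 3
x <- rbind(matrix(rnorm(50, mean = 0), nrow = 5),
           matrix(rnorm(50, mean = 3), nrow = 5))

d  <- dist(x)                        # Euclidean distance between whole series
hc <- hclust(d, method = "complete") # agglomerative hierarchical clustering
groups <- cutree(hc, k = 2)          # assign each series to one of 2 clusters
```

Handling Weiwei's moving window would then amount to dropping old rows and appending new ones before re-running the clustering.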
[R] error bars in lattice xyplot *with groups*
Hi all, I'm trying to plot error bars in a lattice plot generated with xyplot. Deepayan Sarkar has provided a very useful solution for simple circumstances (https://stat.ethz.ch/pipermail/r-help/2005-October/081571.html), yet I am having trouble getting it to work when the groups setting is enabled in xyplot (i.e. multiple lines). To illustrate this, consider the singer data generated by the previously submitted solution linked above:

library(lattice)
singer.split <- with(singer, split(height, voice.part))
singer.ucl <- sapply(singer.split, function(x) {
    st <- boxplot.stats(x)
    c(st$stats[3], st$conf)
})
singer.ucl <- as.data.frame(t(singer.ucl))
names(singer.ucl) <- c("median", "lower", "upper")
singer.ucl$voice.part <- factor(rownames(singer.ucl), levels = rownames(singer.ucl))

# now let's split up the voice.part factor into two factors
singer.ucl$voice <- factor(rep(c(1, 2), 4))
singer.ucl$range <- factor(rep(c("Bass", "Tenor", "Alto", "Soprano"), each = 2))

# here's Deepayan's previous solution, slightly modified to depict
# the dependent variable (median) and the error bars on the y-axis
# and the independent variable (voice.part) on the x-axis
prepanel.ci <- function(x, y, ly, uy, subscripts, ...) {
    x <- as.numeric(x)
    ly <- as.numeric(ly[subscripts])
    uy <- as.numeric(uy[subscripts])
    list(ylim = range(y, uy, ly, finite = TRUE))
}
panel.ci <- function(x, y, ly, uy, subscripts, pch = 16, ...) {
    x <- as.numeric(x)
    y <- as.numeric(y)
    ly <- as.numeric(ly[subscripts])
    uy <- as.numeric(uy[subscripts])
    panel.arrows(x, ly, x, uy, col = "black", length = 0.25,
                 unit = "native", angle = 90, code = 3)
    panel.xyplot(x, y, pch = pch, ...)
}

# this graph works
xyplot(median ~ voice.part, data = singer.ucl,
       ly = singer.ucl$lower, uy = singer.ucl$upper,
       prepanel = prepanel.ci, panel = panel.ci, type = "b")

# this one does not (it will plot, but will not separate the groups)
xyplot(median ~ voice, groups = range, data = singer.ucl,
       ly = singer.ucl$lower, uy = singer.ucl$upper,
       prepanel = prepanel.ci, panel = panel.ci, type = "b")

Any suggestions?

-- Mike Lawrence [EMAIL PROTECTED]

The road to Wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein
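One possible direction (an untested sketch, not a confirmed fix from the thread): instead of calling panel.xyplot directly, delegate the group handling to lattice's panel.superpose, supplying a panel.groups function that draws the arrows for each group's own subscripts. The panel.ci.groups name and the toy data frame below are my own inventions:

```r
library(lattice)

# Hypothetical per-group panel function; panel.superpose calls it once per
# group with that group's subscripts, so ly/uy are indexed per group.
panel.ci.groups <- function(x, y, ly, uy, subscripts, pch = 16, ...) {
    x <- as.numeric(x)
    y <- as.numeric(y)
    panel.arrows(x, as.numeric(ly[subscripts]), x, as.numeric(uy[subscripts]),
                 col = "black", length = 0.25, unit = "native",
                 angle = 90, code = 3)
    panel.xyplot(x, y, pch = pch, ...)
}

# Toy data in the same shape as singer.ucl
d <- data.frame(median = c(1, 2, 3, 4),
                voice  = factor(c(1, 2, 1, 2)),
                range  = factor(c("a", "a", "b", "b")),
                lower  = c(0.5, 1.5, 2.5, 3.5),
                upper  = c(1.5, 2.5, 3.5, 4.5))

p <- xyplot(median ~ voice, groups = range, data = d,
            ly = d$lower, uy = d$upper,
            panel = function(x, y, ..., subscripts, groups)
                panel.superpose(x, y, ..., subscripts = subscripts,
                                groups = groups,
                                panel.groups = panel.ci.groups))
```

The wrapper panel function forwards ly and uy through `...`, and panel.superpose splits the subscripts by group before invoking panel.ci.groups.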
Re: [R] Fastest way to do HWE.exact test on 100K SNP data?
Anna -- This might be a little faster ...

all.geno.df <- data.frame(all.geno)  # exploit shared structure
set.seed(123)                        # once only
for (iter in 1:1000) {
    permut <- sample(affSt)
    control <- all.geno.df[permut==1,]
    case <- all.geno.df[permut==2,]
    pvalControls <- unlist(sapply(control, HWE.exact)["p.value",])
    pvalCases <- unlist(sapply(case, HWE.exact)["p.value",])
    # presumably do something with these, before the next iteration
}

but probably most of the time is being spent in HWE.exact, so these cosmetic changes won't help much. Looking at the code for HWE.exact, it looks like it generates a table based on numbers of alleles, and then uses this table to determine the probability. Any time two SNPs have the same numbers of alleles, the same table gets generated. It seems like this will happen a lot. So one strategy is to calculate the table only once, 'remember' it, and the next time a value is needed from that table look it up rather than recalculate it. This is a bad strategy if most SNPs have different numbers of alleles (because then the tables still get calculated, and a lot of space and some time is used to store results that are never accessed again).
Here is the code (not tested extensively, so please use with care!):

HWE.exact.lookup <- function() {
    HWE.lookup.tbl <- new.env(hash=TRUE)
    dhwe2 <- function(n11, n12, n22) {
        ## from body of HWE.exact
        f <- function(x) lgamma(x + 1)
        n <- n11 + n12 + n22
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12
        exp(log(2) * (n12) + f(n) - f(n11) - f(n12) - f(n22) -
            f(2 * n) + f(n1) + f(n2))
    }
    function(x) {
        ## adapted from HWE.exact
        if (!is.genotype(x))
            stop("x must be of class 'genotype' or 'haplotype'")
        nallele <- length(na.omit(allele.names(x)))
        if (nallele != 2)
            stop("Exact HWE test can only be computed for 2 markers with 2 alleles")
        allele.tab <- table(factor(allele(x, 1), levels = allele.names(x)),
                            factor(allele(x, 2), levels = allele.names(x)))
        n11 <- allele.tab[1, 1]
        n12 <- allele.tab[1, 2] + allele.tab[2, 1]
        n22 <- allele.tab[2, 2]
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12
        nm <- paste(n1, n2, sep=".")
        if (!exists(nm, envir=HWE.lookup.tbl)) {  # 'remember'
            x12 <- seq(n1 %% 2, min(n1, n2), 2)
            x11 <- (n1 - x12)/2
            x22 <- (n2 - x12)/2
            dist <- data.frame(n11 = x11, n12 = x12, n22 = x22,
                               density = dhwe2(x11, x12, x22))
            dist <- dist[order(dist$density),]
            dist$density <- cumsum(dist$density)
            assign(nm, dist, envir=HWE.lookup.tbl)
        }
        dist <- get(nm, HWE.lookup.tbl)
        dist$density[dist$n11 == n11 & dist$n12 == n12 & dist$n22 == n22]
    }
}

calcHWE <- HWE.exact.lookup()        # create a new lookup table function
all.geno.df <- data.frame(all.geno)  # exploit shared structure
set.seed(123)                        # once only
for (iter in 1:1000) {
    permut <- sample(affSt)
    control <- all.geno.df[permut==1,]
    case <- all.geno.df[permut==2,]
    pvalControls <- sapply(control, calcHWE)
    pvalCases <- sapply(case, calcHWE)
}

Let me know how it goes if you adopt this approach! Martin

Anna Pluzhnikov [EMAIL PROTECTED] writes: Hi everyone, I'm using the function 'HWE.exact' of the 'genetics' package to compute p-values of the HWE test. My data set consists of ~600 subjects (cases and controls) typed at ~10K SNP markers; the test is applied separately to cases and controls.
The genotypes are stored in a list of 'genotype' objects, all.geno, and p-values are calculated inside the loop over all SNP markers. I wish to repeat this procedure multiple times (~1000), permuting the cases and controls (affection status). It seems straightforward to implement it like this:

for (iter in 1:1000) {
    set.seed(iter)
    # get the permuted affection status
    permut <- sample(affSt)
    for (j in 1:nSNPs) {
        test <- tapply(all.geno[[j]], permut, HWE.exact)
        pvalControls[j] <- test$'1'$p.value
        pvalCases[j] <- test$'2'$p.value
    }
}

The problem is that it takes ~1 min/iteration (on an AMD Opteron 252 processor running Linux). Is there a faster/more efficient way to do this?

Thanks, -- Anna Pluzhnikov, PhD, Section of Genetic Medicine, Department of Medicine, The University of Chicago
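The environment-based caching that Martin's reply is built on can be shown in isolation. This is a generic sketch; the make_cached and slow_square names are my own illustrative inventions, not part of the genetics package:

```r
# Wrap any one-argument function so repeated inputs hit a cache
# stored in an environment, as HWE.exact.lookup does with its tables.
make_cached <- function(f) {
    cache <- new.env(hash = TRUE)
    function(x) {
        key <- as.character(x)
        if (!exists(key, envir = cache))
            assign(key, f(x), envir = cache)
        get(key, envir = cache)
    }
}

calls <- 0
slow_square <- function(x) {  # stand-in for an expensive computation
    calls <<- calls + 1
    x^2
}
fast_square <- make_cached(slow_square)

r1 <- fast_square(4)  # computes and stores the result
r2 <- fast_square(4)  # served from the cache; slow_square not called again
```

As Martin notes, this only pays off when the same inputs (here, allele counts) recur often.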
Re: [R] Fastest way to do HWE.exact test on 100K SNP data?
Hi Anna, I am not really answering your question, but here is my (unsolicited) suggestion. Some time back I ran some simulations to see how the HWE tests performed; in particular I compared the exact test and the chi-squared test. Look at the attached figure Rplots.ps. I saw that for null simulations with 40 individuals, the HWE chi-squared test was reasonably close to the (expected) uniform distribution. However, this was obtained using the function HWE.chisq(X, simulate.p.value=F); the default seemed to have some issues. Hence my suggestion would be, if you feel comfortable with it, to replace the exact test with a chi-squared test, at least at the screening level. Once you identify a set of SNPs with small p-values, you could follow them up with the exact test.

-- Ritwik Sinha, Graduate Student, Epidemiology and Biostatistics, Case Western Reserve University, http://darwin.cwru.edu/~rsinha
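For readers wanting to see what the chi-squared HWE statistic actually computes, here is a hand-rolled sketch for one biallelic marker. The hwe_chisq helper is my own illustrative name, not the HWE.chisq function from the genetics package (which Ritwik's reply refers to):

```r
# Chi-squared HWE statistic from genotype counts (AA, Aa, aa):
# compare observed counts to those expected under Hardy-Weinberg
# proportions p^2, 2pq, q^2.
hwe_chisq <- function(n11, n12, n22) {
    n <- n11 + n12 + n22
    p <- (2 * n11 + n12) / (2 * n)            # frequency of allele A
    expected <- n * c(p^2, 2 * p * (1 - p), (1 - p)^2)
    observed <- c(n11, n12, n22)
    sum((observed - expected)^2 / expected)   # compare to chi-squared, 1 df
}

stat <- hwe_chisq(25, 50, 25)  # counts in perfect HWE with p = 0.5, so 0
```

This is cheap to evaluate, which is why it suits the screening role Ritwik suggests; the exact test is then reserved for the shortlisted SNPs.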
Re: [R] default value for cutoff in gstat variogram()
Edzer J Pebesma wrote:

Patrick Giraudoux wrote: I wonder what the default value is for the argument 'cutoff' when it is not specified in the variogram.formula function of gstat. Computing variogram envelopes within gstat, I am comparing the results obtained with variog in geoR and variogram in gstat, and it took me a while to understand that the cutoff default value is not the maximum distance. Can Edzer tell us about it?

Yes, of course: the default value is computed in the C code. Without checking (meaning: from 10 years' memory) I do recall that gstat uses one third of the diagonal of the rectangle (or block for 3D) that spans the data locations. Why? In time series you compute ACFs up to one half of the length of the series; after this things start to oscillate because you lack independent replication at large distance; look at what is meant by ergodicity for further reading. Variograms are basically flipped, unscaled ACFs for higher dimensions. Some books (Journel & Huijbregts?) suggested that half the maximum distance in the data set is a good guideline, back in 1978. I used one third of the diagonal because I thought finding the maximum distance between any two points might be expensive for large data sets. The parameter one third can be overridden by those who don't like it.

Please keep us updated about your mileage comparing gstat and geoR; I once spent an afternoon on this, trying to reproduce sample variograms across the packages, and found it hard (but not impossible). I had the feeling it had to do with using < or <= to decide whether a point pair falls in a distance interval or not, but didn't 100% assure myself.

-- Edzer
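The one-third-of-the-diagonal rule Edzer describes is easy to state in R. This default_cutoff helper is illustrative only; the real default lives inside gstat's C code, and this sketch covers only the 2D case:

```r
# Default variogram cutoff per Edzer's description: one third of the
# diagonal of the bounding rectangle of the data locations (2D).
default_cutoff <- function(x, y) {
    dx <- diff(range(x))  # width of the bounding box
    dy <- diff(range(y))  # height of the bounding box
    sqrt(dx^2 + dy^2) / 3
}

default_cutoff(c(0, 3), c(0, 4))  # bounding box 3 x 4, diagonal 5, so 5/3
```

This also shows why the default differs from geoR's behaviour: it is a fraction of the bounding-box diagonal, not of the maximum inter-point distance.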