Re: [R] Problem with German special characters
Dear Prof. Ripley,

thanks for your help. Everything is working fine!

With best regards
Michael Wolf

-----Original Message-----
From: Brian D Ripley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 15 December 2004 09:01
To: Wolf, Michael
Subject: Re: [R] Problem with German special characters

Please do look in the list archives: this is a Windows bug worked around a while back. You need to get the R-patched version of R.

On Wed, 15 Dec 2004, Wolf, Michael wrote:

> Dear list!
>
> When using the German special characters, not all of them are displayed correctly. Take the command
>
>     ff <- "äöüßÄÖÜ"   # for those who can't see this correctly: \"a, \"o, \"u,
>                       # \ss (?), \"A, \"O and \"U as LaTeX commands
>
> Printing 'ff' shows the output text "\344\366üß\304\326\334" on Rconsole. So only ü (\"u) and ß (\ss) are shown correctly; the other characters appear as backslash escapes.
>
> This problem doesn't occur when exporting 'ff' to a text file with 'writeLines': opening that file in a text editor shows all characters correctly.
>
> What can I do to get the correct output of the special characters?
>
> Thanks for your help in advance!
>
> With kind regards
> Dr. Michael Wolf
> Bezirksregierung Münster, Dezernat 61
> Domplatz 1-3, 48161 Münster
> Tel.: ++ 49 (02 51) / 4 11 - 17 95
> Fax.: ++ 49 (02 51) / 4 11 - 8 17 95
> E-Mail: [EMAIL PROTECTED]

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] can R do the goodman modified multiple regression method?
The method is described in the article: Goodman, Leo A., "A modified multiple regression approach to the analysis of dichotomous variables", American Sociological Review 33 (February): 28-46.

Thank you in advance :)
Re: [R] adding perspectives to existing persp plots
Hi Corey

Did you try to use par(new=TRUE) before plotting the second persp graph?

Cheers
Petr

On 15 Dec 2004 at 10:44, Corey Bradshaw wrote:

> I've created a perspective plot using 'persp' in the graphics package. I'd like to add a second plane of z values to the existing plot, but I cannot seem to do this using 'persp'. Is there an analogue to 'lines' or 'points' for perspectives?
>
> Corey. [EMAIL PROTECTED]

Petr Pikal
[EMAIL PROTECTED]
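The par(new=TRUE) suggestion can be sketched roughly as follows. The grid, the two surfaces and the colours are made up for illustration. Both persp() calls are assumed to need identical zlim, theta and phi so that the surfaces line up, and the second surface simply paints over the first (there is no hidden-surface removal between them). The last lines show trans3d(), which is probably the closest thing to 'lines' and 'points' for a persp plot.

```r
# Hypothetical grid and surfaces, for illustration only
x <- seq(-2, 2, length.out = 25)
y <- seq(-2, 2, length.out = 25)
z1 <- outer(x, y, function(a, b) exp(-(a^2 + b^2)))     # first surface
z2 <- matrix(0.25, nrow = length(x), ncol = length(y))  # second, flat plane

# Both calls share limits and viewing angles so the surfaces line up
zlim <- range(z1, z2)
pmat <- persp(x, y, z1, zlim = zlim, theta = 30, phi = 25, col = "lightblue")

par(new = TRUE)  # draw the next high-level plot on top of the current one
persp(x, y, z2, zlim = zlim, theta = 30, phi = 25,
      col = "salmon", axes = FALSE, box = FALSE)

# persp() returns the viewing transformation; trans3d() uses it to project
# 3D coordinates so that lines() and points() can annotate the plot
lines(trans3d(x, y = 0, z = 0.25, pmat = pmat), col = "red", lwd = 2)
points(trans3d(0, 0, 1, pmat = pmat), pch = 19)
```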
[R] question about surf.gls(spatial package) with NA/NaN/inf values
Hi all,

I need to fit a trend surface to a 3D dataset with the surf.gls() function, but my data contain missing values. There are just 2 NAs in a total of 360 data points, but I can't drop the variable associated with these NA values from the analysis. The problem is that surf.gls can't cope with NA/NaN/Inf values, so I would like to know whether a similar function in R can. Note that there are no replicates in the (preliminary test) data, so I can't easily estimate these values.

Thanks a lot for your help
Regards
BUHARD Olivier
[R] Best datatype for time-series with irregular occurrences
Hello,

Does anyone have experience working with time series where the occurrences do not happen at regular time intervals? On one day I can have 20 occurrences and on the following day twice as many. The ts datatype requires regular time intervals. What is the best way to store the data if I want to use forecasting methods such as Holt-Winters, NN or SVM? Should data and time go in different variables? If in one variable, should it be of the Date datatype?

Thank you for any help.

Joao
-
Joao Mendes Moreira
Faculdade de Engenharia da Universidade do Porto
DEMEGI / GEIN
[R] MIME decoding in Mozilla Thunderbird
Hello all!

I recently switched my mail program to Mozilla Thunderbird, running on W2k. Everything works fine except the MIME digests from this list: the decoding doesn't work properly. I was in contact with Martin Maechler some time ago, and he suggested asking on this list for ideas on how to do the decoding in Windows, even though it's not a proper R question.

Outlook does the proper job on the MIME, but using that is going a bit far! Or?

/CG

-- 
CG Pettersson, MSci. PhD.Stud.
Swedish University of Agricultural Sciences (SLU)
Dep. of Ecology and Crop production sciences (EVP).
http://www.slu.se/
[EMAIL PROTECTED]
Re: [R] Advice on parsing formulae
I think this will do what you want:

    # Need this function to remove spaces from term labels later on
    removeSpace <- function(string) gsub("[[:space:]]", "", string)

    # Specify which terms are in a tvar group
    # (could remove spaces separately)
    tvar <- unname(sapply(c("x:A", "z", "B", "poly(v,3)"), removeSpace))

    # Use terms to get term labels from formula
    formula <- Y ~ 1 + x:A + z + u + B + poly(v,3)
    term.labels <- unname(sapply(attr(terms(formula), "term.labels"), removeSpace))

    > tvar
    [1] "x:A"       "z"         "B"         "poly(v,3)"
    > term.labels
    [1] "z"         "u"         "B"         "poly(v,3)" "x:A"

    # Get assign variable for parameters
    # (You would use first two lines, but I don't have data so defined assign variable myself)
    #X <- model.matrix(formula)
    #pAssign <- attr(X, "assign")
    pAssign <- c(0,1,2,3,4,4,4,5,5)

    # Define tvarAssign
    tvarAssign <- match(pAssign, sort(match(tvar, term.labels)))
    tvarAssign[is.na(tvarAssign)] <- 0

    > tvarAssign
    [1] 0 1 0 2 3 3 3 4 4

HTH

Heather

Mrs H Turner
Research Assistant
Dept. of Statistics
University of Warwick

Claus Dethlefsen [EMAIL PROTECTED] 12/13/04 04:10pm

> Dear list
>
> I would like to be able to group terms in a formula using a function that I will call tvar(), e.g. the formula
>
>     Y ~ 1 + tvar(x:A) + tvar(z) + u + tvar(B) + tvar(poly(v,3))
>
> where x, u and v are numeric and A and B are factors (binary, say).
>
> As output, I want the model.matrix as if tvar had not been there at all. In addition, I would like to have information on the grouping, as a vector as long as ncol(model.matrix), with zeros corresponding to terms outside tvar and with an index grouping the terms inside each tvar().
>
> In the (sick) example:
>
>     model.matrix(Y ~ 1 + tvar(x:A) + tvar(z) + u + tvar(B) + tvar(poly(v,3)))
>
>       (Intercept)     z     u B2 poly(v, 3)1 poly(v, 3)2 poly(v, 3)3  x:A1 x:A2
>     1           1 -1.55 -1.03  0       0.160      -0.350      -0.281  0.66 0.00
>     2           1 -1.08  0.55  0      -0.164      -0.211       0.340  0.91 0.00
>     3           1  0.29 -0.26  0      -0.236      -0.073       0.311 -1.93 0.00
>     4           1 -1.11  0.96  0       0.222      -0.285      -0.385 -0.23 0.00
>     5           1  0.43 -0.76  1      -0.434       0.515      -0.532  0.22 0.00
>
> I would like the vector c(0,1,0,2,3,3,3,4,4) pointing to the tvar-grouped terms. So what I would like looks a bit like the 'assign' attribute of the model.matrix() output.
>
> I have not figured out a nice way of doing this and would like some help, please. I hope somebody can help me (or point me to the manual pages I should read).
>
> Best,
> Claus Dethlefsen
>
> ---
> Assistant Professor, Claus Dethlefsen, Ph.D.
> mailto:[EMAIL PROTECTED], http://www.math.auc.dk/~dethlef
> Dpt. of Mathematical Sciences, Aalborg University
> Fr. Bajers Vej 7G, 9220 Aalborg East
> Denmark
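For readers who want to run the idea end to end, here is a minimal sketch that wraps the steps above in a function and takes the 'assign' attribute from a real model matrix. The data are simulated and the helper name tvarGroups() is hypothetical (it is not part of any package); the grouped terms are passed as a character vector instead of being marked with tvar() in the formula.

```r
set.seed(1)  # hypothetical data, only to make the sketch runnable
n <- 20
d <- data.frame(
  Y = rnorm(n), x = rnorm(n), z = rnorm(n), u = rnorm(n), v = rnorm(n),
  A = factor(rep(1:2, length.out = n)),
  B = factor(rep(1:2, each = n / 2))
)

# tvarGroups() is a hypothetical helper name, not part of any package
tvarGroups <- function(formula, tvar, data) {
  squash <- function(s) gsub("[[:space:]]", "", s)       # drop all whitespace
  labels <- squash(attr(terms(formula), "term.labels"))  # terms in model order
  pAssign <- attr(model.matrix(formula, data), "assign") # term number per column
  grouped <- sort(match(squash(tvar), labels))           # term numbers in a group
  out <- match(pAssign, grouped)                         # group index per column
  out[is.na(out)] <- 0                                   # 0 = not grouped
  out
}

groups <- tvarGroups(Y ~ 1 + x:A + z + u + B + poly(v, 3),
                     tvar = c("x:A", "z", "B", "poly(v,3)"), data = d)
groups
```

With two-level factors A and B as assumed here, the result should match the vector c(0,1,0,2,3,3,3,4,4) requested in the question.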
[R] Massive clustering job?
Hi,

I have ~40,000 rows in a database, each of which contains an id column and 20 additional columns of count data. I want to cluster the rows based on these count vectors.

There are ~1.6 billion possible 'distances' between pairs of vectors (cells in my distance matrix), so I need to do something smart. Can R somehow handle this?

My first thought was to index the database with something that makes nearest neighbour lookup more efficient, and then use single linkage clustering. Is this kind of index implemented in R (by default when using single linkage)?

Also, 'grouping' identical vectors is very easy. I tried making the groups fuzzier by using a hashing function over the count vectors, but my hash was too crude. Is there any way to do fuzzy grouping in R that scales well?

For example, removing identical vectors gives me ~30,000 rows (and ~900 million pairs of distances). As an example of how fast I can group, the above query took 0.13 seconds in mysql (using an index over every element in the vector). However, if I tried to calculate a distance between every pair of non-identical vectors (let's say I can calculate ~1000 Euclidean distances per second), it would take me ~10 days just to calculate the distance matrix.

Sorry for all the information. Any suggestions on how to cluster such a huge dataset (using R) would be appreciated.

Cheers,
Dan.
Re: [R] Best datatype for time-series with irregular occurrences
jmoreira at fe.up.pt writes:

: Does someone has experience working with time-series where the occurrences do
: not happen at regular time spaces? In one day I can have 20 occurrences and in
: the following day I can have the double. ts datatype must have regular time
: spaces. What is the best way to put the data if I want to use forecasting
: methods as Holt-Winters, NN or SVM? To put data and time in different
: variables? If in one variable it would be of Date datatype?

There are several packages that can accommodate irregularly spaced time series. Regarding your specific requirements, zoo is the only one that supports the Date class as the time variable. Also, the upcoming version of zoo (not yet on CRAN) has support for e1071::svm and nnet::nnet.

There is a summary of the various irregular time series classes/packages here:

https://stat.ethz.ch/pipermail/r-sig-finance/2004q4/000210.html
Re: [R] Massive clustering job?
Dear Dan,

I would think about transforming your columns in such a way (square root, log?) that methods which operate on n*p matrices and assume roughly elliptical within-cluster distributions can be applied: kmeans or clara, or, after dimension reduction, EMclust or fixmahal. Maybe you can even do that on untransformed data (take a look at the variable-wise distributions or 2-d scatterplots).

You do not need a distance matrix then.

Christian

On Wed, 15 Dec 2004, Dan Bolser wrote:

> [...]

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
I recommend www.boag-online.de
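A rough sketch of this suggestion, assuming simulated Poisson counts in place of the real table and three clusters chosen arbitrarily. It uses kmeans() from stats and clara() from the recommended cluster package; both work directly on the n*p matrix, so no n*n distance matrix is ever formed.

```r
library(cluster)  # clara() lives in the recommended package 'cluster'

set.seed(42)  # simulated counts stand in for the real table
n <- 4000
p <- 20
grp <- sample(1:3, n, replace = TRUE)
lambda <- c(2, 10, 30)[grp]                  # one Poisson mean per row
counts <- matrix(rpois(n * p, lambda), nrow = n, ncol = p)

xs <- sqrt(counts)  # square-root transform, as suggested for count data

km <- kmeans(xs, centers = 3, nstart = 5)    # works on the n x p matrix
cl <- clara(xs, k = 3, samples = 20)         # PAM on subsamples of the rows

table(kmeans = km$cluster, clara = cl$clustering)
```

The number of clusters and the transform are the main judgment calls here; the scatterplots mentioned above are one way to sanity-check both.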
Re: [R] Best datatype for time-series with irregular occurrences
On Wed, 15 Dec 2004, Diethelm Wuertz wrote:

> [EMAIL PROTECTED] wrote:
>
> > Hello,
> >
> > Does someone has experience working with time-series where the occurrences do not happen at regular time spaces?
>
> package fBasics from Rmetrics (www.rmetrics.org) has S4 timeDate and timeSeries objects similar to those in SPlus for irregular time series manipulations and investigations.
>
> package its is another option

Just to be complete: in addition to

    its in package its
    timeSeries in fBasics

which are S4 classes, there are

    irts in package tseries
    zoo in package zoo

which are S3 classes for irregularly spaced observations. its is probably the most mature, timeSeries is (as Diethelm said) similar to the S-PLUS implementation, and zoo has the advantage that time information can be of (almost) arbitrary class. The development version of zoo is due for release next week or so; contact me off-list if you want the current version.

Best,
Z

> Regards
> Diethelm Wuertz
>
> > [...]
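As an illustration of the zoo option, here is a minimal sketch that assumes the CRAN package zoo is installed. The timestamps and values are invented. It stores the irregular events with a POSIXct index, then collapses them to one value per calendar day so that ts-based tools can be applied afterwards.

```r
library(zoo)  # assumes the CRAN package 'zoo' is installed

# Hypothetical event log: irregular timestamps, several per day
stamps <- as.POSIXct(c("2004-12-13 08:15:00", "2004-12-13 17:40:00",
                       "2004-12-14 09:05:00", "2004-12-14 09:50:00",
                       "2004-12-14 22:10:00", "2004-12-15 12:00:00"),
                     tz = "UTC")
values <- c(3, 5, 2, 4, 1, 6)

z <- zoo(values, order.by = stamps)  # irregular series indexed by time

# Collapse to one observation per calendar day (a Date index)
daily <- aggregate(z, as.Date(index(z)), sum)
daily

# The daily totals are regularly spaced, so they can be turned into a ts
daily_ts <- ts(coredata(daily))
```

Whether to aggregate by sum, mean or count depends on what is being forecast; sum is used here only as an example.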
[R] german umlaut problem under MacOS
I did not find this in the archive (hope it isn't there...):

the current release of R (2.0.1) for MacOS (10.3.6) seems not to handle German special characters like 'ü' correctly:

    f <- 'ü'

can be entered at the prompt, but echoing the variable yields

    [1] "\303\274"

(I think the unicode of the character), and inserting, for instance,

    text(1,2,f)

in some plot seems to insert two characters (Ã¼) (probably an interpretation of the first and second group of the unicode?).

I believe this is an R problem, or is there a simple configuration switch?

thanks
joerg
Re: [R] how to fit a weighted logistic regression?
Kerry Bush wrote:

> I tried lrm in library(Design) but there is always some error message. Is this function really doing the weighted logistic regression as maximizing the following likelihood:
>
>     \sum w_i*(y_i*\beta*x_i-log(1+exp(\beta*x_i)))
>
> Does anybody know a better way to fit this kind of model in R?
>
> FYI: one example of getting error message is like:
>
>     x=runif(10,0,3)
>     y=c(rep(0,5),rep(1,5))
>     w=rep(1/10,10)
>     fit=lrm(y~x,weights=w)
>     Warning message:
>     currently weights are ignored in model validation and bootstrapping lrm fits in: lrm(y ~ x, weights = w)
>
> although the model can be fit, the above output warning makes me uncomfortable. Can anybody explain about it a little bit?

The message means exactly what it says. Model validation in Design currently cannot incorporate weights for lrm. Everything else is OK.

Frank Harrell

> Best wishes,
> Feixia

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
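One way to check what a weighted fit is doing is to maximise the weighted log-likelihood from the question directly and compare the result with a standard fitting function. The sketch below does this with optim() and base R's glm() (not lrm), on made-up data and arbitrary weights, and assumes the linear predictor includes an intercept. quasibinomial is used only because binomial warns about non-integer weights; the coefficient estimates are the same.

```r
# Hypothetical data: the two outcome groups overlap, so the MLE is finite
x <- c(0.2, 0.5, 0.9, 1.3, 1.6, 1.1, 1.8, 2.2, 2.6, 2.9)
y <- rep(c(0, 1), each = 5)
w <- rep(c(0.5, 1.5), times = 5)  # arbitrary positive case weights

# Negative weighted log-likelihood with linear predictor b0 + b1 * x
negll <- function(b) {
  eta <- b[1] + b[2] * x
  -sum(w * (y * eta - log1p(exp(eta))))
}
direct <- optim(c(0, 0), negll, method = "BFGS")$par

# glm() with prior weights maximises the same weighted criterion
fit <- glm(y ~ x, family = quasibinomial, weights = w)

rbind(optim = direct, glm = unname(coef(fit)))
```

The two rows should agree to a few decimal places, which suggests that case weights in a logistic fit enter the likelihood in the way the question describes.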
Re: [R] MIME decoding in Mozilla Thunderbird
CG Pettersson wrote:

> [...]

Thunderbird, which is an otherwise wonderful mail client, does not work for r-help digests. The reason is that if you receive 100 messages in a day, Thunderbird handles all the MIME 'attachments' inefficiently, and navigating them all is incredibly slow. I didn't have the decoding problem you mentioned, though.

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
Re: [R] R: optimisation
There's also nlm for non-linear Newton-type minimization,

best,
ingmar

On 12/15/04 3:48 PM, Clark Allan [EMAIL PROTECTED] wrote:

> hi all
>
> other than optim, optimise, and some other related optimisation functions are there any optimisation packages in R?
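A minimal nlm() call, for orientation. The objective is a simple quadratic with a known minimum (the target vector is an arbitrary choice), so the result is easy to check by eye.

```r
# Sum of squared deviations from a target; the minimum is at p = target
target <- c(3, 5)
f <- function(p) sum((p - target)^2)

res <- nlm(f, p = c(10, 10))  # start away from the minimum

res$estimate  # location of the minimum found
res$minimum   # objective value there
res$code      # convergence code; see ?nlm for its meaning
```

Extra arguments to the objective can also be passed through nlm()'s `...`, which avoids relying on a variable in the enclosing environment as done here.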
RE: [R] can R do the goodman modified multiple regression method?
Dear ronggui,

If memory serves me right, this is a relatively early presentation for sociologists of loglinear models for contingency tables (in which all the variables are dichotomous). You can fit such a model in R as a generalised linear model using glm() or via the loglin() function. The latter, which works by iterative proportional fitting, will be more similar in approach to what's in the Goodman paper (but both produce ML estimates).

I hope that this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of ronggui
Sent: Tuesday, December 14, 2004 10:03 AM
To: R-help
Subject: [R] can R do the goodman modified multiple regression method?

The method is described in the article: Goodman, Leo A., "A modified multiple regression approach to the analysis of dichotomous variables", American Sociological Review 33 (February): 28-46.

Thank you in advance :)
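To make the two routes concrete, here is a sketch that fits the same loglinear model (all two-way interactions, no three-way interaction) to an invented 2 x 2 x 2 table, once with loglin() and once with glm(). The counts and the variable names A, B and C are hypothetical. The eps and iter settings tighten loglin()'s default convergence criterion so that the two fits can be compared closely.

```r
# Hypothetical 2 x 2 x 2 table of counts for three dichotomous variables
tab <- array(c(45, 20, 30, 25, 15, 35, 10, 40),
             dim = c(2, 2, 2),
             dimnames = list(A = c("no", "yes"),
                             B = c("no", "yes"),
                             C = c("no", "yes")))

# All two-way interactions, fitted by iterative proportional fitting
ipf <- loglin(tab, margin = list(c(1, 2), c(1, 3), c(2, 3)),
              fit = TRUE, eps = 1e-8, iter = 100, print = FALSE)

# The same model as a Poisson generalised linear model
counts_df <- as.data.frame(as.table(tab))
pois <- glm(Freq ~ (A + B + C)^2, family = poisson, data = counts_df)

c(loglin = ipf$lrt, glm = deviance(pois))     # likelihood-ratio statistics
c(loglin = ipf$df, glm = df.residual(pois))   # residual degrees of freedom
```

Both routes should give the same likelihood-ratio statistic and degrees of freedom, in line with the remark that both produce ML estimates.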
Re: [R] question about surf.gls(spatial package) with NA/NaN/inf values
On Wed, 15 Dec 2004, BUHARD O wrote:

> Hi all,
>
> I need to fit a trend surface to a 3D-dataset with the surf.gls() function but my data contain missing values. There is just 2 NA in a total of 360 data points but I can't suppress the variable associated with these NA values to the analysis. The problem is surf.gls can't manage with NA/NaN/inf values, so I wanted to know if a similar function in R can do it. I precise that there's no replicate in the (preliminary test) data so I can't estimate these values easily.

Either the z values are missing, and you can use the fit on the 358 points to predict them, or one or other of the point coordinates are missing, so you don't know where the affected z values were observed. You can fit once you omit just those incomplete observations, I think.

> Thanks a lot for help
> Regards
> BUHARD Olivier

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]
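The advice above might look like this in practice. This is only a sketch under several assumptions: the data are simulated on a regular 20 x 18 grid (360 points), only z has missing values, the surface is quadratic, and the exponential covariance range d = 2 is an arbitrary choice. predict() here evaluates the fitted trend surface at the two locations; it is not a kriging prediction.

```r
library(spatial)  # recommended package that provides surf.gls() and expcov()

set.seed(1)  # hypothetical data: 360 grid points, 2 of them with missing z
pts <- expand.grid(x = 1:20, y = 1:18)
pts$z <- 1 + 0.2 * pts$x - 0.1 * pts$y + rnorm(nrow(pts), sd = 0.1)
pts$z[c(10, 200)] <- NA

ok <- complete.cases(pts)  # TRUE for rows with no NA in x, y or z

# Quadratic trend surface with an exponential covariance; d = 2 is an
# arbitrary range parameter chosen for the sketch, not a recommendation
fit <- surf.gls(2, expcov, pts$x[ok], pts$y[ok], pts$z[ok], d = 2)

# Evaluate the fitted trend surface where z was missing
pred <- predict(fit, pts$x[!ok], pts$y[!ok])
pred
```

If a coordinate rather than z were missing, complete.cases() would still drop the row, but there would be no location at which to predict.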
Re: [R] R does not support UTF-8 (was german umlaut problem under MacOS)
Brian D Ripley wrote:

> You wrote your mail in UTF-8. R does not support UTF-8, and that is both documented and announced on startup in such a locale (at least on OSes with standard-conforming implementations):

Thanks for clarifying this point. Nevertheless:

1. The mail was (on purpose) sent in UTF-8 to correctly transport the output from the R command window (i.e. the GUI provided with the macOS port). It is _this_ GUI (sorry for not explaining this correctly in the first place) where the problem occurs. I'm not using UTF-8 (knowingly, at least). When starting the same binary from the command line in a terminal (where I generally use ISO Latin 1 encoding), it is perfectly possible to get the special characters into variables and into plots.

2. The OS is macOS 10.3, i.e. essentially a FreeBSD derivative, and hopefully conforms to the standards, but R on startup in the GUI gives only:

==== cut ====
R : Copyright 2004, The R Foundation for Statistical Computing
Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

R
==== cut ====

i.e. no announcement whatsoever concerning missing UTF-8 support, despite the fact that subsequent input is interpreted as UTF-8.

So this is probably more a question for the maintainers of the macOS port: _where_ did R (when started with the GUI) get the notion that it should interpret keyboard input as UTF-8? Can I change this (it's not in the preferences, for instance)?

> gannet% env LANG=en_GB.utf8 R
>
> R : Copyright 2004, The R Foundation for Statistical Computing
> Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0
> ...
> WARNING: UTF-8 locales are not currently supported
>
> Solution: do not use an unsupported locale.
>
> On Wed, 15 Dec 2004, joerg van den hoff wrote:
>
> > I did not find this in the archive (hope it isn't there...): the current release of R (2.0.1) for MacOS (10.3.6) seems not to handle german special characters like 'ü' correctly:
>
> I get two characters (Atilde quarter) here.
>
> > f <- 'ü'
> > can be entered at the prompt, but echoing the variable yields
>
> You mean printing the contents, I presume.

yes (shell speak).

> > [1] "\303\274"
> > (I think the unicode of the character) and inserting, for instance text(1,2,f) in some plot seems to insert two characters (Ã¼) (probably an interpretation of the first and second group of the unicode?). I believe, this is a R problem or is there a simple configuration switch?
> >
> > thanks
> > joerg

regards,
joerg
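The byte values in this thread can be reproduced in a current R session. The sketch below assumes a present-day R build with UTF-8 support (added in R 2.1.0, i.e. after the 2.0.1 release discussed here). It shows that 'ü' is two bytes in UTF-8, octal 303 and 274, and that reading those two bytes as Latin-1 gives two separate characters.

```r
u <- enc2utf8("\u00fc")  # LATIN SMALL LETTER U WITH DIAERESIS, stored as UTF-8

utf8_bytes <- charToRaw(u)
utf8_bytes                                   # c3 bc
format(as.octmode(as.integer(utf8_bytes)))   # "303" "274"

# The same character in Latin-1 is the single byte fc (octal 374)
latin1_bytes <- charToRaw(iconv(u, from = "UTF-8", to = "latin1"))
latin1_bytes

# Reading the two UTF-8 bytes as if they were Latin-1 yields two characters
misread <- iconv(rawToChar(utf8_bytes), from = "latin1", to = "UTF-8")
nchar(misread, type = "chars")               # 2
```

This matches both observations in the thread: the printed escape \303\274 is the UTF-8 byte pair, and the two characters seen in the plot (A-tilde and one-quarter) are those bytes interpreted one at a time.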
[R] backspace key doesn't work correctly
When I am running R, the backspace key deletes the whole word instead of one character. Before I start R, and after I exit R, the backspace key works as it should in my xterm terminal window. This gets really annoying since I make a lot of mistakes while typing and don't always want to retype the whole word. This behavior only occurs while running R. Any ideas on how to fix this?

Setup:
R version: Version 1.9.1 (2004-06-21), ISBN 3-900051-00-3
Terminal window: xterm
Shell: tcsh

Thanks for your help.

Dan
Re: [R] ERROR: installing package indices failed
Turns out that this problem is a propagation of a bug from a 'Depends' package: gdata doesn't import stats in its NAMESPACE file, although it makes use of objects defined in stats, like reorder and na.omit.

I'm curious as to why 'R CMD check' did not detect this in the source package, and instead failed my new package. Something for you mighty R developers to consider?

Thanks,
Sigal Blay

On Sun, Nov 21, 2004 at 05:19:46PM +0100, Uwe Ligges wrote:

> Sigal Blay wrote:
>
> > gregmisc is installed yet the problem persists. I installed gregmisc using
> >
> >     install.packages(c("combinat","gregmisc","genetics"),lib='/home/sblay/lib')
> >
> > (on the same library path where I am trying to install LDheatmap)
>
> Have you set the environment variable R_LIBS appropriately?
>
> Uwe Ligges
>
> >     > installed.packages(lib='/home/sblay/lib')
> >               Package     LibPath           Version Priority Bundle
> >     combinat  "combinat"  "/home/sblay/lib" "0.0-5" NA       NA
> >     gdata     "gdata"     "/home/sblay/lib" "2.0.0" NA       "gregmisc"
> >     genetics  "genetics"  "/home/sblay/lib" "1.1.1" NA       NA
> >     gmodels   "gmodels"   "/home/sblay/lib" "2.0.0" NA       "gregmisc"
> >     gplots    "gplots"    "/home/sblay/lib" "2.0.0" NA       "gregmisc"
> >     gtools    "gtools"    "/home/sblay/lib" "2.0.0" NA       "gregmisc"
> >     LDheatmap "LDheatmap" "/home/sblay/lib" "1.0"   NA       NA
> >     ...
> >
> > > > I am developing a package named LDheatmap. It depends on the genetics package and includes two data files and a demo file. When I'm trying to install it, I get the following messages:
> > > >
> > > >     * Installing *source* package 'LDheatmap' ...
> > > >     ** R
> > > >     ** data
> > > >     ** demo
> > > >     ** help
> > > >      Building/Updating help pages for package 'LDheatmap'
> > > >          Formats: text html latex example
> > > >       LDheatmap    text html latex example
> > > >       ldheatmap    text html latex example
> > > >     Error: object 'reorder' not found whilst loading namespace 'gdata'
> > > >     Error: package 'gdata' could not be loaded
> > > >     Execution halted
> > > >     ERROR: installing package indices failed
> > > >
> > > > Any ideas?
> > >
> > > Yes. You do not have gdata (part of gregmisc) installed, and genetics depends on it. How did you get genetics installed? A binary install?
> > >
> > > Install gregmisc
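For context, a package with a NAMESPACE file sees only base plus whatever it explicitly imports, which is why 'reorder' was not found. The sketch below checks that the two functions named above are exported by stats and prints the kind of directive that would make them visible. The directive shown is an illustration of the mechanism, not a copy of gdata's actual NAMESPACE file.

```r
# Functions the message above says gdata uses from 'stats'
needed <- c("reorder", "na.omit")

# Confirm that 'stats' really exports them
exported <- needed %in% getNamespaceExports("stats")
exported

# A NAMESPACE directive of this form would import just these functions;
# import(stats) would import everything that 'stats' exports
directive <- sprintf("importFrom(stats, %s)", paste(needed, collapse = ", "))
cat(directive, "\n")
```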
RE: [R] AUC for logistic regression [was: (no subject)]
My guess is `area under the ROC curve'. There's the roc package in BioConductor that I believe can compute this.

Andy

> From: Spencer Graves
>
> What's AUC? If you mean AIC (Akaike Information Criterion), and if you fit logistic regression using glm, the help file says that glm returns an object of class "glm", which is a list containing, among other things, a component aic. For example, suppose you fit a model as follows:
>
>     fit <- glm(y~x, family=binomial()...)
>
> Then fit$aic returns the AIC. You may also wish to consider anova and anova.glm.
>
> hope this helps.
> spencer graves
>
> [EMAIL PROTECTED] wrote:
>
> > Dear R-helper,
> >
> > I would like to compare the AUC of two logistic regression models (same population). Is it possible with R ?
> >
> > Thank you
> > Roman Rouzier
>
> -- 
> Spencer Graves, PhD, Senior Development Engineer
> O: (408)938-4420; mobile: (408)655-4567
Re: [R] AUC for logistic regression [was: (no subject)]
At 17:07 15/12/2004, Spencer Graves wrote:

> Dear R-helper,
>
> I would like to compare the AUC of two logistic regression models (same population). Is it possible with R ?
>
> Thank you
> Roman Rouzier

Roman,

If I understand your question, you have 2 ROC curves from the same dataset. In this case you can use a routine I created:

    seROC <- function(AUC, na, nn) {
      a  <- AUC
      q1 <- a / (2 - a)
      q2 <- (2 * a^2) / (1 + a)
      se <- sqrt((a * (1 - a) + (na - 1) * (q1 - a^2) + (nn - 1) * (q2 - a^2)) / (nn * na))
      se
    }

    cROC <- function(AUC1, na1, nn1, AUC2, na2, nn2, r) {
      se1 <- seROC(AUC1, na1, nn1)
      se2 <- seROC(AUC2, na2, nn2)
      sed <- sqrt(se1^2 + se2^2 - 2 * r * se1 * se2)
      zad <- (AUC1 - AUC2) / sed
      p   <- 2 * pnorm(-abs(zad))   # two-sided p-value from the standard normal
      a   <- list(zad, p)
      a
    }

The first function (seROC) calculates the standard error of the area under a ROC curve; the second function (cROC) compares two ROC curves.

The parameters:
AUC - area under the curve
na - number of positive results
nn - number of negative results
r - correlation of two numeric variables

Best wishes

Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro, Brazil
Re: [R] AUC for logistic regression [was: (no subject)]
Joe Nocera wrote:

> I believe that Roman is referring to AUC as the Area Under Curve from a Receiver Operating Characteristic. If this is indeed your quantity of interest, it can be calculated in R. You can download code at:
>
> http://www.bioconductor.org/repository/release1.5/package/Win32/
>
> and/or
>
> http://biostat.ku.dk/~bxc/SPE/library/
>
> Check out the archives. I'm sure there is more there if you search for ROC instead.
>
> Cheers,
> Joe
>
> Quoting Spencer Graves [EMAIL PROTECTED]:
>
> > [...]
>
> Joseph J. Nocera

AUC is standard output in the lrm function in the Design package (the C index). validate.lrm computes the overfitting-corrected C index.

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
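For readers without the packages mentioned in this thread, the AUC (C index) of a fitted logistic model can also be computed in base R from the ranks of the fitted probabilities, using the Mann-Whitney identity. The data below are invented and the helper name auc() is hypothetical; this is a sketch of the calculation, not the code used by lrm or the roc package.

```r
# Hypothetical data for a single-predictor logistic regression
x <- c(0.2, 0.5, 0.9, 1.3, 1.6, 1.1, 1.8, 2.2, 2.6, 2.9)
y <- rep(c(0, 1), each = 5)

fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)  # predicted probabilities

# auc() is a hypothetical helper: AUC via the Mann-Whitney rank-sum identity
auc <- function(score, outcome) {
  n1 <- sum(outcome == 1)   # number of positives
  n0 <- sum(outcome == 0)   # number of negatives
  r <- rank(score)          # midranks, so ties count as one half
  (sum(r[outcome == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_fit <- auc(p, y)
auc_fit  # 0.92 for these data: 23 of the 25 positive/negative pairs are ordered correctly
```

Applying auc() to the fitted values of two different models on the same data gives the two areas to be compared.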
[R] which
Why doesn't the last which in the following example work? Is there a simple way to identify the indices of array elements that meet multiple criteria? Thanks. YC Tao

  > x <- 1:10
  > which(x < 5)
  [1] 1 2 3 4
  > which(x > 2)
  [1] 3 4 5 6 7 8 9 10
  > which(x < 5 && x > 2)
  numeric(0)
Re: [R] which
  > which(x < 5 & x > 2)
  [1] 3 4

From the help page for Logical Operators (?"!"):

  '&' and '&&' indicate logical AND and '|' and '||' indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in 'if' clauses.

Y. C. Tao wrote:

  Why doesn't the last which in the following example work? Is there a simple way to identify the indices of array elements that meet multiple criteria? Thanks. YC Tao

  > x <- 1:10
  > which(x < 5)
  [1] 1 2 3 4
  > which(x > 2)
  [1] 3 4 5 6 7 8 9 10
  > which(x < 5 && x > 2)
  numeric(0)

-- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894
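The difference between the two operator forms can be sketched as follows. This is only an illustration; the variable names both and idx are arbitrary. Note that in R as of this thread '&&' silently used the first elements only, whereas recent R versions signal an error when '&&' is given vectors longer than one, so that call is not executed here.

```r
x <- 1:10

# Elementwise AND: one TRUE/FALSE per element, which is what which() needs.
both <- x < 5 & x > 2
idx  <- which(both)
print(idx)

# '&&' is meant for a single condition, for example inside if():
if (length(x) > 0 && x[1] == 1) print("x starts at 1")
```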
[R] TukeyHSD Covariates
Dear R gurus,

I have the following model:

  appcov.aov <- aov(yield ~ prevyield + trt + block)

where prevyield is a continuous numeric covariate and trt and block are factors (yes, I did factor()!). Now, when I do a TukeyHSD, my diffs are all screwed up! For instance, the treatment mean for treatment E is 277.25 and for treatment O is 279.5, so I figure the diff O-E should be 2.25, but TukeyHSD says:

            diff         lwr        upr
  O-E -50.817101 -84.8112057 -16.822996

So I wonder where that -50.8 is coming from. Anybody have a clue? Thanks a lot!

PS: it works if I take prevyield (the covariate) out of the model, but the point is I need to analyse it with the covariate. Thanks again
Re: [R] TukeyHSD Covariates
Here's what I get:

  > summary(lm(yield ~ prevyield + trt + block))

  Call:
  lm(formula = yield ~ prevyield + trt + block)

  Residuals:
      Min      1Q  Median      3Q     Max
  -22.616  -9.254   2.051  10.687  19.421

  Coefficients:
              Estimate Std. Error t value Pr(>|t|)
  (Intercept)  38.5797    26.5447   1.453  0.16816
  prevyield    28.4010     3.3840   8.393  7.8e-07 ***
  trtB        -13.9099    11.7844  -1.180  0.25752
  trtC         -6.4099    11.7844  -0.544  0.59505
  trtD          0.6605    11.9128   0.055  0.95657
  trtE         20.4409    12.2329   1.671  0.11692
  trtO        -29.1408    12.1256  -2.403  0.03068 *
  block2        1.6396     9.7127   0.169  0.86836
  block3      -22.6886    11.6961  -1.940  0.07282 .
  block4       44.7776    12.7351   3.516  0.00342 **
  ---
  Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

  Residual standard error: 16.66 on 14 degrees of freedom
  Multiple R-Squared: 0.9461, Adjusted R-squared: 0.9114
  F-statistic: 27.29 on 9 and 14 DF, p-value: 2.276e-07

What does R consider balanced anyway? I've had data with the same number of observations per treatment and R complains about it being unbalanced... Yeah, the covariate is in bushels and the yield is in pounds, but I don't get why the means of the models with and without the covariate would change. The SEs are another story, but the means? Thanks

Peter Dalgaard wrote:

  Damián Cirelli [EMAIL PROTECTED] writes:

    Dear R gurus, I have the following model:

      appcov.aov <- aov(yield ~ prevyield + trt + block)

    where prevyield is a continuous numeric covariate and trt and block are factors (yes, I did factor()!). Now, when I do a TukeyHSD, my diffs are all screwed up! For instance, the treatment mean for treatment E is 277.25 and for treatment O is 279.5, so I figure the diff O-E should be 2.25, but TukeyHSD says:

                diff         lwr        upr
      O-E -50.817101 -84.8112057 -16.822996

    So I wonder where that -50.8 is coming from. Anybody have a clue? Thanks a lot! PS: it works if I take prevyield (the covariate) out of the model, but the point is I need to analyse it with the covariate. Thanks again

  If the covariate level differs between the treatment groups, then the difference in the covariate-adjusted means could well differ quite a bit from the unadjusted difference. What happens if you do

    summary(lm(yield ~ prevyield + trt + block))

  (Not sure I'm happy about using the HSD procedure with an unbalanced design, btw.)
[R] how R outputs?
Hi all and R developers,

While looking into the R source code, I have a question. Since R has its own data structure (i.e. SEXP), how does it convert the result to ordinary output after the computation? For example, when I input

  abs(-3)

I understand that during execution the expression is parsed into a parse tree and becomes a SEXP list. After the eval function, the result is still a SEXP. But R outputs:

  [1] 3

which is ordinary text. So my question is how R turns its SEXP result into that printed output. Where in R's source code does this conversion happen? Can anyone help me? Thanks, dongyuan xu
Re: [R] Best datatype for time-series with irregular ocurrencies
[EMAIL PROTECTED] wrote:

  Hello, does someone have experience working with time series where the occurrences do not happen at regular time intervals?

Package fBasics from Rmetrics (www.rmetrics.org) has S4 timeDate and timeSeries objects, similar to those in S-Plus, for manipulating and investigating irregular time series. Package its is another option. Regards, Diethelm Wuertz

  In one day I can have 20 occurrences and on the following day I can have double that. The ts datatype must have regular time spacing. What is the best way to store the data if I want to use forecasting methods such as Holt-Winters, NN or SVM? Should I put date and time in different variables? If in one variable, should it be of the Date datatype? Thank you for any help. Joao

  Joao Mendes Moreira, Faculdade de Engenharia da Universidade do Porto, DEMEGI / GEIN
[R] repeated measures with Poisson
I'm pretty new to R and I have a stats problem that's got me baffled. I'm trying to carry out a repeated measures test on Poisson distributed data, where half the subjects in each block (4 blocks) were treated, and half were controls. Measurements were carried out before and after the treatment. There was another 2-level factor included. The trouble is I can't take averages, and have to include the identity of each subject in each block. Can anyone help? Andy Higginson
RE: [R] Massive clustering job?
It sounds like clara in package cluster might help. Regards, Matt Wiener

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dan Bolser Sent: Wednesday, December 15, 2004 6:37 AM To: R mailing list Subject: [R] Massive clustering job?

  Hi, I have ~40,000 rows in a database, each of which contains an id column and 20 additional columns of count data. I want to cluster the rows based on these count vectors. There are ~1.6 billion possible 'distances' between pairs of vectors (cells in my distance matrix), so I need to do something smart. Can R somehow handle this?

  My first thought was to index the database with something that makes nearest neighbour lookup more efficient, and then use single linkage clustering. Is this kind of index implemented in R (by default when using single linkage)?

  Also, 'grouping' identical vectors is very easy. I tried making groups more fuzzy by using a hashing function over the count vectors, but my hash was too crude. Is there any way to do fuzzy grouping in R which scales well? For example, removing identical vectors gives me ~30,000 rows (and ~900 million pairs of distances). As an example of how fast I can group, the above query took 0.13 seconds in mysql (using an index over every element in the vector). However, if I tried to calculate a distance between every pair of non-identical vectors (let's say I can calculate ~1000 Euclidean distances per second) it would take me ~10 days just to calculate the distance matrix.

  Sorry for all the information. Any suggestions on how to cluster such a huge dataset (using R) would be appreciated. Cheers, Dan.
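A minimal sketch of the clara suggestion, assuming the recommended package cluster is installed. The data are simulated Poisson counts standing in for the 40,000 x 20 count table described above, and k = 5 is an arbitrary choice made only for illustration. clara works on subsamples and never forms the full pairwise distance matrix.

```r
library(cluster)

set.seed(42)
# Simulated stand-in for the real table: 40,000 rows of 20 counts each.
counts <- matrix(rpois(40000 * 20, lambda = 5), ncol = 20)

# k = 5 clusters is an assumption for this sketch; clara draws 'samples'
# subsamples, runs a medoid search on each, and keeps the best medoids.
fit <- clara(counts, k = 5, samples = 10)

print(table(fit$clustering))   # cluster sizes
print(dim(fit$medoids))        # one medoid (row of counts) per cluster
```

In practice the number of clusters would have to be chosen from the data, for example by comparing a few values of k.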
Re: [R] RE: adding perspectives to existing persp plots
On Wednesday 15 December 2004 01:10, Corey Bradshaw wrote:

  Thanks, Romain. I've certainly used that to draw lines and points in the plots produced by 'persp'; however, my problem is that I need to incorporate an entirely new z function (not just a plane) onto the same plot (i.e., using the same x and y values).

If the surfaces are non-intersecting, you might be able to use 'wireframe' from the lattice package. See the second example in ?wireframe. Deepayan
[R] R: optimisation
Hi all, other than optim, optimise, and some other related optimisation functions, are there any optimisation packages in R?
Re: [R] R does not support UTF-8 (was german umlaut problem under MacOS)
You wrote your mail in UTF-8. R does not support UTF-8, and that is both documented and announced on startup in such a locale (at least on OSes with standard-conforming implementations):

  gannet% env LANG=en_GB.utf8 R

  R : Copyright 2004, The R Foundation for Statistical Computing
  Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0
  ...
  WARNING: UTF-8 locales are not currently supported

Solution: do not use an unsupported locale.

On Wed, 15 Dec 2004, joerg van den hoff wrote:

  I did not find this in the archive (hope it isn't there...): the current release of R (2.0.1) for MacOS (10.3.6) seems not to handle German special characters like 'ü' correctly:

I get two characters (Atilde quarter) here.

  f <- 'ü' can be entered at the prompt, but echoing the variable yields

You mean printing the contents, I presume.

  [1] "\303\274"

  (I think the unicode of the character) and inserting, for instance, text(1,2,f) in some plot seems to insert two characters (√º) (probably an interpretation of the first and second group of the unicode?). I believe this is an R problem, or is there a simple configuration switch? thanks joerg

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] MIME decoding in Mozilla Thunderbird
"CG" == CG Pettersson [EMAIL PROTECTED] on Wed, 15 Dec 2004 16:43:38 +0100 writes:

  CG> Ok, the time consumption could be a problem, I find it quite okay though.
  CG> My problem is to reach the actual messages.
  CG> If I use the option File/Attachments/Open, the window that opens just contains a list of all headers from the last 24 hours (from which I just chose one, as I thought).
  CG> If I on the other hand use File/Attachments/Save, I can save the chosen message as an .eml file. Why can't I open it?
  CG> I realise this question should be on the Thunderbird list, but I have a feeling the coding in the MIME digests of this list is unusually advanced.

I don't think so. AFAIK there are thousands of mailman-operated mailing lists out there, and all of them use the same MIME digestification. Consequently, I now think that you could pose this question / problem on the 'mailman-users' mailing list as well and may get other suggestions. (When you do, please indicate that we use mailman 2.1.5, which is the latest released version.) Martin

  CG> Frank E Harrell Jr wrote:

    CG Pettersson wrote:

      Hello all! I recently switched mail program to Mozilla Thunderbird, running on W2k. Everything works fine, except the MIME digests from this list. The decoding doesn't work properly. I had contact with Martin Maechler some time ago, and he suggested a try on this list for ideas on how to do the decoding in Windows, even if it's not a proper R question. Outlook does the proper job on the MIME, but using that is to go a bit far! Or? /CG

    Thunderbird, which is an otherwise wonderful mail client, does not work for r-help digests. The reason is that if you receive 100 messages in a day, Thunderbird inefficiently handles all the MIME 'attachments', and navigating them all is incredibly slow. I didn't have the decoding problem you mentioned though.
Re: [R] R: optimisation
You don't say what sort of optimisation you have in mind. If you are looking for something that will handle non-standard problems, you can have a look at 'genopt' from S Poetry.

Patrick Burns, Burns Statistics [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User)

Clark Allan wrote:

  Hi all, other than optim, optimise, and some other related optimisation functions, are there any optimisation packages in R?
Re: [R] how to fit a weighted logistic regression?
I was going to say ``Why not just use glm()?'', but when I tried the example given in the original message I got a different but similarly nervous-making warning:

  Warning in eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

Looking into the code I found that the warning originates in binomial()$initialize in the lines:

  m <- weights * y
  if (any(abs(m - round(m)) > 0.001))
      warning("non-integer #successes in a binomial glm!")

I also noticed that if y is given as a two column matrix (successes and failures) then the check for non-integer values in y gets done without multiplying anything by the weights, and so y passes the check and no warning is issued. I.e.

  f1 <- glm(y ~ x, weights = w, family = binomial)

causes a warning, but

  f2 <- glm(cbind(y, 1 - y) ~ x, weights = w, family = binomial)

does not. The fits f1 and f2 appear to be the same, although they differ in the number of iterations, and on the order of 1e-8 in the coefficients and the scaled and unscaled covariance. So is that warning which arises in the ``f1'' case actually appropriate?

cheers, Rolf Turner [EMAIL PROTECTED]

===+===+===+===
Original message:

  I tried lrm in library(Design) but there is always some error message. Is this function really doing the weighted logistic regression by maximizing the following likelihood:

    \sum w_i*(y_i*\beta*x_i-log(1+exp(\beta*x_i)))

  Does anybody know a better way to fit this kind of model in R? FYI: one example of getting the error message is:

    x=runif(10,0,3)
    y=c(rep(0,5),rep(1,5))
    w=rep(1/10,10)
    fit=lrm(y~x,weights=w)

    Warning message: currently weights are ignored in model validation and bootstrapping lrm fits in: lrm(y ~ x, weights = w)

  Although the model can be fit, the above warning makes me uncomfortable. Can anybody explain it a little bit? Best wishes, Feixia
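The comparison of the two glm() calls can be reproduced with a small self-contained sketch. The x and y values below are made up for illustration (chosen so that the classes overlap and the fit converges); only the weights of 1/10 follow the original example.

```r
# Made-up data: 0/1 response with overlapping x ranges, so no separation.
x <- c(0.2, 1.1, 2.5, 0.7, 1.9, 2.8, 0.4, 1.5, 2.2, 0.9)
y <- c(0,   0,   1,   0,   1,   1,   1,   0,   1,   0)
w <- rep(1 / 10, 10)

# Vector response: weights * y is non-integer, so a warning is raised.
# The warning text is captured here instead of being printed.
msg1 <- tryCatch({
  glm(y ~ x, weights = w, family = binomial)
  NA_character_
}, warning = function(cond) conditionMessage(cond))
print(msg1)

# Refit with the warning silenced, and fit the two-column form.
f1 <- suppressWarnings(glm(y ~ x, weights = w, family = binomial))
f2 <- glm(cbind(y, 1 - y) ~ x, weights = w, family = binomial)

# Both parameterisations should give essentially the same coefficients.
print(rbind(f1 = coef(f1), f2 = coef(f2)))
print(all.equal(coef(f1), coef(f2), tolerance = 1e-4))
```

Scaling all weights by a constant leaves the maximiser of the weighted likelihood unchanged, but it does change the reported standard errors, so the weights should carry the meaning the analysis intends.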
[R] (no subject)
Dear R-helper, I would like to compare the AUC of two logistic regression models (same population). Is it possible with R? Thank you. Roman Rouzier
Re: [R] AUC for logistic regression [was: (no subject)]
What's AUC? If you mean AIC (Akaike Information Criterion), and if you fit the logistic regression using glm, the help file says that glm returns an object of class "glm", which is a list containing, among other things, a component "aic". For example, suppose you fit a model as follows:

  fit <- glm(y ~ x, family = binomial(), ...)

Then fit$aic returns the AIC. You may also wish to consider anova and anova.glm. Hope this helps. Spencer Graves

[EMAIL PROTECTED] wrote:

  Dear R-helper, I would like to compare the AUC of two logistic regression models (same population). Is it possible with R? Thank you. Roman Rouzier

-- Spencer Graves, PhD, Senior Development Engineer O: (408)938-4420; mobile: (408)655-4567
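As a small illustration of the AIC route described above, the sketch below fits two nested logistic models to simulated data and reads the AIC both from the fitted object and through the generic AIC() function. The data and the names fit1 and fit2 are assumptions made for the example.

```r
# Illustrative data only.
set.seed(2004)
x1 <- rnorm(150)
x2 <- rnorm(150)
y  <- rbinom(150, 1, plogis(0.3 + 0.9 * x1))

fit1 <- glm(y ~ x1,      family = binomial())
fit2 <- glm(y ~ x1 + x2, family = binomial())

# The "aic" component of the glm object and the AIC() generic agree.
print(c(component = fit1$aic, generic = AIC(fit1)))

# Side-by-side comparison, plus a likelihood-ratio test of the extra term.
print(AIC(fit1, fit2))
print(anova(fit1, fit2, test = "Chisq"))
```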
[R] using Hmisc and Design library
Hi all,

I encountered a weird problem when using the Design and Hmisc libraries in S-Plus (it worked well in R). I have a data frame called b, which has 3 columns: time, status and scores. A sample of the data frame looks like:

  data frame b:
    time status scores
  1   27      0 -126.7
  2   24      0 -135.6
  3   30      0 -139.5
  4   49      0 -137.6
  5   27      0 -136.9

When I ran the following script using this data frame, even though no error message was produced, no fit object was generated:

  library(Hmisc,T); library(Design,T)
  dd <- datadist(b)
  options(datadist='dd')
  fit <- cph(Surv(time,status) ~ scores, data=b, surv=T, x=T, y=T)
  fit
  Problem: Object fit not found, while calling subroutine S_agsurv2
  Use traceback() to see the call stack

Actually data frame b has 177 rows. The script ran OK on the first 166 rows as a subset, but started to break down if a subset of the first 177 rows was used as the input, or the first 166 rows plus the 168th row. The data in those rows of b are:

      time status scores
  165  172      0 -123.3
  166  105      0 -138.4
  167  166      0 -128.8
  168  140      0 -114.2
  169  163      0 -117.0
  170  141      0 -115.8

Additionally, even if I only ran the script on the first 166 rows, I still can't generate a plot:

  dd <- datadist(bbb[1:166,])
  options(datadist='dd')
  fit <- cph(Surv(time,status) ~ scores, data=bbb[1:166,], surv=T, x=T, y=T)
  fit

  Cox Proportional Hazards Model

  cph(formula = Surv(time, status) ~ scores, data = bbb[1:166, ], x = T, y = T, surv = T)

   Obs Events Model L.R. d.f. P Score Score P    R2
   166     36      29.08    1 0 35.37       0 0.182

          coef se(coef)    z         p
  scores 0.102   0.0172 5.94 2.91e-009

  plot(fit, scores=seq(-140, -100, by=1), time=36, fun=function(x) 1-x, xlim=c(-140, -100), ylim=c(0,1), lwd=3, xlab='Scores', ylab='Probability at 3 Years')

There is no error message, but only a blank graph window is produced. Can anyone please tell me why this happens only in S-Plus, but not in R? No missing value is present in either data frame. The data b is attached in case you need to run the script. Thanks very much!
__ row.names timestatus scores 1 27 0 -126.706775142792 2 24 0 -135.646938397462 3 30 0 -139.508762843248 4 49 0 -137.551209881948 5 27 0 -136.868192376342 6 41 0 -135.331913852558 7 31 0 -131.954779047339 8 27 0 -131.524681661997 9 37 0 -131.786118789511 10 38 1 -98.9746000656833 11 29 1 -105.533478698214 12 31 1 -112.317311438571 13 49 1 -106.60935095037 14 51 0 -127.040652771522 15 53 0 -125.818847128221 16 76 0 -122.973527977967 17 77 0 -129.84742262956 18 41 0 -126.759206020033 19 62 0 -124.688165345946 20 52 0 -121.724878244943 21 83 0 -127.065353492987 22 10 1 -105.019394502807 23 60 0 -126.590791227224 24 56 0 -128.398971375234 25 43 0 -122.109019588711 26 17 1 -98.1976727902452 27 107 0 -126.545160192436 28 21 1 -121.472686960897 29 38 0 -120.93978698714 30 51 0 -117.109857031147 31 48 0 -133.322549387909 32 78 0 -124.877928079306 33 55 0 -128.908553897764 34 53 0 -130.185734686138 35 10 1 -115.375584653794 36 5 1 -116.131928609306 37 21 1 -116.453942436335 38 24 1 -123.020079653056 39 24 1 -131.311964779875 40 24 1 -117.748209513549 41 37 0 -127.985941925825 42 53 0 -124.534919208761 43 54 1 -138.345612101095 44 35 1 -121.365208594489 45 60 0 -133.593779020529 46 69 0 -130.449420655907 47 74 0 -135.183874932554 48 48 0 -139.167445404491 49 54 0 -130.900766393426 50 55 0 -115.834160342219 51 69 0 -124.397447684606 52 17 1 -115.232780851337 53 24 1 -123.652687449155 54 24 1 -113.79362358954 55 14 1 -125.077495747215 56 15 1 -123.39550171375 57 6 1 -120.703960854828 58 27 1 -122.593559899788 59 19 1 -115.508773940805 60 35 1 -131.602555921887 61 68 1 -115.631506755249 62 42 1 -128.315195343203 63 47 1 -123.511731764928 64 13 1 -127.655777153564 65 25 1 -123.503248471324 66 15 1 -121.266145049005 67 44 1
RE: [R] AUC for logistic regression [was: (no subject)]
AUC means Area Under the Curve and is a common summary statistic for repeated measures experiments (e.g. repeated measurements of serum concentration of a drug in an individual) in PK/PD (pharmacokinetic/pharmacodynamic) studies. I think the poster may actually mean nonlinear regression for logistic models rather than logistic regression, which, of course, has a different statistical meaning. Hence nonlinear regression modeling for repeated measures is probably the issue here, for which nlme() is appropriate, I think. Alternatively, and perhaps somewhat less statistically desirable (though perhaps fairly standard in PK/PD modeling), one can use nls() or perhaps nlsList() to fit each individual's curve and compute the AUCs. So in any case, the answer appears to be yes, you can use R, perhaps with add-ons, to do what you want.

-- Bert Gunter, Genentech Non-Clinical Statistics, South San Francisco, CA
"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves Sent: Wednesday, December 15, 2004 11:07 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [R] AUC for logistic regression [was: (no subject)]

  What's AUC? If you mean AIC (Akaike Information Criterion), and if you fit the logistic regression using glm, the help file says that glm returns an object of class "glm", which is a list containing, among other things, a component "aic". For example, suppose you fit a model as follows:

    fit <- glm(y ~ x, family = binomial(), ...)

  Then fit$aic returns the AIC. You may also wish to consider anova and anova.glm. Hope this helps. Spencer Graves

  [EMAIL PROTECTED] wrote:

    Dear R-helper, I would like to compare the AUC of two logistic regression models (same population). Is it possible with R? Thank you. Roman Rouzier
Re: [R] AUC for logistic regression [was: (no subject)]
I believe that Roman is referring to AUC as the Area Under the Curve from a Receiver Operating Characteristic. If this is indeed your quantity of interest, it can be calculated in R. You can download code at: http://www.bioconductor.org/repository/release1.5/package/Win32/ and/or http://biostat.ku.dk/~bxc/SPE/library/ Check out the archives; I'm sure there is more there if you search for ROC instead. Cheers, Joe

Quoting Spencer Graves [EMAIL PROTECTED]:

  What's AUC? If you mean AIC (Akaike Information Criterion), and if you fit the logistic regression using glm, the help file says that glm returns an object of class "glm", which is a list containing, among other things, a component "aic". For example, suppose you fit a model as follows:

    fit <- glm(y ~ x, family = binomial(), ...)

  Then fit$aic returns the AIC. You may also wish to consider anova and anova.glm. Hope this helps. Spencer Graves

  [EMAIL PROTECTED] wrote:

    Dear R-helper, I would like to compare the AUC of two logistic regression models (same population). Is it possible with R? Thank you. Roman Rouzier

Joseph J. Nocera, Ph.D. Candidate, NB Coop. Fish & Wildlife Research Unit, Biology Department - Univ. New Brunswick, Fredericton, NB Canada E3B 6E1 tel: (902) 679-5733
"Why does it have to be spiders? Why can't it be 'follow the butterflies'?!" Ron Weasley, Harry Potter & The Chamber of Secrets
[R] main() in libR?
libR seems to include a main() function. Should it? I'm hoping this simpler question will get more of a response than my previous post about R a la carte :)

The evidence that libR includes main() is two-fold. First, when I use nm, I see _main defined. Second, when I run a program linked to libR, the main R shell takes over, even though I have my own main (in another library).

Looking at the sources, it seems the only possible contributor of main() is src/main/Rmain.c. I wouldn't expect this in the library, and the documentation of the library (in R Extensions) says you must supply your own main. I have tried to follow through the Makefiles, but the spot where the library is built has not jumped out at me. I've encountered this with a number of generations of R on a number of platforms (Debian Linux and OS X). Thanks.
RE: [R] which
Because you have one too many `&'. Andy

From: Y. C. Tao

  Why doesn't the last which in the following example work? Is there a simple way to identify the indices of array elements that meet multiple criteria? Thanks. YC Tao

  > x <- 1:10
  > which(x < 5)
  [1] 1 2 3 4
  > which(x > 2)
  [1] 3 4 5 6 7 8 9 10
  > which(x < 5 && x > 2)
  numeric(0)
[R] hclust and heatmap - slightly different dendrograms?
Good afternoon,

I ran heatmap and hclust on the same matrix x (strictly, I ran heatmap(x) and hclust(dist(t(x)))), and realized that the two dendrograms were slightly different, in that the left-right arrangement of one pair of subclusters (columns) was reversed in the two functions (but all individual columns were grouped correctly). Looking through the code for heatmap as a most definite nonexpert, it seems to me that hclust is also invoked by heatmap.

  > heatmap
  function (x, Rowv = NULL, Colv = if (symm) Rowv else NULL, distfun = dist, hclustfun = hclust, add.expr, symm = FALSE, ...
      hcr <- hclustfun(distfun(x))
      ddr <- as.dendrogram(hcr)
      hcc <- hclustfun(distfun(if (symm) x else t(x)))
      ddc <- as.dendrogram(hcc)

I understand it is possible to add Rowv=NA and order the samples as per hclust, but I'm just wondering if there is a reason for this observation. Any pointers would be very much appreciated. Thanks! Min-Han Tan
Re: [R] hclust and heatmap - slightly different dendrograms?
Hierarchical clustering does NOT give an ordering for the clusters. It only gives the clustering, so the left-right order of branches is not unique. Sean

- Original Message - From: Min-Han Tan [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, December 15, 2004 3:55 PM Subject: [R] hclust and heatmap - slightly different dendrograms?

  Good afternoon, I ran heatmap and hclust on the same matrix x (strictly, I ran heatmap(x) and hclust(dist(t(x)))), and realized that the two dendrograms were slightly different, in that the left-right arrangement of one pair of subclusters (columns) was reversed in the two functions (but all individual columns were grouped correctly). Looking through the code for heatmap as a most definite nonexpert, it seems to me that hclust is also invoked by heatmap.

    > heatmap
    function (x, Rowv = NULL, Colv = if (symm) Rowv else NULL, distfun = dist, hclustfun = hclust, add.expr, symm = FALSE, ...
        hcr <- hclustfun(distfun(x))
        ddr <- as.dendrogram(hcr)
        hcc <- hclustfun(distfun(if (symm) x else t(x)))
        ddc <- as.dendrogram(hcc)

  I understand it is possible to add Rowv=NA and order the samples as per hclust, but I'm just wondering if there is a reason for this observation. Any pointers would be very much appreciated. Thanks! Min-Han Tan
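The point that the same clustering admits several leaf orders can be sketched with base R alone. The example assumes that heatmap, as in current versions of R, reorders the column dendrogram by column means through reorder(); the matrix and the helper coph are made up for illustration.

```r
set.seed(7)
# Made-up data: 10 rows, 6 named columns ("samples").
x <- matrix(rnorm(60), nrow = 10,
            dimnames = list(NULL, paste0("s", 1:6)))

hc  <- hclust(dist(t(x)))            # cluster the columns
dd  <- as.dendrogram(hc)             # leaf order as produced by hclust
dd2 <- reorder(dd, colMeans(x))      # branches flipped according to column means

ord_hclust    <- order.dendrogram(dd)
ord_reordered <- order.dendrogram(dd2)
print(rbind(hclust = ord_hclust, reordered = ord_reordered))

# Hypothetical helper: cophenetic distances with rows/columns sorted by label,
# so the two trees can be compared independently of leaf order.
coph <- function(d) {
  m <- as.matrix(cophenetic(d))
  m[sort(rownames(m)), sort(colnames(m))]
}
print(all.equal(coph(dd), coph(dd2)))  # same tree, possibly different drawing order
```

Flipping the two branches at any merge changes the picture but not the merge heights or the cluster memberships, which is why the two plots can differ in left-right arrangement only.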
Re: [R] TukeyHSD Covariates
Damián Cirelli [EMAIL PROTECTED] writes:

  Dear R gurus, I have the following model:

    appcov.aov <- aov(yield ~ prevyield + trt + block)

  where prevyield is a continuous numeric covariate and trt and block are factors (yes, I did factor()!). Now, when I do a TukeyHSD, my diffs are all screwed up! For instance, the treatment mean for treatment E is 277.25 and for treatment O is 279.5, so I figure the diff O-E should be 2.25, but TukeyHSD says:

              diff         lwr        upr
    O-E -50.817101 -84.8112057 -16.822996

  So I wonder where that -50.8 is coming from. Anybody have a clue? Thanks a lot! PS: it works if I take prevyield (the covariate) out of the model, but the point is I need to analyse it with the covariate. Thanks again

If the covariate level differs between the treatment groups, then the difference in the covariate-adjusted means could well differ quite a bit from the unadjusted difference. What happens if you do

  summary(lm(yield ~ prevyield + trt + block))

(Not sure I'm happy about using the HSD procedure with an unbalanced design, btw.)

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Blegdamsvej 3, 2200 Cph. N, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907 ([EMAIL PROTECTED])
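The effect described in the reply can be illustrated with simulated data. The numbers below are invented and are not the poster's data; the only point is that when the covariate differs between groups, the raw difference in group means and the covariate-adjusted treatment difference can even have opposite signs.

```r
set.seed(1)
trt <- factor(rep(c("E", "O"), each = 20))

# The covariate is deliberately much higher in group O than in group E.
prevyield <- c(rnorm(20, mean = 5), rnorm(20, mean = 9))

# True model: strong covariate effect, and O is 50 units worse than E
# at any given covariate value.
yield <- 100 + 30 * prevyield + ifelse(trt == "O", -50, 0) + rnorm(40, sd = 5)

# Unadjusted difference in treatment means (O minus E).
raw_diff <- mean(yield[trt == "O"]) - mean(yield[trt == "E"])

# Covariate-adjusted difference: the trtO coefficient of the ANCOVA model.
fit      <- lm(yield ~ prevyield + trt)
adj_diff <- coef(fit)[["trtO"]]

print(c(raw = raw_diff, adjusted = adj_diff))
```

Here the raw means favour O only because O started from a higher covariate level; once that is accounted for, the treatment difference comes out negative.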
Re: [R] TukeyHSD Covariates
I have interactions where I shouldn't, so never mind, I'm a dumb ass. Thanks again
RE: [R] RE: adding perspectives to existing persp plots
Hi there,

Turns out it was simply a 'par' problem. I just needed to set par(new = TRUE) after the first 'persp' plot and before coding the 2nd 'persp' plot. Thanks for your suggestions. Corey

-Original Message- From: Deepayan Sarkar [mailto:[EMAIL PROTECTED] Sent: Thursday, December 16, 2004 12:14 AM To: [EMAIL PROTECTED] Cc: Corey Bradshaw Subject: Re: [R] RE: adding perspectives to existing persp plots

  On Wednesday 15 December 2004 01:10, Corey Bradshaw wrote:

    Thanks, Romain. I've certainly used that to draw lines and points in the plots produced by 'persp'; however, my problem is that I need to incorporate an entirely new z function (not just a plane) onto the same plot (i.e., using the same x and y values).

  If the surfaces are non-intersecting, you might be able to use 'wireframe' from the lattice package. See the second example in ?wireframe. Deepayan
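A minimal sketch of the par(new = TRUE) approach, with two made-up surfaces over the same x and y grid. The shared zlim and identical viewing angles are assumptions needed for the two drawings to line up; pdf(NULL) is used only so that the example runs without opening a window or writing a file.

```r
x <- seq(-2, 2, length.out = 30)
y <- x
z1 <- outer(x, y, function(x, y) exp(-(x^2 + y^2)))  # first surface
z2 <- z1 + 1                                         # second surface, above the first

zlim <- range(z1, z2)   # both plots must share the same z range

pdf(NULL)               # null device; use a real device for an actual plot
pm1 <- persp(x, y, z1, zlim = zlim, theta = 30, phi = 25, col = "lightblue")
par(new = TRUE)         # draw the next plot on top of the current one
pm2 <- persp(x, y, z2, zlim = zlim, theta = 30, phi = 25,
             col = "salmon", axes = FALSE, box = FALSE)
dev.off()
```

Because each persp call does its own hidden-surface removal, facets of the second surface are simply painted over the first, so the overlay is only reliable when the surfaces do not intersect and the one nearer the viewer is drawn last.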