Re: [R] artificial data matrix with 100000 rows
>>>>> PS == Paul Smith [EMAIL PROTECTED]
>>>>>     on Sun, 9 Sep 2007 12:17:32 +0100 writes:

    PS> On 9/9/07, kevinchang [EMAIL PROTECTED] wrote:
    >> I tried to make the matrix with this size by either matrix() or
    >> array().  However, there seems to be a default limit on the number of
    >> rows shown.  I got a sort of error message from R.  To be specific,
    >>
    >>   m <- matrix(ncol = 3, nrow = 100000)
    >> error message:
    >>   [ reached getOption("max.print") -- omitted 66667 rows ]
    >> or
    >>   a <- array(dim = c(10000, 3, 10))
    >> error message:
    >>   [ reached getOption("max.print") -- omitted 6667 row(s) and 6 matrix slice(s) ]

    PS> That is not an error message, I guess.

Definitely not; thank you, Paul!
Also, the messages were not produced by what Kevin showed (namely
assignments) but rather when he *prints* the contents of his huge
matrix / array.

    PS> When the matrices are huge, R is unable to print them
    PS> totally on the screen, but all data are present.

Not at all unable!  R protects you from accidentally overflowing your
console with a huge amount of nonsensical output.  As the message
above mentions, you should look at

    ?getOption
    ?options

and particularly the 'max.print' option.
Is  '[ reached getOption("max.print") ]'  too difficult to read?
You *can* increase the 'max.print' option as much as you like, and
that's why I said "not at all unable" above.

Regards, Martin

    PS> For instance,
    PS>   m[(nrow(m)-10):nrow(m), ]
    PS>        [,1] [,2] [,3]
    PS>   [1,]   NA   NA   NA
    PS>   [... and 10 more rows of NA ...]

or rather just  tail(m)  or  tail(m, 11)  or  head(m)  or  str(m),
etc. etc.

    PS> See
    PS>   ?getOption

yes indeed.
Martin

    PS> Paul

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
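A short sketch of the advice above: inspect a large matrix with str(),
head(), and tail() instead of printing it, and raise 'max.print' only
if you really want the full output (the value 99999 is the factory
default, restored afterwards here).

```r
# build a 100000 x 3 matrix comparable to the one in the question
m <- matrix(seq_len(3e5), nrow = 1e5, ncol = 3)

old <- options(max.print = 99999)  # the factory default

str(m)      # one-line structure summary, never floods the console
head(m, 3)  # first 3 rows
tail(m, 3)  # last 3 rows

options(old)  # restore the previous setting
```

Printing `m` itself would show only the first 33333 rows and then the
"[ reached getOption(\"max.print\") ... ]" message discussed above.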
Re: [R] Matlab's lsqnonlin
>>>>> KateM == Katharine Mullen [EMAIL PROTECTED]
>>>>>     on Fri, 7 Sep 2007 20:07:41 +0200 (CEST) writes:

    KateM> The thread you linked to regarding Levenberg-Marquardt's supposed
    KateM> lack of availability is from 2001; it has been possible to get to
    KateM> the MINPACK implementation of Levenberg-Marquardt within R via the
    KateM> package minpack.lm
    KateM> (http://cran.r-project.org/src/contrib/Descriptions/minpack.lm.html)
    KateM> since 2005.

Thanks a lot, Kate.  I'm wondering about experiences:
Do you know of cases where minpack.lm's nls.lm() solved a (real)
problem that nls() would have a problem with?

Beware however -- one of the main things I learned about this field
from Doug Bates, co-author of Bates and Watts and principal author of
S's and R's nls():
It's a *feature* that nls() sometimes does not converge when other
methods falsely claim convergence!

Martin Maechler, ETH Zurich

    KateM> Katharine Mullen
    KateM> mail: Department of Physics and Astronomy, Faculty of Sciences
    KateM>       Vrije Universiteit Amsterdam, de Boelelaan 1081
    KateM>       1081 HV Amsterdam, The Netherlands
    KateM> room: T.1.06
    KateM> tel: +31 205987870
    KateM> fax: +31 205987992
    KateM> e-mail: [EMAIL PROTECTED]
    KateM> homepage: http://www.nat.vu.nl/~kate/

    KateM> On Fri, 7 Sep 2007, Jose Luis Aznarte M. wrote:
    >> Hi!  I'm translating some code from Matlab to R and I found a
    >> problem.  I need to translate Matlab's function 'lsqnonlin'
    >> (http://www-ccs.ucsd.edu/matlab/toolbox/optim/lsqnonlin.html) into R,
    >> and at the beginning I thought it would be the same as R's 'optim'.
    >> But then I looked at the definition of 'lsqnonlin' and I don't quite
    >> see how to make 'optim' do the same thing.  Does anyone have an idea?
    >> This is apart from the fact that I would like to use the
    >> Levenberg-Marquardt algorithm, which is not implemented in R (some
    >> discussion about this:
    >> http://tolstoy.newcastle.edu.au/R/help/00b/2492.html).
    >> Thank you!  All the best,
    >> --
    >> Jose Luis Aznarte M.
    >> http://decsai.ugr.es/~jlaznarte
    >> Department of Computer Science and Artificial Intelligence
    >> Universidad de Granada        Tel. +34 - 958 - 24 04 67
    >> GRANADA (Spain)               Fax: +34 - 958 - 24 00 79
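One way to emulate Matlab's lsqnonlin() with only base R, as the
original poster considered: lsqnonlin takes a residual *vector*, while
optim() minimizes a scalar, so one minimizes the sum of squared
residuals.  A minimal sketch with a made-up exponential model (the
function names and data are illustrative assumptions, not from the
thread):

```r
# residual function in lsqnonlin style: returns the residual vector
resid_fun <- function(p, x, y) y - p[1] * exp(-p[2] * x)

x <- c(0, 1, 2, 3, 4)
y <- 2 * exp(-0.5 * x)          # noise-free data; true parameters c(2, 0.5)

# minimize the sum of squared residuals, starting from c(1, 1)
fit <- optim(par = c(1, 1),
             fn  = function(p) sum(resid_fun(p, x, y)^2),
             method = "BFGS")

fit$par   # close to c(2, 0.5)
```

This reproduces lsqnonlin's least-squares objective but with a
general-purpose optimizer, not Levenberg-Marquardt; for the latter,
minpack.lm::nls.lm() mentioned above takes the residual function
directly.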
Re: [R] integers
>>>>> CH == Christoph Heibl [EMAIL PROTECTED]
>>>>>     on Tue, 4 Sep 2007 11:53:43 +0200 writes:

    CH> Hi list,
    CH> The function is.integer tests if an object is of type integer;
    CH> see e.g.:

    CH>   is.integer(12)  # FALSE
    CH>   is.real(12)     # TRUE
    CH>   mode(12)        # "numeric"

    CH> But how can I test if a number is actually an integer?

Something like
    round(x) == x
is often good enough; maybe
    x %% 1 == 0
seems a bit more efficient.
Note that both return NA whenever x[] is NA, so they may not directly
be appropriate for your use case.

    CH> R seek is difficult to search in this case because it mainly yields
    CH> entries about the integer() function family.

R seek???  Do you mean the R function RSiteSearch() which goes to
'http://search.r-project.org/'?  Well, calling
RSiteSearch("integer number") gives almost 3000 hits, *but* number 10
is exactly relevant to your question...

    CH> Thanks for any hint!
    CH> Christoph Heibl
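The two tests suggested above can be wrapped in a small helper; a
sketch (the name is.wholenumber and the tolerance are assumptions, not
from the thread) that distinguishes "stored as type integer" from "is
a whole number":

```r
# test whether a numeric value is a whole number, up to rounding error
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

is.integer(12)         # FALSE: the literal 12 has type "double"
is.wholenumber(12)     # TRUE
is.wholenumber(12.34)  # FALSE
12 %% 1 == 0           # TRUE, the modulo variant from the reply
is.wholenumber(NA)     # NA, as the reply warns
```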
Re: [R] Incomplete Gamma function
>>>>> AN == Anup Nandialath [EMAIL PROTECTED]
>>>>>     on Fri, 31 Aug 2007 13:15:08 -0700 (PDT) writes:

    AN> Hi Kris,
    AN> You just need to understand the mathematics of the incomplete gamma
    AN> function and the various relationships it has.  The answers from
    AN> both Mathematica and R are correct, except that they are giving you
    AN> different estimated quantities.  It depends on the way the gamma
    AN> function is written.  For instance, in R, to get the equivalent
    AN> result from Mathematica you should do the following:

    AN>   answer <- gamma(9) - Igamma(9, 11.1)

    AN> This will give you the incomplete gamma for (9, 11.1) as given by
    AN> Mathematica.  You can read more about the model and I am sure you
    AN> will figure it out.

Yes, and then, *PLEASE* do as Brian Ripley suggested, and understand
that the (normalized) incomplete gamma function is the same as the
gamma distribution function and has been available in S and R forever
__and__ in R has been very thoroughly tested, and quite a few extreme
cases have been made to work more accurately, etc. etc.:  pgamma() !

BTW: The same applies to the incomplete beta function, which -- in one
of its equivalent forms -- is called the beta distribution function in
probability and statistics, and has been available in S and R forever;
for R, it has been very carefully tested and for extreme border cases
gradually improved over the years, most recently for the upcoming
R 2.6.0, where the precision of pbeta(*, log=TRUE) has been
dramatically improved in some extreme tail/range cases -- which
benefits pt(), pf(), pbinom() etc. in equivalent situations.

Martin Maechler, ETH Zurich and R-core

    AN> Anup
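A numerical check of the relation stated above (Igamma is the
unnormalized lower incomplete gamma function assumed in the quoted
reply; Mathematica's Gamma[a, x] is the upper one): the upper
incomplete gamma function is just gamma(a) times the upper tail of
pgamma().

```r
a <- 9; x <- 11.1

# upper incomplete gamma via R's gamma distribution function
upper <- gamma(a) * pgamma(x, a, lower.tail = FALSE)

# check against direct numerical integration of t^(a-1) * exp(-t) on [x, Inf)
chk <- integrate(function(t) t^(a - 1) * exp(-t), lower = x, upper = Inf)$value

all.equal(upper, chk)   # TRUE (up to integration tolerance)
```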
[R] R mailinglist ETH website now with trusted certificate
Some of you, notably I think users of MS Internet Explorer, may be
happy to learn that since yesterday, the web server of the R mailing
lists,

    https://stat.ethz.ch/

runs an "official" / trusted certificate, as opposed to the unofficial
one (Math Department, ETH) that we have had for years.

In particular, this should make access to the (first, but by far not
only) mailing list archives more convenient to you, e.g., for this
month, for R-help:

    https://stat.ethz.ch/pipermail/r-help/2007-August/thread.html

In parallel, the R Foundation is getting (buying) certificates for
several R-project.org servers, notably also the subversion (R source)
server, and these will hopefully be put in place within the next few
weeks.

Martin Maechler, ETH Zurich
R core
Re: [R] library(fCalendar) timeDate("12.03.2005", format="%d.%m.%Y")
>>>>> OL == Ola Lindqvist [EMAIL PROTECTED]
>>>>>     on Tue, 21 Aug 2007 14:32:19 +0200 writes:

    OL> Thanks!
    OL> Seems to work fine now!

Well, for your example.  But sorry to say, the patch breaks other
cases.  I'm investigating further (and will hopefully contribute to a
new CRAN release of fCalendar once Diethelm Wuertz is back from
wherever; I've already made more changes).

Martin Maechler, ETH Zurich
[but a different department than D. Wuertz]

    OL> Martin Becker wrote:
    >> Dear Ola,
    >> I think you spotted a small bug in *package* fCalendar.  Explicit
    >> specification should prevent autodetection of the date format, which
    >> is not the case for fCalendar v251.70; instead autodetection is done
    >> at least once (twice, if actually appropriate).  With the following
    >> patch, things should work ok:
    >>
    >>   diff --recursive fCalendar.orig/R/3A-TimeDateClass.R fCalendar/R/3A-TimeDateClass.R
    >>   433c433
    >>   <   charvec = format(strptime(charvec, .whichFormat(charvec)), isoFormat)
    >>   ---
    >>   >   charvec = format(strptime(charvec, format), isoFormat)
    >>
    >> You did not provide the output of sessionInfo() (which you are asked
    >> for in the posting guide).  If you are using Windows and don't know
    >> how to apply the patch, you can download a patched binary version
    >> here: http://www.saar-gate.net/download/fCalendar_251.70.zip
    >> Regards,
    >> Martin
    >>
    >> PS: Maybe r-sig-finance is more appropriate for questions concerning
    >> Rmetrics.
    >>
    >> Ola Lindqvist wrote:
    >>> Dear R users,
    >>> I have a problem with the library fCalendar.  I am not using the US
    >>> standard format notations.  It seems like it is not possible to use
    >>> a different format than the US standards.  Does anyone know a way to
    >>> get around this problem?
    >>> Here is the code I enter:
    >>>
    >>>   myDate = "12.03.2005"
    >>>   timeDate(myDate, format = "%d.%m.%Y")
    >>>
    >>> And I get the following error message:
    >>>
    >>>   Error in if (sum(lt$sec + lt$min + lt$hour) == 0) isoFormat = "%Y-%m-%d" :
    >>>     missing value where TRUE/FALSE needed
    >>>
    >>> Thanks,
    >>> Ola
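For comparison (not fCalendar-specific): the same non-US date string
is handled by base R's as.Date()/strptime() when an explicit format is
given, which is exactly the behavior the patch above restores for
timeDate().

```r
myDate <- "12.03.2005"

# explicit format string: no autodetection involved
d <- as.Date(myDate, format = "%d.%m.%Y")

format(d, "%Y-%m-%d")   # "2005-03-12"
```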
Re: [R] Any parser generator / code assistance for R?
>>>>> A- == Ali - [EMAIL PROTECTED]
>>>>>     on Sat, 18 Aug 2007 00:40:52 +0100 writes:

    A-> Hi,
    A-> Is there any parser generator like www.antlr.org?  Moreover, how
    A-> does simple code assistance work currently in R?  By 'simple code
    A-> assistance' I meant things like:

    A->   Object$<TAB>  -->  Object$Method

If you really meant a list with components or an S4 object with slots,
such code completion works at least since R 2.5.1, because of the
recent 'rcompletion' extensions of Deepayan Sarkar, and of course in
ESS (Emacs Speaks Statistics), and I think in several other
GUIs/environments as well.

But if you are thinking OOP as in Java or C++ (and I think you *are*
thinking along that way), then rather learn that S (and hence R) does
OOP in a function-centric rather than class-centric way -- something
which seems to be quite hard to grasp for many who have been brought
up in Java-like schools.

If you are still interested in R, look out for documents with "S4" (or
"formal methods and classes") and "R" in the title ;-)

Regards, Martin
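A minimal sketch of the function-centric OOP mentioned above, using S4
(formal) classes: methods belong to generic *functions*, not to
classes.  The class and generic names here ("Account", "deposit") are
illustrative assumptions, not from the thread.

```r
library(methods)

# define a formal class with one slot
setClass("Account", representation(balance = "numeric"))

# the generic function owns the dispatch, not the class
setGeneric("deposit", function(acct, amount) standardGeneric("deposit"))

setMethod("deposit", "Account", function(acct, amount) {
  acct@balance <- acct@balance + amount
  acct                      # R objects are copied, so return the new one
})

a <- new("Account", balance = 10)
a <- deposit(a, 5)
a@balance   # 15
```

Note the contrast with `a.deposit(5)` in class-centric languages: here
`deposit(a, 5)` dispatches on the class of its argument.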
Re: [R] function to find coodinates in an array
>>>>> GaGr == Gabor Grothendieck [EMAIL PROTECTED]
>>>>>     on Thu, 16 Aug 2007 23:46:28 -0400 writes:

    GaGr> Get the indices using expand.grid and then reorder them:

    GaGr>   set.seed(1); X <- array(rnorm(24), 2:4)  # input
    GaGr>   X                                        # look at X
    GaGr>   do.call(expand.grid, sapply(dim(X), seq))[order(X), ]

Excellent, Gabor!
Definitely the nicest of the solutions so far!

    GaGr> On 8/16/07, Ana Conesa [EMAIL PROTECTED] wrote:
    >> Dear list,
    >> I am looking for a function/way to get the array coordinates of given
    >> elements in an array.  What I mean is the following:
    >> - Let X be a 3D array
    >> - I find the ordering of the elements of X by  ord <- order(X)
    >>   (this returns me a vector)
    >> - I now want to find the x, y, z coordinates of each element of ord
    >> Can anyone help me?  Thanks!
    >> Ana
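The expand.grid idiom above works because expand.grid enumerates
indices in column-major order, exactly matching R's linear array
indexing; so row i of the reordered grid holds the coordinates of the
i-th smallest element.  A sketch with a built-in check:

```r
set.seed(1)
X <- array(rnorm(24), dim = 2:4)

# coordinates of the elements of X, from smallest to largest
coords <- do.call(expand.grid, lapply(dim(X), seq_len))[order(X), ]

# verify: the first coordinate row must point at min(X)
first <- as.numeric(coords[1, ])
stopifnot(X[first[1], first[2], first[3]] == min(X))
```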
Re: [R] matching elements from two vectors
>>>>> GF == Gonçalo Ferraz [EMAIL PROTECTED]
>>>>>     on Fri, 17 Aug 2007 11:03:09 -0400 writes:

    GF> Thanks much.  x %in% y  works.

Yes, indeed.  But I wonder a bit why nobody remarked on the facts that

 1) you use "match" in the subject of your e-mail;
 2) match() is the basic ingredient you need:
    '%in%' is trivially defined via match().
    Type  `%in%`  in your R session.

Regards, Martin

    GF> On 8/17/07, jim holtman [EMAIL PROTECTED] wrote:
    >> Also if you want all the matches:
    >>
    >>   x[x %in% y]
    >>   [1] 2 3 3 3
    >>
    >> On 8/17/07, Gonçalo Ferraz [EMAIL PROTECTED] wrote:
    >>> Hi,
    >>> Imagine a vector x with elements (1,2,1,1,3,5,3,3,1) and a vector y
    >>> with elements (2,3).  I need to find out which elements of x match
    >>> any of the elements of y.  Is there a simple command that will
    >>> return a vector with elements (F,T,F,F,T,F,T,T,F)?  Ideally, I would
    >>> like a solution that works with data frame columns as well.  I have
    >>> tried x==y and it doesn't work; x==any(y) doesn't work either.  I
    >>> realize I could write a for loop and go through each element of y
    >>> asking if it matches any element of x, but isn't there a shorter
    >>> way?
    >>> Thanks, Gonçalo
    >>
    >> --
    >> Jim Holtman
    >> Cincinnati, OH
    >> +1 513 646 9390
    >> What is the problem you are trying to solve?
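A sketch of the point made above: `%in%` is just a convenience wrapper
around match(), so the two spellings give identical results.

```r
x <- c(1, 2, 1, 1, 3, 5, 3, 3, 1)
y <- c(2, 3)

a <- x %in% y
b <- match(x, y, nomatch = 0L) > 0L   # essentially the definition of %in%

identical(a, b)   # TRUE
a                 # FALSE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE
x[a]              # 2 3 3 3, the matches themselves
```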
Re: [R] ungültige Versionsspezifikation
>>>>> JK == John Kane [EMAIL PROTECTED]
>>>>>     on Wed, 15 Aug 2007 08:55:59 -0400 (EDT) writes:

    JK> I think we need more information about your system.
    JK> Please run
    JK>   sessionInfo()
    JK> and include the information in another posting.

Yes, indeed.  However, R version 2.3.1 seems a bit too old to fit a
current version of cairo (rather 'cairoDevice'??).
And BTW: It is a *package*, not a library!!!

Martin Maechler, ETH Zurich

    JK> --- Mag. Ferri Leberl [EMAIL PROTECTED] wrote:
    >> Dear everybody,
    >> Excuse me if this question is trivial; however, I have looked for an
    >> answer for quite a while and therefore dare placing it here.
    >> I want to export .svg files and got the advice here to employ the
    >> cairo library.  I downloaded the *current* version and expanded it to
    >> /usr/local/lib/R/site-library.
    >>   library(cairo)
    >> returned
    >>   ungültige Versionsspezifikation
    >> (i.e., INVALID VERSION SPECIFICATION).  I got some answer here
    >> (EPM39) but, stupidly enough, I could not apply it, as I don't know
    >> where to look for the variable named there.  The R version I employ
    >> is 2.3.1.  Can anybody solve the (probably simple) problem?
    >> Thank you in advance.  Yours, Ferri
Re: [R] Mixture of Normals with Large Data
>>>>> BertG == Bert Gunter [EMAIL PROTECTED]
>>>>>     on Tue, 7 Aug 2007 16:18:18 -0700 writes:

    TV> Have you considered the situation of wanting to characterize
    TV> probability densities of prevalence estimates based on a complex
    TV> random sample of some large population?

    BertG> No -- and I stand by my statement.  The empirical distribution of
    BertG> the data themselves is the best characterization of the density.
    BertG> You and others are free to disagree.

I do agree with you, Bert.  From a practical point of view, however,
you'd still want to use an approximation to the data ECDF, since the
full ecdf is just too large an object to handle conveniently.  One
simple, quite small, and probably sufficient such approximation may be
using the equivalent of

    quantile(x, probs = (0:1000)/1000)

which is pretty much related to just working with a binned version of
the original data -- something others have proposed as well.

Martin

    BertG> On 8/7/07, Bert Gunter [EMAIL PROTECTED] wrote:
    >> Why would anyone want to fit a mixture of normals with 110 million
    >> observations??  Any questions about the distribution that you would
    >> care to ask can be answered directly from the data.  Of course, any
    >> test of normality (or anything else) would be rejected.  More to the
    >> point, the data are certainly not a random sample of anything.  There
    >> will be all kinds of systematic nonrandom structure in them.  This is
    >> clearly a situation where the researcher needs to think more
    >> carefully about the substantive questions of interest and how the
    >> data may shed light on them, instead of arbitrarily and perhaps
    >> reflexively throwing some silly statistical methodology at them.
    >>
    >> Bert Gunter
    >> Genentech Nonclinical Statistics
    >>
    >> -----Original Message-----
    >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Victor
    >> Sent: Tuesday, August 07, 2007 3:02 PM
    >> To: r-help@stat.math.ethz.ch
    >> Subject: Re: [R] Mixture of Normals with Large Data
    >>
    >> I wasn't aware of this literature, thanks for the references.
    >> On 8/5/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
    >>> Another possibility is to use data squashing methods.  Relevant
    >>> papers are: (1) DuMouchel et al. (1999), (2) Madigan et al. (2002),
    >>> and (3) Owen (1999).
    >>> Ravi.
    >>>
    >>> Ravi Varadhan, Ph.D.
    >>> Assistant Professor, Division of Geriatric Medicine and Gerontology
    >>> School of Medicine, Johns Hopkins University
    >>> Ph. (410) 502-2619
    >>> email: [EMAIL PROTECTED]
    >>>
    >>> ----- Original Message -----
    >>> From: Charles C. Berry [EMAIL PROTECTED]
    >>> Date: Saturday, August 4, 2007 8:01 pm
    >>> Subject: Re: [R] Mixture of Normals with Large Data
    >>>
    >>>> On Sat, 4 Aug 2007, Tim Victor wrote:
    >>>>> All: I am trying to fit a mixture of 2 normals with 110 million
    >>>>> observations.  I am running R 2.5.1 on a box with 1 GB RAM running
    >>>>> 32-bit Windows, and I continue to run out of memory.  Does anyone
    >>>>> have any suggestions?
    >>>>
    >>>> If the first few million observations can be regarded as a SRS of
    >>>> the rest, then just use them.  Or read in blocks of a convenient
    >>>> size and sample some observations from each block.  You can repeat
    >>>> this process a few times to see if the results are sufficiently
    >>>> accurate.
    >>>>
    >>>> Otherwise, read in blocks of a convenient size (perhaps 1 million
    >>>> observations at a time), quantize the data to a manageable number
    >>>> of intervals -- maybe a few thousand -- and tabulate it.  Add the
    >>>> counts over all the blocks.  Then use mle() to fit a multinomial
    >>>> likelihood whose probabilities are the masses associated with each
    >>>> bin under a mixture of normals law.
    >>>>
    >>>> Chuck
    >>>>
    >>>>> Thanks so much, Tim
    >>>>
    >>>> Charles C. Berry                    (858) 534-2098
    >>>> Dept of Family/Preventive Medicine
    >>>> UC San Diego
    >>>> La Jolla, San Diego 92093-0901
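The compressed-ECDF idea suggested above -- summarize a huge sample by
~1000 quantiles instead of keeping every observation -- can be
sketched as follows (the sample here is a small stand-in for the
110 million observations discussed):

```r
set.seed(42)
x <- rnorm(1e5)                         # stand-in for a much larger sample

# ~1000-point summary of the empirical distribution
qs <- quantile(x, probs = (0:1000) / 1000)

length(qs)        # 1001 numbers instead of 100000
all(diff(qs) >= 0)  # TRUE: a valid (nondecreasing) quantile function

# any individual quantile is recovered exactly from the grid:
qs[501]           # same value as quantile(x, 0.5)
```

Downstream fitting (e.g. the binned multinomial likelihood Chuck
describes) can then work from this summary rather than the raw data.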
Re: [R] deriv for psigamma
>>>>> UweL == Uwe Ligges [EMAIL PROTECTED]
>>>>>     on Tue, 31 Jul 2007 12:13:45 +0200 writes:

    UweL> francogrex wrote:
    >> Hi, 2 questions: [...]
    >> Question 2:
    >>
    >>   deriv(~ gamma(x), "x")
    >>   expression({
    >>       .expr1 <- gamma(x)
    >>       .value <- .expr1
    >>       .grad <- array(0, c(length(.value), 1), list(NULL, c("x")))
    >>       .grad[, "x"] <- .expr1 * psigamma(x)
    >>       attr(.value, "gradient") <- .grad
    >>       .value
    >>   })
    >>
    >> BUT
    >>
    >>   deriv3(~ gamma(x), "x")
    >>   Error in deriv3.formula(~gamma(x), "x") :
    >>     Function 'psigamma' is not in the derivatives table
    >>
    >> What I want is the expression for the second derivative (which I
    >> believe is trigamma(x), or psigamma(x, 1)); how can I obtain that?

    UweL> By using some algebraic software (rather than a numeric one), or
    UweL> by contributing complete derivatives tables for the next R
    UweL> release.

Yes, but for the present case, one could argue that the R internal
code -- which knows

    d/dx lgamma(x) = psi(x) = digamma(x) = psigamma(x, 0)

-- should easily be enhanced to also know

    d/dx psigamma(x, n) = psigamma(x, n+1)

and consequently (but maybe with an extra clause)

    d/dx psigamma(x) = psigamma(x, 1)

The code is in R*/src/main/deriv.c, and patches which implement the
above (and there are a few more 'FIXME's there ... ;-) against
https://svn.r-project.org/R/trunk/src/main/deriv.c are welcome --
after useR!2007.

Martin Maechler, ETH Zurich
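The derivative relations quoted above can be checked numerically with
central differences; a sketch:

```r
#   d/dx lgamma(x)  = digamma(x) = psigamma(x, 0)
#   d/dx digamma(x) = trigamma(x) = psigamma(x, 1)
x <- 2.5
h <- 1e-6

num_d_lgamma  <- (lgamma(x + h)  - lgamma(x - h))  / (2 * h)
num_d_digamma <- (digamma(x + h) - digamma(x - h)) / (2 * h)

all.equal(num_d_lgamma,  digamma(x),  tolerance = 1e-6)   # TRUE
all.equal(num_d_digamma, trigamma(x), tolerance = 1e-6)   # TRUE
all.equal(trigamma(x),   psigamma(x, 1))                  # TRUE
```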
Re: [R] Slightly OT - use of R
>>>>> BDR == Prof Brian Ripley [EMAIL PROTECTED]
>>>>>     on Mon, 30 Jul 2007 11:13:47 +0100 (BST) writes:

    BDR> On Mon, 30 Jul 2007, [EMAIL PROTECTED] wrote:
    >> On 30-Jul-07 08:28:15, John Logsdon wrote:
    >>> I am trying to get a measure of how R compares in usage as a
    >>> statistical platform compared to other software.  I would guess it
    >>> is the most widely used among statisticians, at least by virtue of
    >>> it being open source.

    BDR> I don't think that is the main reason.  Most of the R users I know
    BDR> migrated from commercial statistical software for reasons other
    BDR> than cost.  (Cross-platform availability has been one major
    BDR> reason.)

Much of this is true here (Switzerland) as well.
{And some have *not* migrated because R is Free Software, but that's
really another story.}
Note however that the (non-PhD-graduate) students we teach here would
not be urged to use R if it were not for the combination of its
quality and its Free Software state.  And I have had several
acquaintances who only started using R because they could get it so
easily and quickly, and they have changed to using R as their main
computational/statistical software tool.

    >> But is there any study to which I can refer?  By asking this list I
    >> am not exactly adopting a rigorous approach!  I don't know about
    >> that -- my own expectation would be that serious users of R are
    >> likely to be subscribers to the list.  So maybe a good answer to your
    >> question would be the number of subscribers (which I'm sure Martin
    >> Maechler can find out).  Of course, some people will have subscribed
    >> under more than one email address, so that would somewhat
    >> over-estimate the number of people who subscribe.  But it can be
    >> traded off (to a somewhat unknown extent) against R users who do not
    >> subscribe.

    BDR> I think it would be a seriously biased estimate.
    BDR> Few of our hundreds of student users will be subscribed to R-help
    BDR> (since their first port of call for help is local).
    BDR> Also, we get quite a lot of postings via the gmane and nabble
    BDR> gateways.

Yes, yes, yes.
The exact same situation here, and I'd believe in many places.  And
the problem with the bias ('factor' rather than 'offset', I'd say) is
that it has been changing over time -- I'd guess increasing pretty
dramatically.  My very wild subjective guess would be that

    #{statisticians seriously using R} / #{R-help subscribers}  =  N_t / n_t

is nowadays well over 20, maybe even over 100, of course depending on
the definition of the numerator N_t.
I could construct a very accurate time series for n_t, but since I
agree with Brian, I haven't done so for several years.
Note that  n_{t = 2007-07-30, 07:00} = 5559.

    >> More to the point, though, is what you mean by "usage".  If you
    >> simply mean people who use, that's a matter of counting (one way or
    >> another).  But there's use and use.

    BDR> Indeed.

amen - Martin
[R] options(OutDec) etc {was Strange warning in summary.lm}
Without going into your details, I think options() should NEVER
influence what as.numeric() does (which I think you are indirectly
suggesting it should).

In my eyes, using a decimal comma instead of a decimal point in
scientific computing is an abomination in itself.  Providing an option
for *output* is one thing, but having it influence basic R engine
functions like as.numeric(), format(), ... is an absolute NO!_NO! for
me.  So I am strongly opposed to an 'InDec' option, as was mentioned
earlier in this thread.

We could consider adding a 'dec' (or 'decimal.sep') *argument* to some
R functions, but such an argument must default to "." rather than yet
another option.  R should remain as *functional* as possible, i.e.,
options() should be used sparingly.  They should only influence
printed output at most, not computations per se.  Of course I know
that this is not strictly possible, since the computations can use
textConnection() etc.

Martin Maechler, ETH Zurich

>>>>> OT == ONKELINX, Thierry [EMAIL PROTECTED]
>>>>>     on Wed, 25 Jul 2007 11:04:43 +0200 writes:

    OT> Dear Peter, Uwe and Brian,
    OT> I've found some more problems with options(OutDec = ",").

    OT> 1) as.numeric yields NA where it shouldn't:

    OT>   z <- c("12", "12,34", "12.34")
    OT>   options(OutDec = ",")
    OT>   as.numeric(z)
    OT>   [1] 12,00    NA 12,34
    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(z)
    OT>   # should result in c(12, 12.34, NA)

    OT>   options(OutDec = ".")
    OT>   as.numeric(z)
    OT>   [1] 12.00    NA 12.34
    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(z)

    OT> 2) anova yields the same warning as summary:

    OT>   x <- runif(100)
    OT>   y <- rnorm(100)
    OT>   options(OutDec = ",")
    OT>   summary(lm(y ~ x))

    OT>   Call:
    OT>   lm(formula = y ~ x)

    OT>   Residuals:
    OT>        Min       1Q   Median       3Q      Max
    OT>   -2,81744 -0,61680  0,02107  0,66309  2,20599

    OT>   Coefficients:
    OT>                Estimate Std. Error t value Pr(>|t|)
    OT>   (Intercept) -0,073531   0,195880  -0,375    0,708
    OT>   x            0,007519   0,318159   0,024    0,981

    OT>   Residual standard error: 0,9795 on 98 degrees of freedom
    OT>   Multiple R-Squared: 5.699e-06,  Adjusted R-squared: -0.0102
    OT>   F-statistic: 0.0005585 on 1 and 98 DF,  p-value: 0,9812

    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(Cf[okP])

    OT>   anova(lm(y ~ x))
    OT>   Analysis of Variance Table
    OT>   Response: y
    OT>             Df Sum Sq Mean Sq F value Pr(>F)
    OT>   x          1  0,001   0,001   6e-04 0,9812
    OT>   Residuals 98 94,031   0,960

    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(Cf[okP])

    OT> Cheers,
    OT> Thierry

    OT> ir. Thierry Onkelinx
    OT> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
    OT> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
    OT> Gaverstraat 4, 9500 Geraardsbergen, Belgium
    OT> tel. +32 54/436 185
    OT> [EMAIL PROTECTED]
    OT> www.inbo.be

    OT> "Do not put your faith in what statistics say until you have
    OT> carefully considered what they do not say."  ~ William W. Watt
    OT> "A statistical analysis, properly conducted, is a delicate
    OT> dissection of uncertainties, a surgery of suppositions."
    OT> ~ M.J. Moroney

    >> -----Original message-----
    >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of ONKELINX, Thierry
    >> Sent: Thursday, 19 July 2007 13:56
    >> To: Peter Dalgaard
    >> CC: r-help@stat.math.ethz.ch; Uwe Ligges
    >> Subject: Re: [R] Strange warning in summary.lm
    >>
    >> Dear Peter,
    >> Here's an example.  Notice the warning in the last two lines of the
    >> summary with options(OutDec = ",").  It's not present with
    >> options(OutDec = ".").
    >> Cheers, Thierry
    >>
    >>   x <- runif(100)
    >>   y <- rnorm(100)
    >>   options(OutDec = ",")
    >>   summary(lm(y ~ x))
    >>
    >>   Call:
    >>   lm(formula = y ~ x)
    >>
    >>   Residuals:
    >>         Min        1Q    Median        3Q       Max
    >>   -2,389749 -0,607002  0,006969  0,689535  1,713197
    >>
    >>   Coefficients:
    >>               Estimate Std. Error t value Pr(>|t|)
    >>   (Intercept)  0,03397    0,17774   0,191    0,849
    >>   x           -0,09219    0,29518  -0,312    0,755
    >>
    >>   Residual standard error: 0,868 on 98 degrees of freedom
    >>   Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092
    >>   F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0,7555
    >>
    >>   Warning message:
    >>   NAs introduced by coercion in: as.double.default(Cf[okP])
    >>
    >>   options(OutDec = ".")
    >>   summary(lm(y ~ x))
    >>
    >>   Call:
    >>   lm(formula = y ~ x)
    >>
    >>   Residuals:
    >>         Min        1Q    Median        3Q       Max
    >>   -2.389749 -0.607002  0.006969  0.689535  1.713197
    >>
    >>   Coefficients:
    >>               Estimate Std. Error t value Pr(>|t|)
    >>   (Intercept)  0.03397
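In the spirit of the 'dec' *argument* advocated above (and note that
read.table() and scan() already have such a `dec` argument for input):
decimal-comma strings can be converted explicitly, without touching
any global option.  The helper name below is an assumption for
illustration.

```r
z <- c("12", "12,34", "3,5e2")

# explicit conversion: swap the decimal separator, then parse
as_numeric_dec <- function(x, dec = ",") {
  as.numeric(sub(dec, ".", x, fixed = TRUE))
}

as_numeric_dec(z)   # 12.00  12.34 350.00
```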
Re: [R] Computing the rank of a matrix.
>>>>> RV == RAVI VARADHAN [EMAIL PROTECTED]
>>>>>     on Sat, 07 Apr 2007 18:39:36 -0400 writes:

this is a bit of a late reply... better late than never

    RV> Hi Martin,

Hi Ravi, thanks for your research.

    RV> I played a bit with rankMat.  I will first state the conclusions of
    RV> my numerical experiments and then present the results to support
    RV> them:

    RV> 1. I don't believe that rankMat, or equivalently Matlab's rank
    RV> computation, is necessarily better than qr(.)$rank or
    RV> qr(., LAPACK=TRUE)$rank.  In fact, for the notorious Hilbert matrix,
    RV> rankMat can give poor estimates of rank.

    RV> 2. There exists no universally optimal (i.e. optimal for all A) tol
    RV> in qr(A, tol)$rank that would be optimally close to rankMat.  The
    RV> discrepancy in the ranks computed by qr(A)$rank and rankMat(A)
    RV> obviously depends on the matrix A.  This is evident from the tol
    RV> used in rankMat:

    RV>   tol <- max(d) * .Machine$double.eps * abs(singValA[1])

    RV> So, this value of tol in qr will also minimize the discrepancy.

    RV> 3. The tol parameter is irrelevant in qr(A, tol, LAPACK=TRUE)$rank,
    RV> i.e. LAPACK doesn't seem to utilize the tol parameter when computing
    RV> the rank.  However, qr(A, tol, LAPACK=FALSE)$rank does depend on
    RV> tol.

Yes, indeed!  The help page of qr() actually says so {at least now; I
don't know about 3 months ago}.

    RV> Now, here are the results:
    RV> 1.
    RV>   hilbert.rank <- matrix(NA, 20, 3)
    RV>   hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
    RV>   for (i in 1:20) {
    RV>       himat <- hilbert(i)
    RV>       hilbert.rank[i, 1] <- rankMat(himat)
    RV>       hilbert.rank[i, 2] <- qr(himat, tol = 1.0e-14)$rank
    RV>       hilbert.rank[i, 3] <- qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank
    RV>   }
    RV>   hilbert.rank
    RV>         [,1] [,2] [,3]
    RV>    [1,]    1    1    1
    RV>    [2,]    2    2    2
    RV>    [3,]    3    3    3
    RV>    [4,]    4    4    4
    RV>    [5,]    5    5    5
    RV>    [6,]    6    6    6
    RV>    [7,]    7    7    7
    RV>    [8,]    8    8    8
    RV>    [9,]    9    9    9
    RV>   [10,]   10   10   10
    RV>   [11,]   10   11   11
    RV>   [12,]   11   12   12
    RV>   [13,]   11   12   13
    RV>   [14,]   11   13   14
    RV>   [15,]   12   13   15
    RV>   [16,]   12   15   16
    RV>   [17,]   12   16   17
    RV>   [18,]   12   16   18
    RV>   [19,]   13   17   19
    RV>   [20,]   13   17   20

    RV> We see that rankMat underestimates the rank for n > 10, but
    RV> qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it
    RV> right.

Yes, indeed; and that seems a bit against the ``common knowledge''
that svd() is more reliable than qr().  Hmm... I'm still baffled a
bit.

Note that with the Hilbert matrices, one might argue that hilbert(20)
might really not have a correctly estimated rank of 20, but at least
for hilbert(13) or so, the rank should be == n.

BTW, there's a nice plot, slightly related to this problem, using
rcond() from the Matrix package:

    library(Matrix)
    hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
    rcHilb <- sapply(1:20, function(n) rcond(Matrix(hilbert(n))))
    plot(rcHilb, xlab = "n", log = "y", col = 2, type = "b",
         main = "reciprocal condition numbers of Hilbert(n)")

where one sees that the LAPACK code that underlies Matrix::rcond()
also gets into problems when estimating the condition number of
hilbert(n) for n around 16.

I think if we wanted to make real progress here, we'd have to consult
with numerical analysis specialists.  But for me the topic is too
remote to be followed up further at the moment.
One conclusion for me is that to estimate the rank of a matrix in current versions of R, one should use
  rankMat <- function(x) qr(x, LAPACK = TRUE)$rank
(as was suggested as one possibility in the original thread)
Regards, and thank you again, Ravi.
Martin Maechler, ETH Zurich
RV 2.
RV Here I first created a random rectangular matrix, and then added a number of rows to it, where these new rows are the same as some of the original rows except for a tiny amount of noise, which I call eps. So, the degree of rank deficiency is controlled by eps. I let eps take on 3 values: 1.e-07, 1.e-10, and 1.e-14, and show that the optimal tol (in terms of being close to rankMat) depends on the level of precision (eps) in the matrix.
set.seed(123)
nrow <- 15
ncol <- 20
nsim <- 5000
ndefic <- 4   # number of nearly dependent rows
eps <- 1.e-07
rnk <- matrix(NA, nsim, 5)
for (j in 1:nsim) {
RV +   A <- matrix(rnorm(nrow*ncol), nrow, ncol)
RV +   newrows <- matrix(NA, ndefic, ncol)
RV +   for (i in 1:ndefic) {
RV +     newrows[i,] <- A[nrow-i,] + rnorm(ncol, sd = min(abs(A[nrow-i+1,])) * eps)
RV +   }
RV +
RV +   B <- rbind(A, newrows)
RV +   rnk[j,1] <- rankMat(B)
RV +   rnk[j,2] <- qr(B, tol = 1.e-07)$rank
RV +   rnk[j,3] <- qr(B, tol = 1.e-11)$rank
RV +   rnk[j,4] <- qr(B, tol = 1.0e-14)$rank
RV +   rnk[j,5] <- qr(B
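For readers without the original thread at hand, the Matlab-style rankMat() used in the experiments above can be sketched from the tol formula quoted earlier. This is a reconstruction under that assumption, not necessarily the exact function from the thread:

```r
## Sketch of a Matlab-style rank: count singular values above a tolerance
## proportional to the largest one (formula as quoted in the discussion).
rankMat <- function(A) {
  d   <- dim(A)
  sv  <- svd(A, nu = 0, nv = 0)$d                 # singular values, largest first
  tol <- max(d) * .Machine$double.eps * abs(sv[1])
  sum(sv > tol)                                   # "numerically nonzero" count
}

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
rankMat(hilbert(5))    # 5 : full rank for small n
rankMat(hilbert(15))   # < 15 : underestimates for larger n, cf. the table above
```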
Re: [R] Package design, placement of legacy functions
WA == William Asquith [EMAIL PROTECTED] on Sun, 22 Jul 2007 13:29:24 -0500 writes: WA I have a function XOLD() from a nearly verbatim port of legacy WA FORTRAN in a package. I have reimplemented this function as XNEW() WA using much cleaner native R and built-in functions of R. I have WA switched the package to XNEW(), but for historical reasons would WA like to retain XOLD() somewhere in the package directory WA structure. A note in a README or similar will point to this WA historical function, and the output from the two should be numerically WA equal. WA Placement in package/R is not an option as XOLD() no longer WA constitutes a true user-level function; would package/inst/legacyR or WA something like that be suitable to the R community? Yes, put it somewhere inside pkg/inst/ (and end the filename in *.R). A user of your **installed** package will be able to use system.file("legacy.R", package = "pkg") e.g., as source(system.file("legacy.R", package = "pkg")) WA Thanks for the guidance. . . you're welcome. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nearest correlation to polychoric
RV == Ravi Varadhan [EMAIL PROTECTED] on Fri, 13 Jul 2007 10:52:36 -0400 writes: RV Martin, I sent you the Matlab code for this which I had RV obtained from Nick Higham. RV Cheng, Sheung Hun and Higham, Nick (1998) A Modified RV Cholesky Algorithm Based on a Symmetric Indefinite RV Factorization; \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110. RV Do you remember? I do now .. last November. I'm embarrassed to admit that I hadn't found the time back then and had completely forgotten about it. But now that Jens already has a version of this in R -- thanks Jens -- this should really find a home in the R universe. As a matter of fact, I was thinking of porting the algorithm to the 'Matrix' package eventually; the specific polychor-related case may still fit better in John Fox's package. Martin
Re: [R] Extending Matrix class
This is from a private question which I'm given permission to answer in public: IF == Ingo Feinerer [EMAIL PROTECTED] on Fri, 13 Jul 2007 16:14:07 +0200 writes: IF Hello, We tried to derive a class from Matrix but had IF some problems. Maybe you can help us:
  library(Matrix)
  m <- Matrix(c(1:3, rep(0,12), 16:20), nrow = 4, ncol = 5)
  setClass("TermDocMatrix", representation(Weighting = "character"), contains = "Matrix")
IF Now we want to do something like:
IF   new("TermDocMatrix", .Data = m, Weighting = "foobar")
IF which obviously does not work due to the missing .Data IF slot. yes, obviously, indeed. There is never any .Data slot in our matrices. IF Note that we do not know in advance what the IF matrix m actually is (we only know it is *some* IF Matrix, e.g., we do not know if it is a dgCMatrix or a IF lgCMatrix or ...). Is there a (simple) solution? Well, yes, but probably not the one you had wanted:
  setClass("TD_Matrix", representation(data = "Matrix", Weighting = "character"))
  A <- spMatrix(10, 20, i = c(1,3:8), j = c(2,9,6:10), x = 7 * (1:7))
  tdr <- new("TD_Matrix", data = A, Weighting = "foobar")
  tdr
Now I understand that and why you had wanted to do this the original way you did - which cannot work AFAICS. OTOH, I wonder if other useRs, particularly those who know about S4 classes (and the Matrix classes), have better proposals, maybe along the following dream .. I think what Ingo would want is to say: let me extend the full Matrix class hierarchy (or just the dsparseMatrix sub-hierarchy) by a new slot 'Weighting' and hence by default inherit all methods which I don't explicitly set myself.
I think this can only partially work, even for a future version of R, since the inherited methods don't know what to do with Weighting, but it would already be interesting if all necessary new methods could be defined semi-automatically: 1) call the corresponding Matrix method 2) pass on all my extra slots Regards, Martin
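That "semi-automatic" recipe can be illustrated by hand for one concrete subclass and one method. The class name "TDM" and the choice of dgCMatrix as the base class are illustrative assumptions of mine, not code from the thread:

```r
library(Matrix)

## Extend ONE concrete Matrix class by the extra slot ...
setClass("TDM", contains = "dgCMatrix",
         representation(Weighting = "character"))

## ... and define each needed method following the two-step recipe:
## 1) call the corresponding Matrix method, 2) pass on the extra slot.
setMethod("t", "TDM", function(x) {
  r <- as(callNextMethod(), "dgCMatrix")    # Matrix's transpose of the data part
  new("TDM", r, Weighting = x@Weighting)    # re-attach the extra slot
})

A  <- Matrix(c(1:3, rep(0, 9)), 3, 4, sparse = TRUE)  # a dgCMatrix
td <- new("TDM", A, Weighting = "foobar")
t(td)@Weighting   # "foobar" -- the slot survives the inherited operation
```

The obvious drawback, as noted above, is that every method touching the object needs such a wrapper; methods not wrapped will silently return plain Matrix objects without the extra slot.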
Re: [R] create a matrix from an excel file?
TS == Tavpritesh Sethi [EMAIL PROTECTED] on Sat, 14 Jul 2007 12:59:03 +0530 writes: TS how do you create a matrix from an excel file read into TS R by the command read.delim? data.matrix(.) {is typically a bit more useful than as.matrix(.) }
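A tiny illustration of the difference, on a made-up data frame of the kind read.delim() might return (a numeric column plus a factor column):

```r
d <- data.frame(x = 1:3, g = factor(c("a", "b", "a")))
as.matrix(d)     # mixed column types => a *character* matrix
data.matrix(d)   # everything coerced to numbers => a numeric matrix
                 # (the factor is replaced by its integer codes 1, 2, 1)
```

Note that data.matrix() gives factor *codes*, not the printed labels, so columns that look numeric but were read as factors need as.numeric(as.character(.)) instead.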
Re: [R] nearest correlation to polychoric
DR == Dimitris Rizopoulos [EMAIL PROTECTED] on Fri, 13 Jul 2007 14:43:08 +0200 writes: DR you could also have a look at function posdefify() from DR package `sfsmisc'. DR I hope it helps. Yes, thanks, Dimitris; note that my posdefify() function uses a pretty arbitrary fudge value for posdefification, namely eps.ev = 1e-07. As a matter of fact, earlier this week (in the first international R/Rmetrics workshop), I've talked to people from finance who also need that (or something better?). Jordi Molins Coronado (Madrid) drew my attention to an idea he found in the book (English re-edition of French as of 1996) Jean-Philippe Bouchaud (2000) Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management which supposedly uses the theory of random matrices and the entailing distribution of random eigenvalues in order to find a more sensible cutoff than my 'eps.ev' default of 1e-7. Unfortunately that book is checked out from our library and I can't have a look. Googling and Wikipedia seem to indicate to me that most of the random matrix theory does not directly apply here, since I'm really interested in the spectrum of X'X where X is a de-meaned n x p random matrix. OTOH, help(posdefify) already mentions more sophisticated approaches to the problem, the one which (as I vaguely remember) I thought should be made available being Cheng, Sheung Hun and Higham, Nick (1998) A Modified Cholesky Algorithm Based on a Symmetric Indefinite Factorization; \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110. Jens, could you make your code (mentioned below) available to the community, or even donate it to be included as a new method of posdefify()? Regards, Martin Maechler, ETH Zurich DR Best, Dimitris DR Dimitris Rizopoulos Ph.D.
Student Biostatistical DR Centre School of Public Health Catholic University of DR Leuven DR Address: Kapucijnenvoer 35, Leuven, Belgium Tel: DR +32/(0)16/336899 Fax: DR +32/(0)16/337015 Web: DR http://med.kuleuven.be/biostat/ DR http://www.student.kuleuven.be/~m0390867/dimitris.htm DR - Original Message - From: Jens Oehlschlägel DR [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: DR Friday, July 13, 2007 2:25 PM Subject: [R] nearest DR correlation to polychoric DR Dear all, DR Has someone implemented in R (or any other language) DR Knol DL, ten Berge JMF. Least-squares approximation of DR an improper correlation matrix by a proper one. DR Psychometrika, 1989, 54, 53-61. DR or any other similar algorithm? DR Best regards DR Jens Oehlschlägel DR Background: DR I want to factanal() matrices of polychoric correlations DR which have negative eigenvalues. I coded DR Higham (2002) Computing the nearest correlation matrix - DR a problem from finance, IMA Journal of Numerical DR Analysis (2002), 22, 329-343. DR which basically works but still leaves very small DR negative eigenvalues which cause factanal() to fail DR with factanal(cov=ncor$cor, factors=2) DR Error in optim(start, FAfn, FAgr, method = "L-BFGS-B", DR lower = lower, : L-BFGS-B needs finite values of 'fn' DR In addition: Warning message: NaNs produced in: log(x) DR [session: R version 2.4.1 (2006-12-18), svn rev 40228, platform i386-pc-mingw32]
Re: [R] Problem/bug with smooth.spline and all.knots=T
H == Hubertus [EMAIL PROTECTED] on Wed, 4 Jul 2007 14:58:44 +0200 writes: H Dear list, H if I do H smooth.spline(tmpSec, tmpT, all.knots=T) H with the attached data, Thanks for providing the data via URL (see below) H with the attached data, I get this error-message: H Error in smooth.spline(tmpSec, tmpT, all.knots = T) : H smoothing parameter value too small H If I do H smooth.spline(tmpSec[-single arbitrary number], tmpT[-single arbitrary number], all.knots=T) H it works! not quite (see below)
I have transformed your reports into something reproducible (as *all* such R-help messages should be!):

dFile <- "/sspl_Hub_data.rda"  ## use a file name you can write to
if(file.exists(dFile)) {
  load(dFile)
} else {
  load(url("http://phi.stonemonkey.org/smoothspline.rda"))
  d.tmp <- data.frame(Sec = tmpSec, T = tmpT)
  save(d.tmp, file = dFile)
}
str(d.tmp)
## 'data.frame': 2055 obs. of 2 variables:
##  $ Sec: num 25743 25744 25745 25746 25747 ...
##  $ T  : num 288 288 288 288 288 ...

## Dear list,
## if I do
ssT <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE))
## with the attached data, I get this error-message:
## Error in smooth.spline(tmpSec, tmpT, all.knots = T) :
##   smoothing parameter value too small
## MM: it works fine for me on 64-bit, RH5 (lynne),
## but I can confirm the problem on my 32-bit Pentium M notebook:
ssT <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE,
                                 control.spar = list(trace = TRUE)))
## gives (on nb-mm):
## sbart (ratio = 1.1399039e-11) iterations; initial tol1 = 3.334042e-05 :
##        spar            GCV      b - a        e Kind NEW       lspar         crit
## ---
## -0.35410197 0.000220934951 3.e+00            0  GS  --  1.61033e-11   0.00027127
## -0.35410197 0.000220934951 1.8541e+00   1.8541  FP  GS  8.47468e-20  1.57859e-05
## -0.79179607 1.57858803e-05 1.1459e+00  -1.1459  FP  GS  9.41382e-22    3.555e-12
## -1.06230590 3.55500244e-12 7.0820e-01  -0.7082  FP  PI  3.86481e-21  2.08386e-10
## -1.06230590 3.55500244e-12 5.2259e-01 -0.27051  FP  PI   1.9073e-21  1.13107e-09
## -1.06230590 3.55500244e-12 4.8014e-01 0.084898  FP  GS  5.83319e-23  1.04372e-18
## -1.22949017 1.04371872e-18 4.3769e-01 -0.43769  FP  PI  2.30006e-22  4.72251e-15
## -1.22949017 1.04371872e-18 3.5298e-01 -0.16718  FP  PI   1.1561e-22  2.37808e-16
## -1.22949017 1.04371872e-18 3.1163e-01 0.082471  FP  PI  7.90222e-23  2.03459e-15
## -1.22949017 1.04371872e-18 2.8876e-01 0.041121  FP  GS   1.0457e-23  2.03459e-15
##   1.0457e-23 1.04372e-18
## Error in smooth.spline(Sec, T, all.knots = TRUE, control.spar = list(trace = 2)) :
##   smoothing parameter value too small

## Where it works, I get
ssT
## Call:
## smooth.spline(x = Sec, y = T, all.knots = TRUE)
## Smoothing Parameter  spar= -1.044387  lambda= 1.266990e-21 (22 iterations)
## Equivalent Degrees of Freedom (Df): 2054.954
## Penalized Criterion: 4.420969e-20

## OTOH, need a relaxed convergence criterion:
ssT. <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE,
                                  control.spar = list(trace = TRUE, tol=1e-4, eps= 0.01)))
ssT.
##-> Df: 2055.163 --- interpolation
##-- really almost fails to do spline *interpolation* as a limiting case.

In principle that's well known: the smoothing spline solution tends to an interpolation spline as lambda -> 0; however, the formula (algorithm) that is used for smoothing spline fitting (given lambda) is not applicable to the limiting case, lambda=0. I agree that this is a small remaining flaw of an already somewhat sophisticated algorithm for determining the smoothing parameter. I'm sure we could fix the algorithm to work in this case, however to do the fix in such a way that all other cases remain returning the exact same solution as now seems a harder task. And changing the default behavior of smooth.spline(), even if only slightly, is quite problematic. I know it, because I did it many many R versions back...
For your data, I see many possibilities:
- don't use all.knots = TRUE
- really use spline interpolation:  library(splines); ?interpSpline
- use 'df = <something reasonable>' if you want to apply it to many datasets
- don't use 'all.knots = TRUE' or use 'cv=TRUE' or ... :

## What about proper crossvalidation instead of GCV (default),
## which I'd recommend anyway on today's fast computers :
ssTg <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE, cv= TRUE,
                                  control.spar = list(trace = TRUE)))
## *does* work (interestingly)
ssTg  # hmm, also Df = 2055
plot(T ~ Sec, data = d.tmp, pch = ".")
lines(predict(ssTg), col=2)
## don't use all.knots:
ssTg. <- with(d.tmp, smooth.spline(Sec, T, cv= TRUE, control.spar =
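The interpolation alternative mentioned above can be sketched on toy data (synthetic x/y here, since the original data set lives behind the URL above):

```r
library(splines)
x <- seq(0, 10, length.out = 50)
y <- sin(x)
isp <- interpSpline(x, y)    # exact interpolation: no smoothing parameter at all
p <- predict(isp, x)
max(abs(p$y - y))            # essentially 0: the curve passes through every point
```

This sidesteps the lambda -> 0 limit entirely, which is exactly why it is the natural choice when the data really should be interpolated.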
Re: [R] align() function missing in R ?
Hi Markus, You can't assume that a typical R user knows much about S+. R has been beyond S+ for a long time {{ :-) :-) please, Insightful staff, don't start to jump at me !}} Even I, as a very long time S and S-plus user (of the past: 1987--~1997), have never, I think, used align(). Can you give *reproducible examples* of what align() does for you? Then, kind R users will typically show you simple ways to achieve the same. Also: R is Free Software (i.e. open source and more), so we'd be happy to accept offers of an align() function that behaved compatibly with (``or better than'') the S-plus one. Note however that you'd typically not be allowed to copy the S-plus implementation. Martin ML == Markus Loecher [EMAIL PROTECTED] on Thu, 28 Jun 2007 11:10:51 -0400 writes: ML Dear list members, I switched from Splus to R a few ML years ago and so far found no functionality missing. ML However, I am struggling to find the equivalent align() ML function for time series. I did find some reduced ML functionality such as alignDailySeries in ML package:fCalendar but the full capability of aligning ML two time series seems to be missing. Could this be true ML ? I am sure there must be a need for this useful ML function. Any help would be greatly appreciated. ML Thanks ! ML Markus
Re: [R] Boxplot issues
SE == S Ellison [EMAIL PROTECTED] on Fri, 22 Jun 2007 13:02:20 +0100 writes: SE Boxplot and bxp seem to have changed behaviour a bit of late (R 2.4.1). Or maybe I am mis-remembering. SE An annoying feature is that while at=3:6 will work, there is no way of overriding the default xlim of 0.5 to n+0.5. That prevents plotting boxes on, for example, interval scales - a useful thing to do at times. I really can see no good reason for bxp to hard-code the xlim=c(0.5, n+0.5) in the function body; it should be a parameter default conditional on horizontal=, not hard coded. SE Also, boxplot does not drop empty groups. I'm sure it used to. I know it is good to be able to see where a factor level is unpopulated, but it's a nuisance with fractional factorials and some nested or survey problems when many are known to be missing and are of no interest. Irrespective of whether my memory is correct, the option would be useful. How hard can it be to add a 'drop.empty=F' default to boxplot to allow it to switch? SE Obviously, these are things I can fix locally. But who 'owns' boxplot so I can provide suggested code to them for later releases? Legally speaking, I think that's a hard question whose answer may even depend on the country where it is answered. I would like to say it is owned by the R Foundation. Suggested improvements of the R base code should be made and discussed on the R-devel mailing list. That's exactly the main purpose of that list. Such propositions typically make it into the code base if you are convincing and you provide code improvements that convince at least one member of R core that it's worth his time to implement, document, *and* test the changes. Also, as on R-help, it helps to work with small reproducible (ideally cut-n-pastable) R code examples.
Regards, Martin Maechler SE Steve Ellison
Re: [R] Matrix *package*, CHOLMOD error: problem too large
[Jose, if you call the Matrix *package* library once more, ... GR! ..] DM == Duncan Murdoch [EMAIL PROTECTED] on Fri, 22 Jun 2007 14:04:03 -0400 writes: DM On 6/22/2007 1:26 PM, Jose Quesada wrote: I have a pretty large sparse matrix of integers: dim(tasa) [1] 91650 37651 I need to add one to it in order to take logs, but I'm getting the following error: tasa = log(tasa + 1) CHOLMOD error: problem too large Error in asMethod(object) : Cholmod error `problem too large' I have 2 Gb of RAM, and the current workspace is barely 300mb. Is there any workaround to this? Anyone has any experience with this error? DM If tasa is sparse, then tasa+1 will not be sparse, so DM that's likely your problem. [of course] DM You might have better luck with DM log1p(tasa) {very good point, thank you, Duncan!} DM if the authors of the Matrix package have written a DM method for log1p(); if not, you'll probably have to do DM it yourself. They have not yet. Note however that this - and expm1() - would automagically work for sparse matrices if these two functions were part of the Math S4 group generic. I'd say that there's only historical reason for them *not* to be part of Math, and I am likely going to propose to change this. Martin Maechler DM Duncan Murdoch
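In the meantime, one can exploit log1p(0) == 0 and transform only the nonzero entries by hand; a sketch for a dgCMatrix (the small matrix here is a stand-in for 'tasa', and the direct @x-slot manipulation is safe only because the zero pattern is unchanged):

```r
library(Matrix)
tasa <- Matrix(c( 0, 10, 0, 0,
                 20,  0, 0, 30), nrow = 2, sparse = TRUE)  # a small dgCMatrix
ltasa <- tasa
ltasa@x <- log1p(ltasa@x)   # log1p(0) == 0, so zeros stay zeros => still sparse
ltasa[2, 1]                 # log(21) : log1p applied to the nonzero 20
```

This never materializes a dense matrix, which is exactly what made log(tasa + 1) blow up.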
Re: [R] if within a function
HO == Hong Ooi [EMAIL PROTECTED] on Thu, 21 Jun 2007 15:49:42 +1000 writes: HO R doesn't use the 'functionname = result' idiom to return a value from a HO function. It looks like you're after:
HO aaa <- function(a)
HO {
HO   if(a == 1) return(1)
HO   if(a != 1) return(2)
HO }
HO or
HO aaa <- function(a)
HO {
HO   if(a == 1) 1
HO   else 2
HO }
HO see ?return
or to continue the Variations on a theme :
  aaa <- function(a) if(a == 1) 1 else 2
(You don't need { .. } : some people argue you should always use them for defensive programming where I would not use them in simple one liners, but would use them otherwise ) Martin Maechler, ETH Zurich
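If 'a' may be a vector, a vectorized variation is possible too (the name aaa_v is my own; note ifelse() recycles and evaluates elementwise, so it is not a drop-in replacement for if/else on scalars):

```r
aaa_v <- function(a) ifelse(a == 1, 1, 2)
aaa_v(c(1, 3, 1))   # 1 2 1
```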
Re: [R] Could not find lmer function in {Matrix} package
SB == Steve Brady [EMAIL PROTECTED] on Tue, 19 Jun 2007 11:59:15 -0400 writes: SB I am having trouble calling the lmer function in the {Matrix} SB package. I first installed and loaded {Matrix} as follows: install.packages("Matrix"); library(Matrix) SB The package loaded successfully, however when I attempted to call SB lmer, I received the following message: SB Error: could not find function "lmer" SB I also tried: SB ?lmer SB which produced no search results. And who told you lmer() was in the Matrix package? It's in the lme4 package, and --- conceptually --- has always been there. Only for some maintenance convenience (C code shared between lme4 and Matrix) reasons, lmer() has actually been in the Matrix package for some time in the past, however you were always supposed to say require(lme4) or library(lme4) to get to lmer. Regards, Martin SB I have attempted these actions in both MacOSx R Versions 2.4.1 and SB 2.5.0. I have also tried this in Version 2.5.1. beta on a Windows SB machine. SB Thanks in advance for any suggestions. SB Steve
Re: [R] Matrix library error: should never happen; please report
Hi Jose, JQ == Jose Quesada [EMAIL PROTECTED] on Tue, 19 Jun 2007 21:12:53 +0200 writes: JQ Hi, I got the following error. Sorry but this time I JQ couldn't reproduce it with a simple chunk of code: .TM.repl.i.2col(): drop 'matrix' case ... Error in .nextMethod(x = x, i = i, j = j) : 'i' has no integer column number should never happen; please report In addition: Warning messages: 1: Ambiguous method selection for %*%, target ddiMatrix#dgCMatrix (the first of the signatures shown will be used) diagonalMatrix#CsparseMatrix ddenseMatrix#CsparseMatrix in: .findInheritedMethods(classes, fdef, mtable) JQ I got 4 other copies of the same warning. Will play JQ around a bit more... This is really strange. Yes, but - the Matrix library is the file Matrix.so or Matrix.dll which is part of the installed (aka binary) Matrix *package*. Maybe you really need to read the result of fortune("package.*Maechler") # after installing package 'fortunes' - "please report" was not meant to say to report to R-help, but to the package maintainers, - since you cannot reproduce it yet, we cannot do much about it. It may be a bug in the Matrix package (and Jose has told me that he's using the latest released version 0.99875-2), but in theory it could even be your own mistake, namely by wrongly manipulating the slots of a Matrix object. Please try to produce an R script - even if not small -- with a reproducible example; [and then do report to [EMAIL PROTECTED] JQ Thanks -- Jose Quesada, PhD. Best regards, Martin Maechler, ETH Zurich
Re: [R] Odp: Odp: outlying
[Note: CC'ing to R-SIG-robust, the Special Interest Group on using Robust Statistics in R ] PP == Petr PIKAL [EMAIL PROTECTED] on Tue, 19 Jun 2007 12:55:37 +0200 writes: PP [EMAIL PROTECTED] napsal dne 19.06.2007 PP 12:23:58: Hi It often depends on your attitude to limits for outlying observations. Boxplot has some identifying routine for selecting outlying points. Any procedure usually requires somebody to choose which observation is outlying and why. You can use e.g. all values which are beyond some threshold based on sd but that holds only if the distribution is normal. yes, and that's never true for the alternative, i.e. for the case where there *are* outliers.
  set.seed(1)
  x <- rnorm(x)
PP Sorry, it shall be
PP   x <- rnorm(1000)
PP   ul <- mean(x) + 3*sd(x)
PP   ll <- mean(x) - 3*sd(x)
PP   beyond <- (x > ul) | (x < ll)
PP   x[beyond]
PP   [1] 3.810277
Regards Petr No, really, do NOT do the above! It only works with very few and relatively mild outliers. There are much more robust alternatives. I show them for the simple example
  x <- c(1:10, 100)
1) As mentioned by Petr, use instead what boxplot() does, just type boxplot.stats and ``see what to do''. This gives Median +/- 1.5 * IQR : i.e.,
  ## Boxplot's default rule
  str(bp.st <- boxplot.stats(x))
  bp.st$stats[ c(1,5) ]  ## 1 10
2) Use the recommendations of Hampel (1985) @ARTICLE{HamF85, author = Hampel, F., title = The breakdown points of the mean combined with some rejection rules, journal = Technometrics, year = 1985, volume = 27, pages = 95--107, } i.e.
Median +/- 5 * MAD where MAD is the *NON*-scaled MAD, ~= mad(*, constant=1), i.e., in R
  M <- median(x)
  (FH.interval <- M + c(-5, 5) * mad(x, center=M, const=1))  ## -9 21
3) or something slightly more efficient (under approximate normality of the non-outliers), e.g., based on MASS::rlm() :
  n <- length(x)
  s.rm <- summary(robm <- MASS::rlm(x ~ 1))
  s.rm
  (cc <- coef(s.rm))
  ## approximate robust degrees of freedom; this is a hack
  ## which could well be correct
  ## asymptotically {where the weights would be 0/1} :
  (df.resid <- sum(robm$w) - robm$rank)
  (Tcrit <- qt(0.995, df = df.resid))
  ## Std.error of mean ~= sqrt(1/n Var(X_i)) = 1/sqrt(n) sqrt(Var(X_i))
  cc[,1] + c(-1,1) * sqrt(n) * Tcrit * cc[,"Std. Error"]  ## -6.391201 18.555177
--- Martin Maechler, ETH Zurich
Re: [R] Augment 'Matrix' matrices
SH == Scott Hyde [EMAIL PROTECTED] on Mon, 18 Jun 2007 16:59:00 -1000 (HST) writes: SH Martin, How does Matrix implement augmented matrices? I SH tried this and got the expected result: {Replying to R-help, since this question has come up several times }
  V = matrix(1,2,3)
  V = cbind(V,V)
  V
SH      [,1] [,2] [,3] [,4] [,5] [,6]
SH [1,]    1    1    1    1    1    1
SH [2,]    1    1    1    1    1    1
SH But when I did it with Matrix instead I got:
  library(Matrix)
  V = Matrix(1,2,3)
  V = cbind(V,V)
  V
SH   V V
SH [1,] ? ?
cbind() and rbind() cannot work with S4 objects because their first formal argument is '...' [ and with S3 objects they only work insofar as S3 generics can work: i.e. they only work when the first argument is of the respective class, but fail, e.g. for cbind(1, <object>) when <object> is of a non-standard S3 class. ] In earlier versions of Matrix, there was a sophisticated hack that made cbind() and rbind() work. But because it was a hack, and some people called it horrible rather than sophisticated, we had to give it up. {well, the really compelling argument was an example of do.call(rbind, <list of length 1000>) which was *very* inefficient} Instead, cbind2() and rbind2() have been written a few R versions ago to be used as (S4) generic functions. --> help(cbind2) In 'Matrix', we also define cBind() and rBind() to be used as direct (n-argument) substitutes for cbind() or rbind(), respectively. Martin
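So the working equivalent of the failing example, using the two-argument generic for which Matrix provides methods, looks like this:

```r
library(Matrix)
V <- Matrix(1, 2, 3)
W <- cbind2(V, V)   # the 2-argument S4 generic; Matrix defines the methods
dim(W)              # 2 6
```

(In recent R versions, plain cbind() also dispatches to cbind2() for S4 objects, so this workaround is mostly of historical interest; cbind2() itself still works.)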
Re: [R] Responding to a posting in the digest
Thanks a lot, Ted, for your comprehensive answer! [See one short note way below: ] TH == Ted Harding [EMAIL PROTECTED] on Thu, 14 Jun 2007 09:54:04 +0100 (BST) writes: TH On 14-Jun-07 07:26:26, Moshe Olshansky wrote: Is there a convenient way to respond to a particular posting which is a part of the digest? I mean something that will automatically quote the original message, subject, etc. Thank you! Moshe Olshansky [EMAIL PROTECTED] TH This will depend on two things. TH 1. Whether the mail software you use has the capability; TH 2. Whether the digest format would permit it anyway. TH Regarding (2), if you are receiving R-help in TH traditional digest format (all the messages, each with TH its principal headers, as one single long message-body), TH then the only way to respond to a particular message is TH to start to compose a new message and copy what you need TH from the digest. TH While I've never received R-help in digest format TH myself, according to Martin Maechler: TH http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59429.html TH Please open the URL at the end of every message TH https://stat.ethz.ch/mailman/listinfo/r-help go to the TH bottom and log in -- clicking the [Unsubscribe or Edit TH Options] field. You need your mailing list password TH sooner or later. The one you get sent every 1st of the TH month; or you can have it sent to you again. TH Then you are in a page entitled R-help Membership TH Configuration for foo@bar Scroll down to the section TH Your R-help Subscription where the 3rd entry is TH entitled Get MIME or Plain Text Digests? and now you TH want MIME. TH In MIME digest format, each message with its own main TH headers is a separate MIME attachment, and suitable mail TH software can bring any message up on its own. You can TH then reply in the normal way.
TH However (and here is where I'm ignorant as a result of TH never having received R-help as digest), your reply may TH not continue the thread -- since this depends on TH message-identifier headers being present which allow TH threading software to trace which messages are replies TH to which message. The JISCMAIL MIME digest for the TH AllStat mailing list only includes a Message-ID for the TH digest as a whole, i.e. the ID for the entire digest TH message. Message-IDs for the individual messages in the TH digest (as would be seen by people who received them TH singly) are absent: you only get the likes of TH Date: DoW, DD Mon HH:MM:SS TZ From: Sender TH (person who sent the message to the list) Subject: TH Subject of individual message MIME-Version: 1.0 TH Content-Type: text/plain; charset=iso-8859-1 TH Content-Transfer-Encoding: quoted-printable TH and no Message ID for the original message from TH Sender. So any reply to this component message is not TH identifiable as belonging to its thread. TH I don't know whether R-help's 'mailman' provides such TH headers (Martin??). Yes, it does (I've checked with a pseudo-user who receives r-help in digests in MIME format). So you can indeed do the following. In my limited experience, the main problem is the bad quality of people's e-mail software which does not properly work with the (typically invisible) 'References:' and 'In-Reply-To:' headers which mailman indeed does preserve in its MIME-digests. TH If it does, then your reply could TH include an In-Reply-To: which identifies the TH thread. Otherwise it can't. TH As to (1), you will probably get several suggestions for TH suitable mail software. My own (see below) opens an TH AllStat digest in a window with attachment tags TH displayed, one for Table of Contents, one for each message.
Clicking on one of these opens a new window TH with the message attached to that tag displayed, and now TH the usual reply/forward etc mail functions can be TH applied to that message. But it will reply only to the TH address given in the From: header (i.e. the original TH sender, as above), not to the AllStat list (so you have TH to enter that address by hand, if you want to reply to TH the list). TH In principle, mailer software could also identify the TH address of the list from which the digest has been sent, TH as well as the sender of the original message, so you TH could get the option to reply to either or both. But my TH XFMail does not, and only offers the original TH sender. Whether other mailer software can do this is for TH others to comment on! TH Hoping this helps, Ted.
Re: [R] Package update announcements
MHHS == Martin Henry H Stevens [EMAIL PROTECTED] on Mon, 11 Jun 2007 17:23:46 -0400 writes: MHHS Hi Folks, I was wondering what everyone thought about MHHS adding a sentence to each package update announcement MHHS that described what the package did. R extensions are MHHS so numerous that it is difficult to keep up with them. MHHS Would it be appropriate to ask package developers to MHHS add a brief sentence about what the package does, when MHHS they announce updates? Thanks a lot, Hank! I've been supporting your suggestion ever since I created 'R-packages' ( = R package announcements mailing list). As moderator, I even occasionally rejected postings to R-packages exactly by requiring such short information preceding any notice of changes. I think that 99% of the package authors would also agree, but then we are all humans and hence sometimes get into peculiar world views such as "there's my package, and there's R and then the rest of the universe" ;-) Martin Maechler, ETH Zurich MHHS I would benefit from such descriptions. MHHS Cheers, Hank MHHS Dr. Hank Stevens, Assistant Professor 338 Pearson Hall MHHS Botany Department Miami University Oxford, OH 45056 MHHS Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) MHHS 529-4243 http://www.cas.muohio.edu/~stevenmh/ MHHS http://www.muohio.edu/ecology/ MHHS http://www.muohio.edu/botany/ MHHS E Pluribus Unum MHHS __ MHHS R-help@stat.math.ethz.ch mailing list MHHS https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do MHHS read the posting guide MHHS http://www.R-project.org/posting-guide.html and MHHS provide commented, minimal, self-contained, MHHS reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pnorm how to decide lower-tail true or false
CM == Carmen Meier [EMAIL PROTECTED] on Fri, 08 Jun 2007 19:31:49 +0200 writes: CM Hi to all, maybe the last question was not clear enough. CM I did not find any hints how to decide whether it CM should use lower.tail or not. As it is an extra CM R-feature ( written in CM http://finzi.psych.upenn.edu/R/Rhelp02a/archive/66250.html CM ) I do not find anything about it in any of my statistical CM books. Yes, most statistical books do not consider numerical accuracy, which is the real issue here. Note that R is much more than a statistical package and hence, to be appreciated properly, needs much broader (applied) mathematical, statistical and computer science knowledge ;-) When p ~= 1, '1 - p' suffers from so-called cancellation (Numerical Analysis 101). If you already know that you will use q := 1 - p, rather compute 'q' directly than first compute p, then 1 - p, losing all accuracy. All of R's pfoo(..) functions have an argument 'lower.tail' which is TRUE by default, since after all, pfoo(x) = Prob_{foo}[X <= x] measures the probability of the lower or left tail of the foo-distribution. foo = norm is just a special case. If you really want q = 1 - pfoo(x) = Prob_{foo}[X > x] then you can get this directly via q <- pfoo(x, lower.tail = FALSE, ...) Simple example with R:

> pnorm(10)
[1] 1
> 1 - pnorm(10)
[1] 0
> pnorm(10, lower.tail=FALSE)
[1] 7.619853e-24

Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Why is the R mailing list so hard to figure out?
VE == Vladimir Eremeev [EMAIL PROTECTED] on Tue, 5 Jun 2007 01:39:47 -0700 (PDT) writes: VE irishhacker wrote: Why does the R mailing list need such an unusual and customized user interface? Well, that's VERY MUCH a matter of point of view, and the view is most probably very highly dependent on your year of birth. For many old timers like me, e-mail is at least as natural as web browsing. I see that such an attitude becomes more and more outdated. VE There was a discussion of this some time ago on the VE list. I believe, RSiteSearch(r-help mailing list VE forum) or some other similar keywords will find it. VE irishhacker wrote: What's the best way to view and read discussions in this group for recent days? Can I view the postings for the current day via Google Groups? Well, one way is still *subscribing* to R-help and occasionally helping each other, instead of just perusing it as a resource ... GMANE was mentioned already. VE I use the Nabble interface VE (http://www.nabble.com/R-f13819.html), it seems more VE convenient to me, but it collects less lists than Gmane. Yes, but it appends its silly footer (aka advertizement) to all mails sent out, luring people to use it. I (as mailing list manager) have considered more than once to just evaporate such footers as I evaporate most other spam-footers (as e.g. from hotmail, yahoo, gmx, ...). Martin Maechler, ETH Zurich VE -- View this message in context: VE http://www.nabble.com/Why-is-the-R-mailing-list-so-hard-to-figure-out--tf3868661.html#a10965539 VE Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] opinions please: text editors and reporting/Sweave?
Jared == Jared O'Connell [EMAIL PROTECTED] on Thu, 31 May 2007 11:28:11 +0800 writes: Jared Winshell (http://www.winshell.de/) is another (free) option if you want a Jared Windows editor with good MikTEX integration. Looks like it. Note however that the above free is only as in free beer not as in free speech. In other words, Winshell is *not* 'Free Software' / 'Software Libre' nor is it Open Source Software. Tinn-R, as R itself, *is* Free Software (and Emacs and ESS and (La)TeX are too). Martin Maechler, ETH Zurich Jared On 5/31/07, Duncan Murdoch [EMAIL PROTECTED] wrote: Tim Howard wrote: dear all - I currently use Tinn-R as my text editor to work with code that I submit to R, with some output dumped to text files, some images dumped to pdf. (system: Windows 2K and XP, R 2.4.1 and R 2.5). We are using R for overnight runs to create large output data files for GIS, but then I need simple output reports for analysis results for each separate data set. Thus, I create many reports of the same style, but just based on different input data. I am recognizing that I need a better reporting system, so that I can create clean reports for each separate R run. This obviously means using Sweave and some implementation of LaTex, both of which are new to me. I've installed MikTex and successfully completed a demo or two for creating pdfs from raw LaTeX. It appears that if I want to ease my entry into the world of LaTeX, I might need to switch editors to something like Emacs (I read somewhere that Emacs helps with the TeX markup?). After quite a while wallowing at the Emacs site, I am finding that ESS is well integrated with R and might be the way to go. Aaaagh... I'm in way over my head! If you are used to Windows, you might find the shareware editors WinEdt or Textpad more familiar. WinEdt has advantages of lots of LaTeX integration. 
Duncan Murdoch My questions: What, in your opinion, is the simplest way to integrate text and graphics reports into a single report such as a pdf file. If the answer to this is LaTeX and Sweave, is it difficult to use a text editor such as Tinn-R or would you strongly recommend I leave behind Tinn and move over to an editor that has more LaTeX help? In reading over Friedrich Leisch's Sweave User Manual (v 1.6.0) I am beginning to think I can do everything I need with my simple editor. Before spending many hours going down that path, I thought it prudent to ask the R community. It is likely I am misunderstanding some of the process here and any clarifications are welcome. Thank you in advance for any thoughts. Tim Howard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Jared [[alternative HTML version deleted]] Jared __ Jared R-help@stat.math.ethz.ch mailing list Jared https://stat.ethz.ch/mailman/listinfo/r-help Jared PLEASE do read the posting guide http://www.R-project.org/posting-guide.html Jared and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to check for existence url from within a function?
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Sat, 26 May 2007 08:02:11 -0400 writes: Duncan On 26/05/2007 7:13 AM, Heinz Tuechler wrote: Dear All, To check if an url exists, I can use try(). This works, as I expected, if I do it directly, as in the first part of the following example, but I could [..] Duncan .Last.value isn't set until your function returns. You should write this as

Duncan con.url <- try(url(url.string, open = 'rb'))
Duncan try.error <- inherits(con.url, "try-error")

Duncan Notice that I used inherits, rather than testing for equality. It's Duncan documented that the result of try() will be of class 'try-error' if an Duncan error occurs, but there may be circumstances (in the future?) where Duncan different types of errors are signalled by using a more complicated class. There's an additional reason for inherits(., .) and hence against class(foo) == "bar" : Whenever try() does not catch an error, since there was none, i.e., the result of try(foobar(..)) is just foobar(..), foobar(..) may well return an object with an (S3) class vector of length > 1. In those cases, the equality test returns a logical vector, typically c(FALSE, FALSE), and using that in if(.) gives at least a warning. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
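As a minimal sketch of the pitfall Martin describes (the function and class names here are invented for illustration): when try() catches no error, the value it returns may carry an S3 class vector of length greater than one, so an equality test yields a multi-element logical that misbehaves inside if().

```r
## hypothetical function returning an object with a 2-element class vector
foobar <- function() structure(list(), class = c("myResult", "list"))

res <- try(foobar(), silent = TRUE)

class(res) == "try-error"   # c(FALSE, FALSE): length-2 logical, unsafe in if()
inherits(res, "try-error")  # FALSE: always a single logical value
```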
Re: [R] ratio distribution - missing attachment
Thank you, Ravi. You probably will have noticed that the attachment didn't make it to the mailing list. The reason is that we let the mailing list software strip binary attachments which can easily be misused to spread viruses; see --> http://www.r-project.org/mail.html (search "attachment") or the posting-guide. OTOH, the software allows attachments with MIME type text/plain. If you use an e-mail software for sophisticated users, the software allows you to specify the MIME type of your attachments; otherwise (as with most user friendly, modern e-mail software), attach a *.txt file (Doug Bates uses foo_R.txt) and it should make it to the lists; as a third alternative, just cut & paste the corresponding text into your e-mail. I think your R function should make it to R-help (and its archives), so I'd be thankful for a repost. Martin Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 25 May 2007 14:24:20 -0400 writes: Ravi Mike, Attached is an R function to do this, along with Ravi an example that will reproduce the MathCad plot shown Ravi in your attached paper. I haven't checked it Ravi thoroughly, but it seems to reproduce the MathCad Ravi example well. Ravi Ravi. Ravi Ravi --- Ravi Ravi Varadhan, Ph.D. Ravi Assistant Professor, The Center on Aging and Health Ravi Division of Geriatric Medicine and Gerontology Ravi Johns Hopkins University Ravi Ph: (410) 502-2619 Ravi Fax: (410) 614-9625 Ravi Email: [EMAIL PROTECTED] Ravi Webpage: Ravi http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html Ravi Ravi Ravi -Original Message- From: Ravi [EMAIL PROTECTED] Ravi [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Mike Lawrence Sent: Friday, May 25, 2007 1:55 PM To: Ravi Lucke, Joseph F Cc: Rhelp Subject: Re: [R] Calculation Ravi of ratio distribution properties Ravi According to the paper I cited, there is controversy Ravi over the sufficiency of Hinkley's solution, hence Ravi their proposed more complete solution. 
Ravi On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote: The exact ratio is given in On the Ratio of Two Correlated Normal Random Variables, D. V. Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639. Ravi -- Mike Lawrence Graduate Student, Department of Ravi Psychology, Dalhousie University Ravi Website: http://myweb.dal.ca/mc973993 Public calendar: Ravi http://icalx.com/public/informavore/Public Ravi The road to wisdom? Well, it's plain and simple to Ravi express: Err and err and err again, but less and less Ravi and less. - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] File path expansion
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Fri, 25 May 2007 15:38:54 -0400 writes: Duncan On 5/25/2007 1:09 PM, Prof Brian Ripley wrote: On Fri, 25 May 2007, Martin Maechler wrote:

> path.expand("~")
[1] "/home/maechler"

Yes, but beware that may not do what you want on Windows in R >= 2.5.0, since someone changed the definition of 'home' but not path.expand. Duncan A more basic problem is that the definition of ~ Duncan in Windows is very ambiguous. Is it my Cygwin home Duncan directory, where cd ~ would take me while in Duncan Cygwin? most probably not (see below). The normal R Windows users needn't know about Cygwin. Duncan Is it my Windows CSIDL_PERSONAL folder, Duncan usually %HOMEDRIVE%/%HOMEPATH%/My Documents? Is it Duncan the parent of that folder, %HOMEDRIVE%/%HOMEPATH%? Duncan ~ is a shell concept that makes sense in Unix-like Duncan shells, but not in Windows. Hmm.. Let's just say "~" is a shortcut for "the user's home directory". And yes, that's been a Unix concept for ages, but I think Windows has adopted that concept, probably with the above %HOMEDRIVE%/%HOMEPATH% The fact that some Windows software may not work with the user's home directory (but rather a subdirectory of that) may be a separate issue, but then, I'm the windows-non-expert Martin Maechler __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests [Broadcast]
LuckeJF == Lucke, Joseph F [EMAIL PROTECTED] on Fri, 25 May 2007 12:29:49 -0500 writes: LuckeJF Most standard tests, such as t-tests and ANOVA, LuckeJF are fairly resistant to non-normality for LuckeJF significance testing. It's the sample means that LuckeJF have to be normal, not the data. The CLT kicks in LuckeJF fairly quickly. Even though such statements appear in too many (text)books, that's just plain wrong practically: Even though the *level* of the t-test is resistant to non-normality, the power is not at all!! And that makes the t-test NON-robust! It's an easy exercise to see that the T-statistic tends to 1 when one observation goes to infinity, i.e., the t-test will never reject when you have one extreme outlier; simple proof with R:

> t.test(11:20)

        One Sample t-test

data:  c(11:20)
t = 16.1892, df = 9, p-value = 5.805e-08
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 13.33415 17.66585
sample estimates:
mean of x
     15.5

## --> unknown mean highly significantly different from 0
## But

> t.test(c(11:20, 1000))

        One Sample t-test

data:  c(11:20, 1000)
t = 1.1731, df = 10, p-value = 0.2679
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -94.42776 304.42776
sample estimates:
mean of x
      105

LuckeJF Testing for normality prior to choosing a test LuckeJF statistic is generally not a good idea. Definitely. Or even: It's a very bad idea ... Martin Maechler, ETH Zurich LuckeJF -Original Message- From: LuckeJF [EMAIL PROTECTED] LuckeJF [mailto:[EMAIL PROTECTED] On Behalf LuckeJF Of Liaw, Andy Sent: Friday, May 25, 2007 12:04 PM LuckeJF To: [EMAIL PROTECTED]; Frank E Harrell Jr Cc: LuckeJF r-help Subject: Re: [R] normality tests [Broadcast] LuckeJF From: [EMAIL PROTECTED] On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Hi all, apologies for seeking advice on a general stats question. 
I ve run normality tests using 8 different methods: - Lilliefors - Shapiro-Wilk - Robust Jarque Bera - Jarque Bera - Anderson-Darling - Pearson chi-square - Cramer-von Mises - Shapiro-Francia All show that the null hypothesis that the data come from a normal distro cannot be rejected. Great. However, I don't think it looks nice to report the values of 8 different tests on a report. One note is that my sample size is really tiny (less than 20 independent cases).Without wanting to start a flame war, are there any advices of which one/ones would be more appropriate and should be reported (along with a Q-Q plot). Thank you. Regards, Wow - I have so many concerns with that approach that it's hard to know where to begin. But first of all, why care about normality? Why not use distribution-free methods? You should examine the power of the tests for n=20. You'll probably find it's not good enough to reach a reliable conclusion. And wouldn't it be even worse if I used non-parametric tests? LuckeJF I believe what Frank meant was that it's probably LuckeJF better to use a distribution-free procedure to do LuckeJF the real test of interest (if there is one) instead LuckeJF of testing for normality, and then use a test that LuckeJF assumes normality. LuckeJF I guess the question is, what exactly do you want LuckeJF to do with the outcome of the normality tests? If LuckeJF those are going to be used as basis for deciding LuckeJF which test(s) to do next, then I concur with LuckeJF Frank's reservation. LuckeJF Generally speaking, I do not find goodness-of-fit LuckeJF for distributions very useful, mostly for the LuckeJF reason that failure to reject the null is no LuckeJF evidence in favor of the null. It's difficult for LuckeJF me to imagine why there's insufficient evidence to LuckeJF show that the data did not come from a normal LuckeJF distribution would be interesting. 
LuckeJF Andy Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University -- yianni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
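To make the point about distribution-free procedures concrete, here is a small sketch (data invented, reusing Martin's outlier example from earlier in the thread): one extreme outlier masks the t-test, while the rank-based Wilcoxon signed-rank test still rejects.

```r
x <- c(11:20, 1000)       # ten moderate values plus one huge outlier

t.test(x)$p.value         # ~ 0.27 : t-test cannot reject mu = 0
wilcox.test(x)$p.value    # ~ 0.001: Wilcoxon signed-rank test clearly rejects
```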
Re: [R] Make check failure for R-2.4.1
Adam == Adam Witney [EMAIL PROTECTED] on Fri, 25 May 2007 09:38:29 +0100 writes: Adam Thanks for your replies Details inline below: Adam On 24/5/07 17:12, Martin Maechler [EMAIL PROTECTED] wrote: UweL == Uwe Ligges [EMAIL PROTECTED] on Thu, 24 May 2007 17:34:16 +0200 writes: UweL Some of these test failures are expected from time to time, since they are using UweL random numbers. Just re-run. eehm, some of these, yes, but not the ones Adam mentioned, d-p-q-r-tests.R. Adam, if you want more info you should report to us the *end* (last dozen of lines) of your d-p-q-r-tests.Rout[.fail] file. Adam Ok, here they are...

[1] TRUE TRUE TRUE TRUE
> ##-- non central Chi^2 :
> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted

Ok, thanks; so, if we want to learn more, we need the output of something like

xB <- c(2000, 1e6, 1e50, Inf)
for(df in c(0.1, 1, 10))
  for(ncp in c(0, 1, 10, 100))
    print(pchisq(xB, df=df, ncp=ncp), digits == 15)

UweL BTW: We do have R-2.5.0 these days. Indeed! And gcc 2.95.4 is also very old. Maybe you've recovered an old compiler / math-library bug from that antique compiler suite ? Adam Yes, maybe I should start thinking about upgrading this box! yes, at least start ... ;-) Adam Thanks again Adam adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
Adam == Adam Witney [EMAIL PROTECTED] on Fri, 25 May 2007 14:48:18 +0100 writes:

> ##-- non central Chi^2 :
> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted

Ok, thanks; so, if we want to learn more, we need the output of something like

xB <- c(2000, 1e6, 1e50, Inf)
for(df in c(0.1, 1, 10))
  for(ncp in c(0, 1, 10, 100))
    print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Adam Here are the results:

> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100))
+     print(pchisq(xB, df = df, ncp = ncp), digits == 15)
Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
        object "digits" not found

well, that's a typo - I think - you should have been able to fix (I said "something like ..."). Just replace the '==' by '=' Martin Adam Thanks again... Adam adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
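For completeness, the corrected snippet (with '=' rather than the typo '==') that Martin was asking Adam to run looks like this; on a correctly built R it prints values at or extremely close to 1 at 15 digits:

```r
xB <- c(2000, 1e6, 1e50, Inf)       # very large quantiles
for(df in c(0.1, 1, 10))
    for(ncp in c(0, 1, 10, 100))
        print(pchisq(xB, df = df, ncp = ncp), digits = 15)
```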
Re: [R] File path expansion
> path.expand("~")
[1] "/home/maechler"

RobMcG == McGehee, Robert [EMAIL PROTECTED] on Fri, 25 May 2007 11:44:27 -0400 writes: RobMcG R-Help, RobMcG I discovered a mis-feature in ghostscript, which is used by the bitmap RobMcG function. It seems that specifying file names in the form ~/abc.png RobMcG rather than /home/directory/abc.png causes my GS to crash when I open RobMcG the bitmap device on my Linux box. RobMcG The easiest solution would seem to be to intercept any file names in the RobMcG form ~/abc.png and replace the ~ with the user's home directory. I'm RobMcG sure I could come up with something involving regular expressions and RobMcG system calls to do this in Linux, but even that might not be system RobMcG independent. So, I wanted to see if anyone knew of a native R solution RobMcG of converting ~ to its full path expansion. RobMcG Thanks, RobMcG Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
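A sketch of the native-R workaround for Robert's ghostscript problem (the file name and device type below are just examples): expand the tilde with path.expand() before handing the path to bitmap().

```r
fname <- path.expand("~/abc.png")   # e.g. "/home/robert/abc.png" on Linux
fname                                # no leading "~" left for ghostscript

## then open the device with the expanded name, e.g. (needs ghostscript):
## bitmap(fname, type = "png16m"); plot(1:10); dev.off()
```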
Re: [R] windows to unix
Erin == Erin Hodgess [EMAIL PROTECTED] on Fri, 25 May 2007 06:10:10 -0500 writes: Erin Dear R People: Erin Is there any way to take a Windows version of R, compiled from source, Erin compress it, and put it on a Unix-like environment, please? Since nobody has answered yet, let a die-hard non-windows user try : Just 'zip' the corresponding directory and copy the zip file to your unix-like environment. I assume the only things this does not contain would be the

- registry entries (which used to be optional anyway; I'm not sure if that's still true)
- desktop links to R
- startup menu links to R

but the last two can easily be recreated after people copy the zip file and unpack it in their windows environment -- which I assume is the purpose of the whole procedure.. {Please reply to R-help, not me; I am *the* windows-non-expert ..} Martin Erin thanks in advance, Erin Sincerely, Erin Erin Hodgess Erin Associate Professor Erin Department of Computer and Mathematical Sciences Erin University of Houston - Downtown Erin mailto: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
UweL == Uwe Ligges [EMAIL PROTECTED] on Thu, 24 May 2007 17:34:16 +0200 writes: UweL Some of these test failures are expected from time to time, since they are using UweL random numbers. Just re-run. eehm, some of these, yes, but not the ones Adam mentioned, d-p-q-r-tests.R. Adam, if you want more info you should report to us the *end* (last dozen of lines) of your d-p-q-r-tests.Rout[.fail] file. UweL BTW: We do have R-2.5.0 these days. Indeed! And gcc 2.95.4 is also very old. Maybe you've recovered an old compiler / math-library bug from that antique compiler suite ? Martin UweL Uwe Ligges UweL Adam Witney wrote: I'm trying to install R-2.4.1, everything configure's and make's OK, but the make check fails: running code in 'd-p-q-r-tests.R' ...make[3]: *** [d-p-q-r-tests.Rout] Error 1 make[3]: Leaving directory `/usr/local/install/R-2.4.1/tests' make[2]: *** [test-Specific] Error 2 make[2]: Leaving directory `/usr/local/install/R-2.4.1/tests' make[1]: *** [test-all-basics] Error 1 make[1]: Leaving directory `/usr/local/install/R-2.4.1/tests' make: *** [check] Error 2 This is Debian, gcc 2.95.4. My previous version R-2.1.0 installed ok. Any idea why this is failing? I have googled the errors, but couldn't find any resolutions Thanks for any help Adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reshape a sparse matrix
Scott == Scott Hyde [EMAIL PROTECTED] on Tue, 15 May 2007 17:03:13 -1000 (HST) writes: Scott Hi, Scott I'd like to reshape a sparse matrix generated from the Matrix package. I can't seem to do it with the command

Scott dim(A) <- c(6, 9)

Scott which works perfectly with the base package matrices, but with the sparse matrices it errors with

Scott Error in dim(A) <- c(6, 9) : dim<- : invalid first argument

This *does* work in the current version of Matrix (0.99875-1), actually already in version 0.99875-0 . In the next version of Matrix, it will not only work, but also work sparsely internally via the new class sparseVector and its daughter classes, on which I've been working during the last 10 days or so... Interesting that you bring the topic up right now ... Scott Manipulating the Dim attribute of the sparse Matrix does not produce the desired effect.

Scott A@Dim <- c(as.integer(9), as.integer(6))

Scott does not produce a column ordering result, which I am assuming is because the data is stored in a row (i) and column (j) format instead (class dgTMatrix) You should not manipulate slots of S4 classes in general. Some people say that you should not even access them directly. Scott Does a function for this exist? yes, as I said above, dim(.) <- .. works in the newest versions of Matrix. Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
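For illustration, a small sketch of the reshape that Martin says works from Matrix 0.99875-0 onwards (dimensions chosen arbitrarily; assumes the Matrix package is installed):

```r
library(Matrix)

A <- Matrix(0, 9, 6, sparse = TRUE)  # a 9 x 6 sparse matrix
A[1, 1] <- 1

dim(A) <- c(6, 9)   # reshape in place, column-major as for base matrices
dim(A)              # 6 9
```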
Re: [R] S-plus coding
Rolf == Rolf Turner [EMAIL PROTECTED] on Fri, 4 May 2007 08:02:21 -0300 (ADT) writes: Rolf T. Kounouni wrote: Hi, how can i use data to forecast next time period value, if data has been influenced by a change in legislation? thank you. Rolf Well, you could use chicken entrails. Rolf cheers, Rolf Rolf Turner Rolf [EMAIL PROTECTED] Rolf P. S. What has this question got to do with ``S-plus coding''? Yes, indeed. And even then, what would S-plus coding have to do with R-help? Please Mr. Kounouni, do read the posting guide *before* *any* further posting to R-help! Rolf PLEASE do read the posting guide Rolf http://www.R-project.org/posting-guide.html ^^^ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] looking for equivalent of matlab's medfilt1 function
Vladimir == Vladimir Eremeev [EMAIL PROTECTED] on Mon, 7 May 2007 07:58:48 -0700 (PDT) writes: Vladimir Dear all, Vladimir I have several files with Matlab code, which I am translating to R. Vladimir For the zero-level approach, I took the very old Vladimir shell script from R-help archives, which has done Vladimir some obvious leg-work such as replacement of '=' Vladimir with '<-'. Vladimir Now I am translating indexing, matrix operations and function calls using Vladimir this table Vladimir http://37mm.no/mpy/octave-r.html You should also look at the 'matlab' package which defines quite a few R functions, such as eye(), zeros(), repmat(), to behave as the Matlab functions do. Vladimir The problem is, I cannot find the R equivalent of the matlab's function Vladimir medfilt1, performing 1-dimensional median filtering of a vector. Its summary Vladimir is here http://www-ccs.ucsd.edu/matlab/toolbox/signal/medfilt1.html To statisticians, this has been known as running medians, thanks to John Tukey. The smooth() function contains smart variations of running median of 3, which seems to be the matlab default. For 'k > 3', I'd recommend the fast runmed(x, k) function which also has a bit more sophisticated end-point handling than Matlab's medfilt1() seems to provide. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
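A small sketch of the two base-R alternatives Martin mentions (data invented for demonstration):

```r
x <- c(1, 5, 2, 8, 3, 9, 4, 20, 6)

runmed(x, k = 3)   # fast running median of 3; roughly medfilt1(x, 3)
runmed(x, k = 5)   # wider window, with Tukey-style end-point rules
smooth(x)          # smarter variations of running median of 3 (3RS3R etc.)
```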
Re: [R] Issue with the Matrix package
Tony == Tony Chiang [EMAIL PROTECTED] on Fri, 4 May 2007 00:07:04 +0100 writes: Tony Hi all, Tony I am wondering if this is a bug in the Matrix package Tony or if it something that I am just getting wrong... Tony here is an example: [..] It's a bug. A shorter example - w/o dimnames - and showing what the semantic really is for traditional matrices :

> a <- matrix(0, 4, 4); a[c(1,2,1), 2] <- 1:3; a
     [,1] [,2] [,3] [,4]
[1,]    0    3    0    0
[2,]    0    2    0    0
[3,]    0    0    0    0
[4,]    0    0    0    0
> A <- Matrix(0, 4, 4); A[c(1,2,1), 2] <- 1:3; A
4 x 4 sparse Matrix of class "dgCMatrix"
[1,] . 4 . .
[2,] . 2 . .
[3,] . . . .
[4,] . . . .

so we see that multiple assignments are supposed to happen consecutively such that the last wins. Tony The documentation reads: Tony Most of the time, the function works via a traditional (_full_) Tony 'matrix'. However, 'Matrix(0, nrow, ncol)' directly constructs an Tony empty sparseMatrix, as does 'Matrix(FALSE, *)'. Tony So is this where an exception comes in? Tony and if so can someone explain to me why we get the 2? Tony It would seem that it should just reassign the 1 to a Tony 1 not add the number of times it is assigning a 1. If you are interested: Things go via TsparseMatrix, i.e. the triplet representation, and there the convention is the following: an index pair (i_0, j_0) can appear more than once. If it does, it *means* that all the 'x' values are summed up. ?dgTMatrix-class does explain that, too. 
Here's an example - using the auxiliary function I had posted a while ago on R-help:

spMatrix <- function(nrow, ncol, i, j, x)
{
    ## Purpose: User friendly construction of sparse Matrix from triplet
    ## ------------------------------------------------------------------
    ## Arguments: (i,j,x): 2 integer and 1 numeric vector of the same length:
    ##
    ##   The matrix M will have
    ##     M[i[k], j[k]] == x[k] ,  for k = 1,2,..., length(i)
    ##   and M[i', j'] == 0  for all other pairs (i',j')
    ## ------------------------------------------------------------------
    ## Author: Martin Maechler, Date: 8 Jan 2007, 18:46
    dim <- c(as.integer(nrow), as.integer(ncol))
    ## The conformability of (i,j,x) with itself and with 'dim'
    ## will be checked automatically by an internal validObject()
    ## which is part of new(.):
    new("dgTMatrix", x = as.double(x), Dim = dim,
        ## our Tsparse Matrices use 0-based indices :
        i = as.integer(i - 1:1), j = as.integer(j - 1:1))
}

and now use this lower-level construction of Tsparse Matrices:

> (A <- spMatrix(3, 4, i = c(1:3, 1:2), j = c(2:4, 3:4), x = 1:5))
3 x 4 sparse Matrix of class "dgTMatrix"
[1,] . 1 4 .
[2,] . . 2 5
[3,] . . . 3
## ok
> (B <- spMatrix(3, 4, i = c(1:3, 1:2), j = c(2:4, 2:3), x = 1:5))
3 x 4 sparse Matrix of class "dgTMatrix"
[1,] . 5 . .
[2,] . . 7 .
[3,] . . . 3
## oops!
> str(B)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:5] 0 1 2 0 1
  ..@ j       : int [1:5] 1 2 3 1 2
  ..@ Dim     : int [1:2] 3 4
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:5] 1 2 3 4 5
  ..@ factors : list()

## which shows that you have 5 entries in B, where those with an equal
## index pair are summed -- the convention mentioned above.

Thank you for the report, Tony. This will be fixed in the next release of Matrix. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Analysis for Binary time series
Megh == Megh Dal [EMAIL PROTECTED] on Fri, 4 May 2007 00:12:25 -0700 (PDT) writes: Megh hi, good morning everyone. I have a time series with Megh binary outputs like: Megh 000100100.etc. Now I want to Megh forecast the future values of that. Can anyone please Megh tell me whether there are any tools in the literature Megh for dealing with this kind of binary observation? If Megh possible please provide me some good references on the net Megh as well. Since you're asking on R-help: The R package VLMC (variable length markov chains) works with such finite-alphabet time series and has been applied to such data successfully in the past. Do

    install.packages("VLMC")
    library(VLMC)
    ?vlmc

It contains the following references: Buhlmann P. and Wyner A. (1998) Variable Length Markov Chains. Annals of Statistics 27, 480--513. Mächler M. and Bühlmann P. (2004) Variable Length Markov Chains: Methodology, Computing, and Software. J. Computational and Graphical Statistics 2, 435--455. -- Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
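A minimal sketch of the suggestion above, assuming the VLMC package is installed; the simulated 0/1 series and the default tuning are my own choices, not from the post:

```r
## install.packages("VLMC")   # once, if not yet installed
library(VLMC)

set.seed(42)
x <- as.integer(runif(500) > 0.5)   # toy binary 0/1 series (made up)

fit <- vlmc(x)   # fit a variable length Markov chain to the series
fit              # prints a short summary of the fitted model
```

For a real application one would of course use the observed series instead of `x`, and look at `?vlmc` for the cutoff/tuning arguments.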
Re: [R] R Wiki down?
Talbot == Talbot Katz [EMAIL PROTECTED] on Thu, 03 May 2007 12:35:27 -0400 writes: Talbot Hi. I can't access the site Talbot http://wiki.r-project.org/. I didn't find any Talbot notice about this on http://www.r-project.org/. Talbot Does anyone have any more information about the R Talbot Wiki status? Thanks! Yes, it is currently down. Thanks for the note. Its maintainer has hereby (CC) been notified as well. Regards, Martin Maechler ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Write text in the
Matt == Matthew Neilson [EMAIL PROTECTED] on Fri, 27 Apr 2007 15:54:20 +0100 writes: Matt Hey Felix, Matt So basically what you want is a figure containing a block of four plots, with a main title for the figure? If that's the case then something like this should work:

    Matt # BEGIN CODE #
    Matt par(oma = c(0,0,1,0), mfrow = c(2,2))
    Matt for(i in 1:4){
    Matt   plot(NA, xlim = range(0,10), ylim = range(-5,5))
    Matt   title(paste("Plot ", i, sep = ""))
    Matt }
    Matt par(mfrow = c(1,1), oma = c(0,0,1,0))
    Matt mtext("Main Title", 3, outer = TRUE, cex = par("cex.main"))
    Matt # END CODE #

Matt You can play about with the margins, but I think that's the general idea. Is this what you're after? Yes, and since this is so often desired, with our sfsmisc package you can simply say

    sfsmisc::mult.fig(4, main = "Main Title")
    for(i in 1:4){
      plot(NA, xlim = range(0,10), ylim = range(-5,5))
      title(paste("Plot ", i, sep = ""))
    }

If you're a good R-citizen, you will want to be able to reset the graphics parameters, which would extend the above to

    op <- sfsmisc::mult.fig(4, main = "Main Title")$old.par
    for(i in 1:4){
      plot(NA, xlim = range(0,10), ylim = range(-5,5))
      title(paste("Plot ", i, sep = ""))
    }
    par(op)

-- Martin Maechler, ETH Zurich Matt On Fri Apr 27 15:34, Felix Wave [EMAIL PROTECTED] sent: Hello, I started a graphic device: par(oma = c(2,0,0,0), mfrow = c(2,2)); in the cols and rows are three images. Now I want to write a text in the device region; it's the main title of the graphic device. But when I use mtext() I can only write in the figure region of my four plots. Has anybody an idea? Thanks a lot.
Felix My R Code: --------------------------------

    par(oma = c(2,0,0,0), mfrow = c(2,2))
    mtext("Main title", side = 3, line = 0)
    image(zDIV)
    image(zMEDIAN)
    image(zMEAN)
    image(zSD)
    dev.off()

graphic: [ASCII sketch omitted: "MAIN TITLE" across the top of the device region, above a 2 x 2 grid of figure regions] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Drawing Tangent
Arun == Arun Kumar Saha [EMAIL PROTECTED] on Thu, 26 Apr 2007 23:44:03 +0530 writes: Arun Dear all R-users, Arun I would like to draw a tangent of a given function at a particular (given) Arun point. However, the straight line representing it should not cut any axis; it Arun should be a small line. Can anyone tell me how to do this? You will eventually call segments() to draw that short line. par("usr") {and maybe e.g. par("pin")} will probably be relevant when determining how long the line segment(s) should be. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pacf
tom == tom soyer [EMAIL PROTECTED] on Sat, 28 Apr 2007 08:15:39 -0500 writes: tom I wanted to understand exactly how acf and pacf work, tom so I tried to calculate ac and pac manually. For ac, I tom used the standard acf formula:

    acf(k) = sum((X(t) - Xbar) * (X(t-k) - Xbar)) / sum((X(t) - Xbar)^2)

But for tom pac, I could not figure out how to calculate it by tom hand. I understand that in both R and EVIEWS, it is tom done using the Durbin-Levinson algorithm by the tom computer. However, I don't understand exactly how the tom algorithm works just by looking at the algorithm. Does tom anyone know if there is a shortcut to calculate pac by tom hand (or in a spreadsheet), or is it too complex a tom procedure, such that a computer is absolutely necessary? It tom seems that there should be a natural relationship tom between ac and pac so that once ac is calculated, pac tom can be easily calculated based on ac. easily, yes, by the Durbin-Levinson algorithm ;-) (is this a homework problem?) Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
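For the curious: the Durbin-Levinson recursion is short enough to do "by hand" in a few lines of R. This is a sketch -- the function name and the comparison series are my own, not from the thread -- and it should agree with R's built-in pacf() up to rounding:

```r
## Partial autocorrelations from autocorrelations via Durbin-Levinson.
## rho[k] is the acf at lag k (lag 0, which equals 1, is excluded).
pacfFromAcf <- function(rho) {
  p <- length(rho)
  phi <- matrix(0, p, p)
  phi[1, 1] <- rho[1]
  if (p > 1) for (k in 2:p) {
    num <- rho[k] - sum(phi[k - 1, 1:(k - 1)] * rho[(k - 1):1])
    den <- 1 - sum(phi[k - 1, 1:(k - 1)] * rho[1:(k - 1)])
    phi[k, k] <- num / den                    # the lag-k partial acf
    for (j in 1:(k - 1))                      # update the lower-order coefs
      phi[k, j] <- phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
  }
  diag(phi)    # phi[k,k], k = 1..p
}

## Check against R's own pacf() on a built-in series:
x <- lynx
r <- acf(x, lag.max = 5, plot = FALSE)$acf[-1]   # sample acf, lags 1..5
cbind(durbin.levinson = pacfFromAcf(r),
      pacf.builtin    = drop(pacf(x, lag.max = 5, plot = FALSE)$acf))
```

So the "natural relationship between ac and pac" the poster suspected does exist; it just runs through this recursion rather than a closed-form formula.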
Re: [R] digest readability/ thunderbird email client/ R-help
Hi Sam, SamMc == Sam McClatchie [EMAIL PROTECTED] on Fri, 27 Apr 2007 11:21:56 -0700 writes: SamMc System: Linux kernel 2.6.15, Ubuntu dapper ... SamMc Has anyone figured out how to make the R-help digest SamMc more easily readable in the Thunderbird mail client? SamMc I think the digest used to be more readable than is SamMc currently the case, with all of the messages as SamMc attachments. SamMc I know the work-around is to get individual messages SamMc rather than the digest, use another mail client, or SamMc just look at the archives on the web... {and there are at least two alternatives to the standard archives, notably Gmane. BTW: I just found a URL that should list all R lists carried by them: http://dir.gmane.org/search.php?match=.r.} It has been pointed out more than once that Thunderbird unfortunately does not adhere to the standard(s) for such digests -- too bad it still does not. One alternative is to get the digest as a "plain" digest ((which is, BTW, another standard-complying digest format, one that e.g. emacs Rmail or VM can easily deal with)), which will most probably just appear as one long e-mail in Thunderbird, with a table of contents of all the subject lines, but nothing clickable. Regards, Martin Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Changing working directory
Petr == Petr Klasterecky [EMAIL PROTECTED] on Mon, 23 Apr 2007 14:27:03 +0200 writes: Petr Hi, Petr you seem to have mixed 2 different things: Petr 1) changing a working directory - see ?setwd, ?getwd Petr However, this will NOT load another .Rdata file. Petr 2) loading data - see ?load and ?save, ?save.image - loading a new data Petr image will erase all currently stored objects. Huuuh??? Not if you use the standard R function load()! Nothing is erased. If you load objects with the same names as objects in your workspace, then and only then are those in the workspace replaced by the newly loaded ones. For that reason, attach("some_file.rda") is sometimes preferred. But, as Uwe Ligges has already said: Working with .Rdata files is not recommended: You should work with script files (foo.R), source() and possibly even your own packages, and only save() {and load()/attach()} things that are expensive to rebuild. And for those, I strongly recommend using a filename different from ".Rdata". Martin Maechler, ETH Zurich Petr Petr Petr Walter Paczkowski napsal(a): Good morning, I keep copies of my .RData file in different directories for different projects on Windows XP. There is an icon on my desktop for each project, so all I have to do is click on the icon to open R for a specific project, i.e. a specific .RData file. How do I change to another .RData file from within R without first closing R? Thanks, Walt Paczkowski __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
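To see concretely that load() does not erase anything, here is a small self-contained sketch (the temp file and object names are mine, for illustration only):

```r
tmp <- tempfile(fileext = ".rda")   # a throw-away .rda file

a <- 1
b <- "keep me"
save(a, file = tmp)   # save only 'a'

a <- 999              # change 'a' locally
load(tmp)             # restores a == 1; 'b' is completely untouched

c(a = a, b = b)
```

Only the object whose name collides (`a`) is replaced by the loaded copy; everything else in the workspace survives, which is exactly the point made above.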
Re: [R] Problem with length of array while calling C from R
Sven == Sven Knüppel [EMAIL PROTECTED] on Tue, 24 Apr 2007 13:53:09 +0200 writes: Sven Hello, Sven my problem is that I don't know how long an array of double must be while calling C from R.

    Sven R code:
    Sven   array <- c(1,1,1)
    Sven   save <- .C("Ctest", a = array)

    Sven C code:
    Sven   void Ctest(double *array) {
    Sven     ...
    Sven     array = (double*) realloc(array, new_number * sizeof(double));
    Sven     ...
    Sven   }

Sven The length of the array will be computed in C. Sven At the end, save$a has a length of 3 and not the length of the array allocated in C. Sven What can I do? Either you learn to use .Call(), where you pass whole R objects and can use length(.) on them in your C code, or -- simpler in this case, but much less powerful in general -- you change your R code to

    ss <- .C("Ctest", a = myarray, n = length(myarray))

and the C code to

    void Ctest(double *array, int *n) { ...... }

and then make use of *n inside the C code. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Do *NOT* repost ! {Re: about R square value}
Dear Nitish, Please do *NOT* resend your message several times to the R-help mailing list. This is considered very impolite. Nitish == Nitish Kumar Mishra [EMAIL PROTECTED] on Sun, 22 Apr 2007 16:39:03 +0530 (IST) writes: Nitish Hi, I am simply asking about the coefficient of Nitish determination (R square): is a value of more than 1 Nitish also possible? Because it ranges from 0-1. So I Nitish want to know whether R square may be more than one. If Nitish yes, what is its interpretation? Thanks to all of Nitish you (R help group). [] Nitish PLEASE do read the posting guide Nitish http://www.R-project.org/posting-guide.html and Nitish provide commented, minimal, self-contained, Nitish reproducible code. Please, DEFINITELY do read the posting guide before sending another message to R-help. Martin Maechler, ETH Zurich R mailing lists maintainer __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Filtering
Soare == Soare Marcian-Alin [EMAIL PROTECTED] on Fri, 20 Apr 2007 01:32:30 +0200 writes: Soare Hello Everybody, Soare How can I filter a dataset? Have you tried help(filter)? filter() is a standard function, and its help page mentions others. Further note that statisticians often talk of "smoothing" where engineers talk of "filtering". Soare I'm trying to filter the dataset EuStockMarkets monthly and quarterly. Soare data(EuStockMarkets) Soare ftse = EuStockMarkets[,4] Soare Alin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Partitioning around mediods (PAM)
You've asked the almost identical question yesterday. {that *de*creases the probability of getting help, in some cases!} If you take a little effort and follow the posting guide (keywords: reproducible example; not using HTML in e-mails), I (and many others) will gladly help you further. Martin Maechler, ETH Zurich nathaniel == nathaniel Grey [EMAIL PROTECTED] on Fri, 20 Apr 2007 17:09:09 +0200 (CEST) writes: nathaniel Hi, nathaniel I need some help understanding the output from PAM. When I look at the output it doesn't list the cluster number by the medoid values on each of the variables (like it does with k-means). Instead I have the following: nathaniel So I know for instance cluster 1 has a mean for variable1 of 33.33, however when I run PAM I get: nathaniel variable1 variable2 nathaniel 293212 nathaniel 9712 9 nathaniel 308 106 8 nathaniel 217 62 2 nathaniel Does 29 relate to cluster 1, and 97 to cluster 2, etc.? nathaniel Hope this makes sense, nathaniel Nathaniel nathaniel ___ nathaniel [[alternative HTML version deleted]] nathaniel __ nathaniel R-help@stat.math.ethz.ch mailing list nathaniel https://stat.ethz.ch/mailman/listinfo/r-help nathaniel PLEASE do read the posting guide http://www.R-project.org/posting-guide.html nathaniel and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] colSum() in Matrix objects
Hi Jose, Jose I'd like to simply add column-wise using Matrix objects (Csparse). Jose It looks like one can apply mostly any base function to these objects Jose (i.e., apply, colSums), but there is a nasty conversion to traditional Jose matrix objects if one does that. not in this case, see below. Jose Is there any workaround? I can see colSum listed in the help for Class colSums (final 's'!) Jose 'CsparseMatrix', but I wonder whether I'm using the default colSums() or Jose the one specific to CsparseMatrix...

    # example
    (z = Matrix(c(0,1,0,0), 10,10))
    zr = rowSums(z)
    class(zr)  # numeric; I'd like it to be a CsparseMatrix object

    selectMethod(colSums, class(z))  ## or
    showMethods(colSums)

both show you that you are using the class-specific one. However, why do you assume that colSums() should not return a numeric vector? From the idea that colSums() and rowSums() should be fast versions of apply(., marg, sum), it must return a numeric vector, as it also does for traditional matrices. Are your objects so huge that even a 1-row {or 1-column} sparse matrix would save a lot? Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
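A small sketch of the point above, plus one possible workaround of my own (not from the post) if a Matrix-classed result really is wanted:

```r
library(Matrix)

z  <- Matrix(c(0, 1, 0, 0), 10, 10)  # mostly zeros, so stored sparse
cs <- colSums(z)                     # the sparse-specific method ...
class(cs)                            # ... still returns a plain numeric,
                                     # just as for a traditional matrix

## If a *Matrix* result were really wanted, matrix multiplication by a
## row of ones stays in Matrix-land (my own workaround, not from the post):
rowVec <- Matrix(1, 1, nrow(z)) %*% z   # a 1 x 10 Matrix of column sums
stopifnot(all(as.numeric(rowVec) == cs))
```

Whether that is worth it depends, as Martin asks, on whether the objects are so huge that even a one-row representation matters.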
Re: [R] nls.control( ) has no influence on nls( ) !
Yuchen == Yuchen Luo [EMAIL PROTECTED] on Sun, 15 Apr 2007 12:18:23 -0700 writes: Yuchen Dear Friends. Yuchen I tried to use nls.control() to change the 'minFactor' in nls(), but it Yuchen does not seem to work. yes, you do not seem to use it correctly. No reason to jump to the conclusion that you give in the subject of this posting ... which is hilariously wrong! You don't really think that quality software like R would have had nls() and nls.control(), and that nls.control() would never have worked, for all those years and tens of thousands of users??? Please come to your senses, and first read the posting guide (*) - and then try again, so we can help you further! Regards, Martin Maechler, ETH Zurich (*) Link at the bottom of every R-help message -- and hence cited below Yuchen I used the nls() function and encountered the error message "step factor Yuchen 0.000488281 reduced below 'minFactor' of 0.000976563". I then tried the Yuchen following: Yuchen 1) Put nls.control(minFactor = 1/(4096*128)) inside the brackets of nls, Yuchen but the same error message shows up. Yuchen 2) Put nls.control(minFactor = 1/(4096*128)) as a separate command before Yuchen the command that uses the nls() function; again, the same thing happens, Yuchen although R responds to the nls.control() call immediately:

    Yuchen $maxiter
    Yuchen [1] 50
    Yuchen $tol
    Yuchen [1] 1e-05
    Yuchen $minFactor
    Yuchen [1] 1.907349e-06

Yuchen I am wondering how I may change the minFactor to a smaller value? The manual Yuchen that comes with the R software about nls() is very sketchy --- the only Yuchen relevant example I see is a separate command like 2). Yuchen A more relevant question might be: is lowering the 'minFactor' the only Yuchen solution to the problem? What are the other options?
Yuchen Best Wishes Yuchen Yuchen Luo Yuchen [[alternative HTML version deleted]] Yuchen __ Yuchen R-help@stat.math.ethz.ch mailing list Yuchen https://stat.ethz.ch/mailman/listinfo/r-help Yuchen PLEASE do read the posting guide http://www.R-project.org/posting-guide.html Yuchen and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
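For the record, the way to make the setting take effect is to hand the list built by nls.control() to nls() via its 'control' argument; calling nls.control() on its own only *builds* and prints the list, it changes nothing. A minimal sketch with made-up data:

```r
set.seed(1)
x <- 1:20
y <- 3 * exp(-0.2 * x) + rnorm(20, sd = 0.02)   # toy exponential decay

## nls.control() must be passed via 'control = ...', not run separately:
fit <- nls(y ~ a * exp(-b * x),
           start   = list(a = 1, b = 0.1),
           control = nls.control(minFactor = 1/(4096 * 128)))
coef(fit)   # close to the true (a, b) = (3, 0.2)
```

This particular toy problem converges without the smaller minFactor; the point is only the call syntax.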
[R] Netiquette etc {was nls.control() has no influence on nls()!}
Joerg == Joerg van den Hoff [EMAIL PROTECTED] on Mon, 16 Apr 2007 11:15:23 +0200 writes: Joerg On Mon, Apr 16, 2007 at 09:03:27AM +0200, Martin Maechler wrote: Yuchen == Yuchen Luo [EMAIL PROTECTED] [] [] Joerg a final remark (off-topic?), concerning the response Joerg of m. maechler (but I notice this attitude Joerg continuously on the list): this is supposed to be a Joerg help list, neither a boot camp nor a thought-police Joerg school, right? In general, Joerg, you are entirely right, and I do typically behave like that. Now, since you responded in public, I do too: What I did not like at all about Yuchen's posting was its subject, which claims obvious misfunctioning, using a '!', not a '?', to make the point. The subject of an e-mail is the first and typically most-read part, and I'd like users to be a bit more polite there. {see also below} Joerg concerning the question at hand, the Joerg one-word response Joerg `?nls' Joerg would have been justifiable (it usually _is_ in the Joerg manpage), if not overly helpful, probably. but the Joerg actual response is worded such that -- would it be Joerg directed at me -- I would judge it simply offensive: Joerg "come to your senses" is not what one should be told, Joerg simply because one did not grasp how to use the Joerg `control' argument and/or because in the subject Joerg (contrary to the body) of the mail 'has no influence' Joerg is used instead of 'seems to have no influence'. Joerg I really do think that if someone deems a certain Joerg question really stupid, it would be wise simply to Joerg keep 'radio silence' (i.e. ignore it), which probably Joerg is the best corrective anyway. often the most sensible, but then the users don't know why the silence is resounding... Joerg the 'how dare you' responses are annoying, to say the least. Joerg for heaven's sake... well.. I am (often) glad to help users who ask questions about their problems, but sometimes *we* feel a bit annoyed by users shouting (that's what '!' in a subject typically means) at the public and denouncing misfunctioning of something we've put quite a bit of work into. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting and using a function
Thomas == Thomas L Jones [EMAIL PROTECTED] on Thu, 12 Apr 2007 22:28:33 -0500 writes: Thomas I am trying to do what is perhaps the most basic procedure which can be done Thomas with the R software. Thomas Under Windows XP Home Edition, I want to get a copy of the function gam, Thomas then put it in and use it. I intentionally use informal terms, rather than Thomas technical terms whose exact meaning I might or might not know. Thomas I am finding this extremely frustrating. Every time Thomas I try to do anything, all I get is error Thomas messages. So, have you really read (and tried to understand) the "Introduction to R" that's available on the web and comes built into R? Thomas only three or four or five steps involved. Well, Thomas what are they? There is all this opaque Thomas terminology. There are libraries, and packages, and Thomas one downloads them, loads them, and installs them, Thomas but just what all this means is unclear. Thomas One example among many: I tell it Thomas library(gam) Thomas and all I get is an error message: Thomas Error in library(gam) : there is no package called 'gam' Thomas Well, does this mean what it says, or does it mean something different? For Thomas example, does it mean that such-and-such computer program has not yet been Thomas downloaded? Well, it means that for your version of R there is no package called 'gam' available, i.e. what it says -- or do you expect software to tell you something about the world at large, namely that there is no R package named 'gam' in the whole wide world? :-) [ BTW, for a more modern version of gam(), there's the recommended package 'mgcv' - which you don't have to install: library(mgcv); help(gam) ] Thomas I did download and install the 2.4.1 flavor Thomas (version?) of the gui. I infer, reading between the Thomas lines a bit, that there may be a sort of standard Thomas procedure for setting things up, perhaps downloading Thomas and installing the utils package, or something. Thomas The R software has much gold in it, but, as far as Thomas learnability/usability is concerned, I give it poor marks. Well, it is recommended that you learn it from a human rather than from itself. One way to learn from humans is by reading books and manuals, and for R there are quite a few, both freely available with R and on the web, and others in a bookstore near you ... A more comfortable way for some is to attend a course / class... I hope you can bear the somewhat flippant tone of my answer -- I'm just trying to counter your own humorous one ;-) :-) Regards, Martin Thomas Tom Jones Thomas DrJones at alum.MIT.edu Thomas __ Thomas [EMAIL PROTECTED] mailing list Thomas https://stat.ethz.ch/mailman/listinfo/r-help Thomas PLEASE do read the posting guide http://www.R-project.org/posting-guide.html that also may be a quite useful resource to read ... __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] median polishing
Sorn == Sorn Norng [EMAIL PROTECTED] on Tue, 10 Apr 2007 16:11:38 +1000 writes: Sorn Hi, In SPlus there is a function called twoway for Sorn median polishing gridded data. Is there an equivalent Sorn function in R? yes Sorn I have been searching for it in R help Sorn without much success. hmm... funny.. For me, help.search("median polish") pretty quickly gives the hint you need: medpolish() Sorn Your help is much appreciated. you're welcome. Martin Sorn [[alternative HTML version deleted]] Please do read the posting guide -- particularly about HTML-ified e-mails. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
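A quick taste of what that help.search() hit does -- medpolish() on a tiny, made-up, exactly additive table (so the residuals vanish):

```r
## A 3 x 3 table constructed as row effect + column effect (toy data):
m <- rbind(c(14, 13, 12),
           c(10,  9,  8),
           c( 7,  6,  5))

fit <- medpolish(m, trace.iter = FALSE)   # Tukey's median polish

fit$overall    # grand effect
fit$row        # row effects
fit$col        # column effects
fit$residuals  # all (near) zero for this additive table
```

On real gridded data the residuals are of course the interesting part; this just shows the decomposition medpolish() returns.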
Re: [R] data encapsulation with classes
BDR == Prof Brian Ripley [EMAIL PROTECTED] on Mon, 9 Apr 2007 08:24:32 +0100 (BST) writes: BDR On Sun, 8 Apr 2007, hadley wickham wrote: On 4/8/07, Roger Bivand [EMAIL PROTECTED] wrote: On Sun, 8 Apr 2007, Peter Williams wrote: Hi All, I'm new to R from a C and Octave/Matlab background. I am trying to construct some classes in R to which I want to attach pieces of data. First, is attr(obj, 'member name') <- data the accepted way of doing this? No, it isn't. You seem to be trying to deduce new-style classes from a representation used before R 2.4, BDR (actually, still used) but in any case it would not be sensible. Please consult John M. Chambers, Programming with Data, Springer, New York, 1998, and/or William N. Venables and Brian D. Ripley, S Programming, Springer, New York, 2000, or for a shorter online resource: http://www.stat.auckland.ac.nz/S-Workshop/Gentleman/Methods.pdf Unfortunately, all of those references are at least 4 years out of date when it comes to S4 methods. Is there any comprehensive reference of the current implementation of the S4 OO system apart from the source code? ?Methods {i.e. help(Methods)} is not about the *implementation* but still explains quite a bit about the method dispatch part of S4. It has had a URL for the "How S4 Methods Work" technical report by John Chambers; I've now also added a link to that from http://developer.R-project.org/ (will only be active in ~ a day). Martin BDR Not that I know of, and it is a moving target. (E.g. I BDR asked recently about some anomalies in the S4 bit BDR introduced for 2.4.0 and what the intended semantics BDR are.) I've said before that I believe we can only help BDR solve some of the efficiency issues with S4 if we have BDR a technical manual. BDR It is unfair to pick out S4 here, but the 'R Internals' BDR manual is an attempt to document important BDR implementation details (mainly by studying the code), BDR and that has only got most of the way through BDR src/main/*.c. BDR -- Brian D. Ripley, [EMAIL PROTECTED] Professor of BDR Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ BDR University of Oxford, Tel: +44 1865 272861 (self) 1 BDR South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, BDR UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Computing the rank of a matrix.
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 6 Apr 2007 12:44:33 -0400 writes: Ravi Hi, qr(A)$rank will work, but just be wary of the Ravi tolerance parameter (default is 1e-07), since the Ravi rank computation could be sensitive to the tolerance Ravi chosen. Yes, indeed. The point is that rank(<Matrix>) is well defined in pure math (linear algebra), just as a "singular matrix" is. The same typically no longer makes sense as soon as you enter the real world: a matrix close to singular may have to be treated as if singular, depending on how close to singular it is {{ learn about the "condition number" of a matrix }}, and the same issues arise with rank(<matrix>). Of course, the matlab programmers know all this (and much more), and indeed, matlab's rank(A) really is rank(A, tol = tol.default(A)) and is based on the SVD instead of the QR decomposition, since the former is said to be more reliable (even though slightly slower). R's equivalent (with quite a bit of fool-proofing) would be the following function (assuming correct online documentation of matlab):

    rankMat <- function(A, tol = NULL, singValA = svd(A, 0, 0)$d)
    {
      ## Purpose: rank of a matrix ``as Matlab''
      ## ----------------------------------------------------------------
      ## Arguments: A: a numerical matrix, maybe non-square
      ##            tol: numerical tolerance (compared to singular values)
      ##            singValA: vector of non-increasing singular values of A
      ##                      (pass as argument if already known)
      ## ----------------------------------------------------------------
      ## Author: Martin Maechler, Date: 7 Apr 2007, 16:16
      d <- dim(A)
      stopifnot(length(d) == 2, length(singValA) == min(d),
                diff(singValA) <= 0)  # must be sorted decreasingly
      if(is.null(tol))
          tol <- max(d) * .Machine$double.eps * abs(singValA[1])
      else stopifnot(is.numeric(tol), tol >= 0)
      sum(singValA >= tol)
    }

A small-scale simulation with random matrices, i.e., things like

    ## ranks of random matrices; here will be 5 all the time:
    table(replicate(1000, rankMat(matrix(rnorm(5*12), 5, 12))))  # ~ 1 sec.
indicates that qr(.)$rank typically gives the same, where I assume one should really use

    qr(., tol = .Machine$double.eps, LAPACK = TRUE)$rank

to be closer to Matlab's default tolerance. Ok, who has time to investigate further? Research exercise: 1) Is there a fixed number, say t0 <- 1e-15, for which qr(A, tol = t0, LAPACK = TRUE)$rank is ``optimally close'' to rankMat(A)? 2) How easily do you get cases showing svd(.) to be more reliable than qr(., LAPACK = TRUE)? To solve this in an interesting way, you should probably investigate classes of almost rank-deficient matrices, and I'd also be interested whether you ever get random matrices A with rank(A) < min(dim(A)) - 1 (unless you construct some columns/rows exactly from earlier ones, or use all-0 ones). Martin Maechler, ETH Zurich Ravi Ravi. Ravi - Ravi Ravi Varadhan, Ph.D. Ravi Assistant Professor, The Center on Aging and Health ... Ravi http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html Ravi How about qr(A)$rank or perhaps qr(A, LAPACK=TRUE)$rank Cheers, Andy Hi! Maybe this is a silly question, but I need the column rank (http://en.wikipedia.org/wiki/Rank_matrix) of a matrix, and R's function 'rank()' only gives me the ordering of the elements of my matrix. How can I compute the column rank of a matrix? Is there not an R equivalent to Matlab's 'rank()'? I've been browsing for a time now and I can't find anything, so any help will be greatly appreciated. Best regards! -- -- Jose Luis Aznarte M. http://decsai.ugr.es/~jlaznarte Department of Computer Science and Artificial Intelligence Universidad de Granada Tel. +34 - 958 - 24 04 67 GRANADA (Spain) Fax: +34 - 958 - 24 00 79 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
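A small check of the rankMat() idea on a matrix that is rank-deficient by construction (the rank-4 example is my own; the compact rankMat below repeats the essentials of Martin's function so the snippet runs standalone):

```r
## rankMat, compact version of the function given above (standalone copy):
rankMat <- function(A, tol = NULL, singValA = svd(A, 0, 0)$d) {
  d <- dim(A)
  if (is.null(tol))
    tol <- max(d) * .Machine$double.eps * abs(singValA[1])
  sum(singValA >= tol)   # count singular values above the tolerance
}

set.seed(7)
A <- matrix(rnorm(5 * 12), 5, 12)
A[3, ] <- A[1, ] + A[2, ]   # row 3 is a linear combination: rank drops to 4

c(rankMat = rankMat(A), qr = qr(A)$rank)   # both should report 4 here
```

On such a cleanly deficient matrix the SVD-based and QR-based counts agree; the interesting (research-exercise) cases are the *almost* rank-deficient ones.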
Re: [R] Using split() several times in a row?
SteT == Stephen Tucker [EMAIL PROTECTED] on Fri, 30 Mar 2007 18:41:39 -0700 (PDT) writes: [..] SteT For dates, I usually store them as POSIXct classes SteT in data frames, but according to Gabor Grothendieck SteT and Thomas Petzoldt's R Help Desk article SteT http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf, SteT I should probably be using chron dates and times... I don't think you should (and I doubt Gabor and Thomas would recommend this in every case): 'POSIXct' (and 'POSIXlt', 'POSIXt', 'Date') are part of standard R, and whereas they may not seem as convenient in all cases as chron etc., I'd rather recommend sticking to them in such a case. SteT Nonetheless, POSIXct classes are what I know, so I can SteT show you that to get the month out of your column SteT (replace "8.29.97" with your variable), you can do the SteT following:

    SteT month = format(strptime("8.29.97", format = "%m.%d.%y"), format = "%m")

SteT Or,

    SteT month = as.data.frame(strsplit("8.29.97", "\\."))[1,]

[..etc..very useful..advice] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CLUSTER Package
It seems nobody else was willing to help here (when the original poster did not at all follow the posting guide). In the meantime, someone else has asked me about part of this, so let me answer in public: MM == Martin Maechler [EMAIL PROTECTED] on Mon, 12 Mar 2007 17:23:30 +0100 writes: MM Hi Vallejo, I'm pretty busy currently, and feel your MM question has much more to do with how to use R more MM generally than with using the functions from the cluster MM package. MM So you may get help from other R-help readers, but maybe MM only after you have followed the posting guide and give MM a reproducible example, as you're asked there. MM Regards, Martin Maechler VallejoR == Vallejo, Roger [EMAIL PROTECTED] on Mon, 12 Mar 2007 10:28:01 -0400 writes: VallejoR Hi Martin, In using the Cluster Package, I have VallejoR results for PAM and DIANA clustering algorithms VallejoR (below, the 'part' and 'hier' objects):

    VallejoR part <- pam(trout, bestk)    # PAM results
    VallejoR hier <- diana(trout)         # DIANA results
    VallejoR GeneNames <- show(RG$genes)  # Gene Names are in this object (RG is what?)

VallejoR But I would also like to know what genes (NAMES) VallejoR are included in each cluster. I tried VallejoR unsuccessfully to send these results to output VallejoR files (clusters with gene Names). This must be an VallejoR easy task for a good R programmer. I will VallejoR appreciate very much directions or R code on how VallejoR to send the PAM and DIANA results to output files, VallejoR including information on genes (Names) per each VallejoR cluster. For diana(), a *hierarchical* clustering {as agnes()}, you need to decide about the number of clusters yourself.
Then, as the example in help(diana.object) shows, you can use cutree() to get the grouping vector. Here's a reproducible example:
library(cluster)
data(votes.repub)
dv <- diana(votes.repub, metric = "manhattan", stand = TRUE)
print(dv)
plot(dv)
## Cut into 2 groups:
dv2 <- cutree(as.hclust(dv), k = 2)
table(dv2) # 8 and 42 group members
rownames(votes.repub)[dv2 == 1]
## For two groups, does the metric matter ?
dv0 <- diana(votes.repub, stand = TRUE) # default: Euclidean
dv.2 <- cutree(as.hclust(dv0), k = 2)
table(dv2 == dv.2) ## identical group assignments
For pam(), it's even simpler:
data(ruspini)
pr <- pam(ruspini, 4)
plot(pr) # Hit <Return> to see next plot:
str(pr) ## or summary(pr)
## .. shows you that there's a component 'clustering':
pr$clustering
## a grouping vector with case labels {your gene names}; here 1,2,..,75:
## and to get them ``visually'':
split(rownames(ruspini), pr$clustering)
## $`1`
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
## [16] 16 17 18 19 20
## $`2`
## [1] 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
## [16] 36 37 38 39 40 41 42 43
## $`3`
## [1] 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
## [16] 59 60
## $`4`
## [1] 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
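One part of Roger's question not covered in the reply above is writing the groupings to output files. A hedged sketch using the same ruspini data (for gene data, replace rownames(ruspini) with your vector of gene names):

```r
## Sketch: write each observation's cluster assignment to a CSV file
library(cluster)
data(ruspini)
pr <- pam(ruspini, 4)
out <- data.frame(name = rownames(ruspini), cluster = pr$clustering)
write.csv(out, file = "pam_clusters.csv", row.names = FALSE)

## For a hierarchical fit such as diana(), cut into k groups first:
dr <- diana(ruspini)
out2 <- data.frame(name = rownames(ruspini),
                   cluster = cutree(as.hclust(dr), k = 4))
write.csv(out2, file = "diana_clusters.csv", row.names = FALSE)
```

The file names here are illustrative; split(out$name, out$cluster) gives the same grouping on screen without writing a file.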
Re: [R] Implementing trees in R
Bálint == Bálint Czúcz [EMAIL PROTECTED] on Wed, 21 Mar 2007 19:54:28 +0100 writes:
Bálint for an example of how this idea is implemented in a working package, have a look at the help page of the BinaryTree class in package party. The @tree node of such an object is a recursive list of this kind.
and the informal (i.e. S3) class "dendrogram" in the 'stats' package, i.e. part of every R, is of that kind as well {to be used for cluster dendrograms; it has attributes for nodes and edges, etc}. It features a print(), a (too?!) flexible plot(), a nice str() method, and more:
methods(class = "dendrogram")
[1] cophenetic.dendrogram* cut.dendrogram*        [[.dendrogram*
[4] labels.dendrogram*     plot.dendrogram*       print.dendrogram*
[7] reorder.dendrogram*    rev.dendrogram*        str.dendrogram*
Type example(dendrogram) to get a first impression.
Martin Maechler, ETH Zurich
Bálint On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: Let me rephrase that. Lists do not support references, but they could be used to represent trees.
list(a = 1, b = list(2, 3, d = list(4, 5)), c = list(4, 5))
is a tree whose top nodes are a, b, c; b contains three nodes 2, 3 and d, and d contains 2 nodes. However, if you want to do it via references as requested, then lists are not appropriate.
On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example, although it's pretty messy with S4. You might want to look at the graph, RBGL and Rgraphviz packages in Bioconductor and the dynamicGraph, mathgraph and sna packages on CRAN.
On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi all, I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes.
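To see the recursive-list structure Martin describes, a small sketch (any tiny data set will do; USArrests ships with R):

```r
## A "dendrogram" is a nested list whose nodes carry attributes
hc   <- hclust(dist(USArrests[1:5, ]))
dend <- as.dendrogram(hc)
str(dend)        # nested list; 'height', 'members', 'leaf' as attributes
plot(dend)
str(dend[[1]])   # each branch is itself a dendrogram
```

This is the same "recursive list with attributes" design that party's BinaryTree uses for its @tree slot.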
I thought something like this would work:
parent <- list(); child <- list();
parent$child1 <- child;
child$parent <- parent;
When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned false. Then I realized that it does not really work, because parent$child1 <- child actually makes a copy of child instead of referencing it. I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work:
parent <- list(); child <- list();
nodes <- list(parent, child);
parent$child1 <- 2;
child$parent <- 1;
Then the first-child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires the inconvenient and error-prone manipulation of node IDs. May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks. -- God bless. Kevin
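One alternative the thread does not pursue: R environments have reference semantics, so they can represent tree nodes with parent pointers without the copy-on-assign problem Kevin ran into (a sketch, not from the original replies):

```r
## Tree nodes as environments: assignment stores a reference, not a copy
new_node <- function() new.env(parent = emptyenv())

parent <- new_node()
child  <- new_node()
parent$child1 <- child   # stores a reference to 'child', no copy made
child$parent  <- parent

## identical() compares environments by reference, so the
## "first child" test that failed with lists now works:
is_first_child <- function(node) identical(node$parent$child1, node)
is_first_child(child)    # TRUE
```

The same idea underlies proto objects (as Gabor notes) and, in later R versions, Reference Classes.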
Re: [R] Ticks and labels on plots
Marc == Marc Schwartz [EMAIL PROTECTED] on Wed, 21 Mar 2007 14:23:12 -0500 writes:
Marc On Wed, 2007-03-21 at 14:40 -0400, Mike Prager wrote: Marc Schwartz [EMAIL PROTECTED] wrote: On Tue, 2007-03-20 at 18:04 -0400, Michael H. Prager wrote:
I am generating stacked barplots of the age composition of fish populations (Y) over time (X). As there are many years, not every bar is labeled. When looking at the plot, it becomes difficult to associate labels with their bars. We have improved this a bit by using axis() to add a tick mark below each bar. Can anyone suggest a way to draw ticks ONLY at bars where a tick label is drawn? Or to make such ticks longer than those where there is no label? This is going into a function, so I'm hoping for a method that doesn't require looking at the plot first.
# sample code (simplified) #
mp <- barplot(t(N.age), xlab = "Year", axisnames = FALSE)
axis(side = 1, at = mp, labels = rownames(N.age), tcl = -0.75)
Thanks! Mike Prager NOAA, Beaufort, NC
Mike, How about something like this:
mp <- barplot(1:50, axisnames = FALSE)
# Create short tick marks at each bar
axis(1, at = mp, labels = rep("", 50), tcl = -0.25)
# Create longer tick marks every 5 years with labels
axis(1, at = mp[seq(1, 50, 5)], labels = 1900 + seq(0, 45, 5), tcl = -0.75, las = 2, cex.axis = 0.75)
Just pick which labels you want shown (e.g. every 5 years) and synchronize those values with the 'at' argument in axis(). HTH, Marc Schwartz
Thanks, Marc, for this solution, and thanks equally to Jim Lemon for a similar idea. This seems promising. Since this is to go into a function (and should work without intervention), I'll need to devise an algorithm to decide at what interval the labels should be plotted. Clearly axis() has such an algorithm. Unfortunately, it reports its result only by placing the labels.
Mike
Marc Mike,
Marc To get a feel for how axis() creates the default tick positions when 'at' is the default NULL, see ?axTicks, which provides functionality similar to the internal C routine.
yes, partly not only similar but the same when it works (by default) with par("xaxp")
Marc You could also look at ?pretty
Yes. *However* there's one important thing which I think hasn't been mentioned yet. We have now been talking about how and where axis() {i.e. its internal code} chooses to place tick marks. The clue is that it draws labels at all tick marks, by default or explicitly with axis(*, at = ., labels = .), BUT the labels are not shown on your plot as soon as they would get too much crammed together. You can see this nicely, interactively, by the following: Use
graphics.off()
plot(1:11)
This will show ticks *and* labels at 2, 4, 6, 8, 10. Now use your mouse to drag the graphics window narrower (e.g. keeping height constant), in small steps, releasing the mouse again to let R redraw the graphic. For quite a while the labels remain; when there's not enough room, the 5 ticks remain, but only the labels 2, 4, 6, 8 are drawn; then 2, 6, 10; then 2, 6; then 2, 8 (a little surprise to me); then 2. You always see all ticks, but labels are only drawn when they don't get in each other's way. Of course, things like
plot(1:11, xaxt = "n")
axis(1, at = 1:11, labels = paste("Lab", 1:11))
show even more when the window is widened or narrowed, and yes, it depends on the device and on the fonts and their sizes, see e.g.,
plot(1:11, xaxt = "n")
axis(1, at = 1:11, labels = paste("Lab", 1:11), cex.axis = 0.5)
- -
If you don't like this --- almost always very desirable --- builtin smartness, you need to descend to a slightly lower level and use mtext(), e.g., the analogue of the above is
plot(1:11, xaxt = "n")
mtext(side = 1, at = 1:11, text = paste("Lab", 1:11), cex = 0.5)
or rather leave traditional graphics which really gets messy in such cases {par() ..
etc} and upgrade to using the grid package, or packages built on grid: lattice or ggplot. But in fact, someone else needs to tell you how to play similar games with these. Martin Maechler, ETH Zurich
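For Mike's original request --- a function that decides the label interval by itself --- one hedged sketch is to let pretty() pick "nice" year positions (as Marc suggested looking at) and label only the bars at those years:

```r
## Sketch: short ticks at every bar, longer labelled ticks only at
## positions chosen automatically by pretty()
years <- 1900:1949
mp <- barplot(seq_along(years), axisnames = FALSE)

lab.at <- pretty(years)                 # "nice" candidate years
lab.at <- lab.at[lab.at %in% years]     # keep only years actually present
i <- match(lab.at, years)               # bar indices to label

axis(1, at = mp,    labels = FALSE,  tcl = -0.25)  # short tick at each bar
axis(1, at = mp[i], labels = lab.at, tcl = -0.75,  # long, labelled ticks
     las = 2, cex.axis = 0.75)
```

Since pretty() adapts its output to the range of the data, this needs no manual choice of interval when wrapped in a function.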
Re: [R] problem with RCurl install on Unix
Steven == Steven McKinney [EMAIL PROTECTED] on Wed, 21 Mar 2007 19:29:43 -0700 writes:
Steven I get the same problem, and haven't figured it out yet.
Steven Is it a 32bit/64bit clash?
Well, I can install and run RCurl on both 32-bit and 64-bit (Redhat / FC6 Linux; with own compilers, extra libs, ...).
Steven (Similarly, I don't have RMySQL up and running cleanly.)
library(RCurl)
Steven Error in dyn.load(x, as.logical(local), as.logical(now)) :
Steven unable to load shared library '/share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so':
Steven libcurl.so.4: cannot open shared object file: No such file or directory
Steven Error: package/namespace load failed for 'RCurl'
You might need to set LD_LIBRARY_PATH correctly before starting R -- typically it would be set the same as it was when RCurl was installed (which includes a 'configure'!) on your system. Or RCurl's configure is not quite robust enough and did not check for the presence of a libcurl.so.4. I assume
ldd /share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so
also tells about the missing libcurl.so.4? Make sure you find that (in /usr/lib, /usr/local/lib, ... ?) and then set your LD_LIBRARY_PATH, or even run 'ldconfig' as root, to make sure that libcurl.so.4 ``is found''. IMO the latter {a correct ldconfig call / /etc/ld.so.conf setup} should have happened as part of the installation of the curl library. On the 64-bit architecture, note that
system(paste("ldd", dir(system.file("libs", package = "RCurl"), full = TRUE)))
finds all libraries in /lib64 and /usr/lib64.
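Concretely, the two fixes Martin mentions look like this (a sketch; the paths are illustrative and must be replaced by wherever libcurl was actually installed on the machine):

```sh
# 1. Check which shared libraries RCurl.so cannot resolve:
ldd /share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so | grep curl

# 2a. Per-user fix: point the dynamic loader at libcurl before starting R
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
R

# 2b. System-wide fix (as root): add the directory to /etc/ld.so.conf,
#     then refresh the loader cache
ldconfig
```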
Martin Maechler, ETH Zurich sessionInfo() Steven R version 2.4.1 (2006-12-18) Steven x86_64-unknown-linux-gnu Steven locale: Steven LC_CTYPE=en_US.iso885915;LC_NUMERIC=C;LC_TIME=en_US.iso885915;LC_COLLATE=en_US.iso885915;LC_MONETARY=en_US.iso885915;LC_MESSAGES=en_US.iso885915;LC_PAPER=en_US.iso885915;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.iso885915;LC_IDENTIFICATION=C Steven attached base packages: Steven [1] stats graphics grDevices utils datasets methods Steven [7] base Steven other attached packages: Steven DBI Steven 0.1-12 Steven Steven McKinney Steven Statistician Steven Molecular Oncology and Breast Cancer Program Steven British Columbia Cancer Research Centre Steven email: [EMAIL PROTECTED] Steven tel: 604-675-8000 x7561 Steven BCCRC Steven Molecular Oncology Steven 675 West 10th Ave, Floor 4 Steven Vancouver B.C. Steven V5Z 1L3 Steven Canada Steven -Original Message- Steven From: [EMAIL PROTECTED] on behalf of Mark W Kimpel Steven Sent: Wed 3/21/2007 3:09 PM Steven To: r-help@stat.math.ethz.ch Steven Subject: [R] problem with RCurl install on Unix Steven I am having trouble getting an install of RCurl to work properly on a Steven Unix server. The steps I have taken are: Steven 1. installed cUrl from source without difficulty Steven 2. installed RCurl from source using the command Steven ~/R_HOME/R-devel/bin/R CMD INSTALL -l ~/R_HOME/R-devel/library Steven ~/RCurl_0.8-0.tar.gz I received no errors during this install Steven 3. 
when I go back to R and require(RCurl), I get
require(RCurl)
Steven Loading required package: RCurl
Steven Error in dyn.load(x, as.logical(local), as.logical(now)) :
Steven unable to load shared library
Steven '/N/u/mkimpel/BigRed/R_HOME/R-devel/library/RCurl/libs/RCurl.so':
Steven libcurl.so.4: cannot open shared object file: No such file or directory
Steven [1] FALSE
Steven Outside of R I get
Steven mkimpel@BigRed:~/R_HOME/R-devel/library/RCurl/libs> ls
Steven RCurl.so
Steven I can cd into RCurl and I have even done chmod a+x RCurl.so in case there was a problem with permission for R to open the file.
Steven Below is my sessionInfo. Thanks, Mark
sessionInfo()
Steven R version 2.5.0 Under development (unstable) (2007-03-11 r40824)
Steven powerpc64-unknown-linux-gnu
Steven locale:
Steven LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
Steven attached base packages:
Steven [1] stats graphics grDevices datasets utils tools
Steven [7] methods base
Steven other attached packages:
Steven   limma    affy  affyio Biobase
Steven  2.9.13 1.13.14   1.3.3 1.13.39
Steven --
Steven Mark W. Kimpel MD
Steven Neuroinformatics
Steven Department of Psychiatry
Steven Indiana University School of Medicine
Re: [R] truehist bug?
Gad == Gad Abraham [EMAIL PROTECTED] on Tue, 20 Mar 2007 17:02:18 +1100 writes:
Gad Hi,
Gad Is this a bug in truehist()?
library(MASS)
x <- rep(1, 10)
truehist(x)
Gad Error in pretty(data, nbins) : invalid 'n' value
You get something similar, though slightly more helpful, from
hist(x, "scott")
which then uses the same method for determining the number of bins / classes for the histogram. I'd say the main bug is in
nclass.scott() [and also nclass.FD()]
which do not work when var(x) == 0, as in this case. One could argue that
1) truehist(x) should not use "scott" as default when var(x) == 0 {hence a buglet in truehist()}
and either
2) both hist() and truehist() should produce a better error message when "scott" (or "FD") is used explicitly and var(x) == 0
or, rather IMO,
3) nclass.scott(x) and nclass.FD(x) should be extended to return a non-negative integer even when var(x) == 0.
Martin Maechler, ETH Zurich
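A hedged workaround in the meantime: bin-count rules that do not depend on the sample variance still work for constant data, so the default histogram is fine even where "scott" fails:

```r
x <- rep(1, 10)
## hist(x, "scott") fails here because var(x) == 0, but Sturges' rule
## only uses the sample size, so the default still works:
hist(x)            # default breaks = "Sturges" -- draws a single bar
nclass.Sturges(x)  # ceiling(log2(10) + 1) = 5
```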
[R] nclass.scott() and nclass.FD() {Re: truehist bug?}
[This has become entirely a topic for 'R-devel'; hence I'm diverting to there, keeping R-help once; please follow up only to R-devel.]
MM == Martin Maechler [EMAIL PROTECTED] on Tue, 20 Mar 2007 08:49:16 +0100 writes:
Gad == Gad Abraham [EMAIL PROTECTED] on Tue, 20 Mar 2007 17:02:18 +1100 writes:
Gad Hi, Gad Is this a bug in truehist()? library(MASS) x <- rep(1, 10) truehist(x) Gad Error in pretty(data, nbins) : invalid 'n' value
MM You get something similar though slightly more helpful from
MM hist(x, "scott")
MM which then uses the same method for determining the number of bins / classes for the histogram.
MM I'd say the main bug is in
MM nclass.scott() [ and also nclass.FD() ]
MM which do not work when var(x) == 0 as in this case.
MM One could argue that
MM 1) truehist(x) should not use "scott" as default when var(x) == 0 {hence a buglet in truehist()}
MM and either
MM 2) both hist() and truehist() should produce a better error message when "scott" (or "FD") is used explicitly and var(x) == 0
MM or, rather IMO,
MM 3) nclass.scott(x) and nclass.FD(x) should be extended to return a non-negative integer even when var(x) == 0
after some further thought, I'm proposing to adopt '3)' {only; '1)' becomes unneeded} with the following new code, which is back-compatible for the case where 'h > 0' and does something reasonable for the case h == 0:
nclass.scott <- function(x)
{
    h <- 3.5 * sqrt(stats::var(x)) * length(x)^(-1/3)
    if(h > 0) ceiling(diff(range(x))/h) else 1L
}
nclass.FD <- function(x)
{
    h <- stats::IQR(x)
    if(h == 0) h <- stats::mad(x, constant = 2) # c=2: consistent with IQR
    if (h > 0) ceiling(diff(range(x))/(2 * h * length(x)^(-1/3))) else 1L
}
Martin
Re: [R] expm() within the Matrix package
Gad == Gad Abraham [EMAIL PROTECTED] on Mon, 19 Mar 2007 09:36:15 +1100 writes:
Gad If you convert to numeric, you can then assign it to Loglik:
Loglik[1] <- as.numeric(log(p %*% expm(Q * y[i]) %*% q))
Loglik[1]
Gad [1] 134.5565
Hmm, I don't think that's Laura's problem (and actually I don't know what her problem is): assignment of a 1 x 1 matrix to a vector is not a problem, and hence the as.numeric(.) above really has no effect:
ll <- 1:2
(m <- matrix(pi, 1, 1))
         [,1]
[1,] 3.141593
ll[1] <- m
ll
[1] 3.141593 2.000000
Gad Ah, but you're using 'matrix' while she's using 'Matrix'
Gad (AFAIK there is no expm for matrix):
Yes, of course, you are absolutely right (and I'm pretty embarrassed). Martin
library(Matrix)
Gad Loading required package: lattice
m <- Matrix(1, nrow=1, ncol=1)
m
Gad 1 x 1 diagonal matrix of class "ddiMatrix"
Gad      [,1]
Gad [1,]    1
Gad Warning message:
Gad Ambiguous method selection for "diag", target "ddiMatrix" (the first of the signatures shown will be used)
Gad     diagonalMatrix
Gad     ddenseMatrix
Gad in: .findInheritedMethods(classes, fdef, mtable)
a <- vector("numeric", 0)
a[1] <- m
Gad Error in a[1] <- m : incompatible types (from S4 to double) in subassignment type fix
a[1] <- as.numeric(m)
a
Gad [1] 1
[.]
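To summarize the distinction the thread arrives at (a sketch; requires the Matrix package):

```r
library(Matrix)

m1 <- matrix(pi, 1, 1)   # base "matrix": an ordinary S3 object
M1 <- Matrix(pi, 1, 1)   # "Matrix": an S4 object from package Matrix

v <- numeric(2)
v[1] <- m1               # fine: a base 1x1 matrix drops to a scalar
                         # on subassignment
## v[1] <- M1            # fails: S4 objects are not coerced implicitly
v[2] <- as.numeric(M1)   # explicit coercion to double works
v
```

So the fix Gad suggested (wrapping in as.numeric()) is exactly what is needed for 'Matrix' objects, even though it is a no-op for base matrices.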
Re: [R] question on table format
use
print(your.character.matrix, quote = FALSE)
HTH, Martin
Norbert == Norbert NEUWIRTH [EMAIL PROTECTED] on Mon, 19 Mar 2007 13:20:57 +0100 writes:
Norbert hi everybody,
Norbert i am hanging in a problem that should be solved quickly, but, in fact, i do not get further: building up a SUR/SES system myself, where the regressions are finally used to be poisson and/or negative-binomial type, i want to generate a simple table where i can quickly see the coefficients and their markers for significance.
Norbert so i constructed this final table (code output below). as the markers for significance are character-type, the content of each cell in the whole table appears as character with quotation marks. these quotation marks should be suppressed. how can i handle this?
Norbert thank you in advance
Norbert norbert
Norbert ---
Norbert code:
Norbert final.table3 <- as.matrix(cbind(
Norbert   round(coef.sur5.lm[,1], 4), sigc.sur5.lm[,1],
Norbert   round(coef.sur5.lm[,2], 4), sigc.sur5.lm[,2],
Norbert   round(coef.sur5.lm[,3], 4), sigc.sur5.lm[,3],
Norbert   round(coef.sur5.lm[,4], 4), sigc.sur5.lm[,4],
Norbert   round(coef.sur5.lm[,5], 4), sigc.sur5.lm[,5]
Norbert ))
Norbert colnames(final.table3) <- c("MW", "sig.",
Norbert   "HP", "sig.",
Norbert   "CC", "sig.",
Norbert   "AL", "sig.",
Norbert   "RC", "sig.")
Norbert ---
Norbert output:
Norbert             MW      sig. HP      sig. CC      sig. AL      sig. RC      sig.
Norbert (Intercept) 6.7757  ***  -2.2851 ***  -0.067       8.517   ***  11.0767 ***
Norbert FEMALE      -1.6267 ***  2.4045  ***  0.2892  ***  -1.0887 ***  0.0236
Norbert AGE         -0.0324 .    0.1923  ***  0.0158  ***  -0.0984 ***  -0.0778 ***
Norbert I(AGE^2)    1e-04        -0.0019 ***  -2e-04  ***  0.0011  ***  0.001   ***
Norbert C2.H        -0.2589 *    0.2051  **   0.5289  ***  -0.4537 ***  -0.0211
Norbert C2_3.H      -0.4278 **   0.118        0.4837  ***  -0.1301      -0.0422
Norbert ... ... ... ... ... ...
Norbert --
Norbert ---
Norbert Mag.
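A tiny self-contained illustration of Martin's answer (the numbers are made up):

```r
## A character matrix prints with quotation marks by default:
m <- cbind(est = c("6.7757", "-1.6267"), sig = c("***", "***"))
rownames(m) <- c("(Intercept)", "FEMALE")
m                        # cells shown in quotes
print(m, quote = FALSE)  # same table, quotes suppressed
noquote(m)               # or wrap once; then plain print() never quotes
```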
Norbert NEUWIRTH
Norbert Roubiczekgasse 2/23
Norbert A-1100 WIEN
Norbert mob: +43 699 1835 0704
Re: [R] expm() within the Matrix package
Gad == Gad Abraham [EMAIL PROTECTED] on Fri, 16 Mar 2007 09:48:52 +1100 writes: Gad Laura Hill wrote: Hi Could anybody give me a bit of advice on some code I'm having trouble with? I've been trying to calculate the loglikelihood of a function iterated over a data of time values and I seem to be experiencing difficulty when I use the function expm(). Here's an example of what I am trying to do y-c(5,10) #vector of 2 survival times p-Matrix(c(1,0),1,2) #1x2 matrix Q-Matrix(c(1,2,3,4),2,2) #2x2 matrix q-Matrix(c(1,2),2,1) #2x1 matrix Loglik-rep(0,length(y))#creating empty vector of same length as y for(i in 1:length(y)){ Loglik[i]-log((p %*% (expm(Q*y[i]))) %*% q) #calculating # Loglikelihood for each y[i] } The code works perfectly if I use exp(Q*y[i]) but not for expm() Gad You need to do a type conversion here, because you get the loglik as a Gad 1x1 Matrix, instead of a scalar: Gad (after running your code) log(p %*% expm(Q * y[i]) %*% q) Gad 1 x 1 Matrix of class dgeMatrix Gad [,1] Gad [1,] 134.5565 Gad If you convert to numeric, you can then assign it to Loglik: Loglik[1] - as.numeric(log(p %*% expm(Q * y[i]) %*% q)) Loglik[1] Gad [1] 134.5565 Hmm, I don't think that's Laura's problem (and actually I don't know what her problem is) : Assignment of a 1 x 1 matrix to a vector is not a problem, and hence the as.numeric(.) above really has no effect : ll - 1:2 (m - matrix(pi, 1,1)) [,1] [1,] 3.141593 ll[1] - m ll [1] 3.141593 2.00 Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Off topic:Spam on R-help increase?
DB == Douglas Bates [EMAIL PROTECTED] on Tue, 6 Mar 2007 11:57:28 -0600 writes:
DB On 3/6/07, Bert Gunter [EMAIL PROTECTED] wrote: In the past 2 days I have seen a large increase of spam getting into R-help. Are others experiencing this problem? If so, has there been some change to the spam filters on the R servers? If not, is the problem on my end?
DB There has indeed been an increase in the amount of spam making it through to the list. We apologize for the inconvenience. Regrettably, we will not be able to do much about it until the beginning of next week.
DB Martin Maechler is on vacation at present and I am administering the lists until he returns. Most of the time this works even though the mail servers are in Zurich, Switzerland and I am in Madison, WI, USA. However, in the last two days we have had a surge in spam and quite a bit of it is getting through the filters.
DB The filters are catching some of the spam. I think the main difference in the last two days has been that the level of spam to the lists has increased, but it could be that something has happened to the filters too.
I've been back today, well relaxed and tanned from the nice vacation; thanks to all of you for taking such an interest in it :-) ;-) With a work backlog of almost 4 weeks, I hadn't dared to look into my R-lists inbox of 2400 messages until about an hour ago. Fortunately it's not the spammers that have become smarter (well, they have, or their hired geeks, but already a few months ago, not just now). The main problem has ``just'' been disk-server, then network and file-mount problems on the mail server that were unfortunately not seen at first by our IT staff. As a consequence, there had also been enormous (> 24 hours) delays in mail delivery, maybe less visible on the mailing-list side of it.
As far as I can see/guess now, the spam problem should have lasted only about one to two days --- too long of course for you, but at least not till I had returned to work. Yes indeed, we are sorry for this, but no, we cannot promise it won't happen again :-\
Martin
DB All the lists except R-help only allow postings from subscribers, so there should be very little spam on the other lists.
DB This subscriber-only policy can be difficult for people like me who receive email at one address but send it from another. Either the sender must remember to use the account that is registered for the list or the list administrator must manually approve the posting.
DB Even worse, such a policy dissuades new useRs from posting because they get a response that their message has been held pending manual approval by the administrator. Sometimes they react by reposting the message, then re-reposting, then ...
DB We have avoided instituting such a policy on R-help because of the level of administrative work that would be involved and our desire not to dissuade new useRs from posting to the list.
DB However, if this keeps up we may need to reconsider.
DB I would ask for the list subscribers to bear with us until Martin returns and can check on whether something has gone wrong with the filters.
Re: [R] XML and str
DTL == Duncan Temple Lang [EMAIL PROTECTED] on Sat, 10 Feb 2007 07:18:30 -0800 writes:
DTL Martin Maechler wrote:
Ashley == Ashley Ford [EMAIL PROTECTED] on Wed, 07 Feb 2007 17:18:56 + writes:
Ashley If I read in an .xml file, e.g. with
xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package = "XML"))
Ashley It appears to be OK; however, examining it with str() gives an apparent error:
str(xeg, 2)
Ashley List of 2
Ashley $ doc:List of 3
Ashley ..$ file: list()
Ashley .. ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode"
Ashley ..$ version :List of 4
Ashley .. ..- attr(*, "class")= chr "XMLNode"
Ashley ..$ children:Error in obj$children[[...]] : subscript out of bounds
Ashley I am unsure if this is a feature or a bug and, if the latter, whether it is in XML or str; it is not causing a problem but I would like to understand what is happening. Any ideas?
Yes -- thank you for providing a well-reproducible example. After setting options(error = recover) I do
obj <- xeg$doc
mode(obj) # "list"
[1] "list"
is.list(obj) # TRUE
[1] TRUE
length(obj) # 3
[1] 3
obj[[3]] # --- the error you see above.
Error in obj$children[[...]] : subscript out of bounds
Enter a frame number, or 0 to exit
1: obj[[3]]
2: `[[.XMLDocumentContent`(obj, 3)
Selection: 0
obj$children # works, should be identical to obj[[3]]
$comment
<!--A comment-->
$foo
<foo x="1"> <element attrib1="my value"/> ..
This shows that the XML package implements the [[ method wrongly IMHO, and also inconsistently with the $ method. From a strict OOP view, the XML author could argue that this is not a bug in XML but rather in str(), which assumes that x[[length(x)]] works for objects of mode "list" even when they are not of *class* "list", but I hope he would still rather consider changing [[.XMLDocumentContent ...
DTL More likely, the appropriate fix is to have length() return the relevant value.
Hmm.
library(XML) xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package = "XML")) obj <- xeg$doc mode(obj) # "list" [1] "list" is.list(obj) # TRUE [1] TRUE length(obj) # 3 [1] 3 obj[[3]] # --- the error you see above. Error in obj$children[[...]] : subscript out of bounds names(obj) [1] "file" "version" "children" class(obj) [1] "XMLDocumentContent" methods(class=class(obj)) [1] xmlApply.XMLDocumentContent* [[.XMLDocumentContent* [3] xmlRoot.XMLDocumentContent* xmlSApply.XMLDocumentContent* XML:::`[[.XMLDocumentContent` function (obj, ...) { obj$children[[...]] } <environment: namespace:XML> so length(obj) is 3 and obj is a simple S3 object which is just a list with 3 named components. Do you really want to define length(.) to also return the length of obj$children instead of the length() of the list itself? With that you'd have your XMLDocumentContent objects ``look'' like lists with three named components on one hand (and help(xmlTreeParse) does mention these components) but behave in other contexts as if each was just its own component 'obj$children'. Of course you then should also define print.XMLDocumentContent() and str.XMLDocumentContent() accordingly, so users would barely know about the 'file' and 'version' components of 'obj'. But is this really desirable? With the above [[.XMLDoc... you break the basic S-language premise that [[ and $ behave consistently. You could solve everything elegantly if you used S4 instead of S3 classes, since there's no defined correspondence between slot access and [[ -- and yes, then (with S4), I'd agree that setMethod(length, "XMLDocumentContent", function(x) length(x@children)) would be needed too -- and fine. Martin DTL I even recall considering this at the time of writing DTL the package initially. But that was back in 1999/2000 DTL and S4 and R/S-Plus compatibility were not what they DTL are now. It could be changed. Not certain when I will DTL get a chance.
Ashley examining components, e.g. str(xeg$doc$children, 2) Ashley List of 2 Ashley $ comment: list() Ashley ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley etc. Ashley is OK. Ashley XML Version 1.4-1, Ashley same behaviour on Windows and Linux, R version 2.4.1 (2006-12-18)
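The premise Martin invokes can be made concrete with a toy S3 object (my own illustration, not code from the XML package; the class name "myDoc" is hypothetical): for a plain list-based S3 class with no `[[` method of its own, `[[` and `$` necessarily address the same components, and `x[[length(x)]]` -- the access str() performs -- just works:

```r
# A hypothetical list-backed S3 object with the same three component
# names as an XMLDocumentContent; the default [[ is used, so [[ and $
# necessarily agree.
x <- structure(list(file = "f", version = "1.0",
                    children = list(a = 1, b = 2)),
               class = "myDoc")
stopifnot(identical(x[[3]], x$children),          # [[ and $ agree
          identical(x[[length(x)]], x$children))  # what str() relies on
```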
Re: [R] R vs Matlab {R in Industry}
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Thu, 8 Feb 2007 14:39:41 -0500 writes: Ravi Here is a function to create a Toeplitz matrix of any size, and an example Ravi of a 220 x 220 Toeplitz matrix, which was created in almost no time: Thanks Ravi, but note two things - ?toeplitz tells you that R already has a fast (R-code-only) toeplitz() function - The point of that benchmark is not to measure how fast you can build a Toeplitz matrix but simply to exercise a double (actually triple) for loop. {and the benchmark R script says so as a comment in the code} BTW {not to Ravi, but on the subject}: 1) When comparing this (the for-loop benchmark) with Matlab, I would want to be sure that Matlab is not simply using an internal shortcut, since the benchmark is maybe too simplistic: for(i in 1:N) for(j in 1:N) b[i,j] <- abs(i - j) {and it may be interesting to see if R's experimental byte compiler would speed that up} 2) The above is very fast (IMO) and I cannot see why this could be too slow in a realistic situation. 3) The tables I've seen said that Matlab was about a factor of 2 faster for the above loop benchmark. That's scarcely a reason for downgrading (from R to Matlab). Martin
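A quick check (my own snippet, not from the thread) that base R's vectorized toeplitz() produces exactly the matrix the benchmark loop builds:

```r
# Build the |i - j| matrix two ways: vectorized toeplitz() vs. the
# benchmark's deliberate double loop.
N <- 220
b1 <- toeplitz(0:(N - 1))   # vectorized, essentially instantaneous
b2 <- matrix(0, N, N)
for (i in 1:N)              # the loop the benchmark exercises
  for (j in 1:N)
    b2[i, j] <- abs(i - j)
stopifnot(all(b1 == b2))    # same values; the loop is just slower
```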
Re: [R] Timings of function execution in R [was Re: R in Industry]
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Thu, 8 Feb 2007 18:41:38 -0500 writes: Ravi Hi, Ravi greaterOf is indeed an interesting function. It is much faster than the Ravi equivalent R function, pmax, because pmax does a lot of checking for Ravi missing data and for recycling. Tom Lumley suggested a simple function to Ravi replace pmax, without these checks, that is analogous to greaterOf, which I Ravi call fast.pmax. Ravi fast.pmax <- function(x,y) {i <- x<y; x[i] <- y[i]; x} Ravi Interestingly, greaterOf is even faster than fast.pmax, although you have to Ravi be dealing with very large vectors (O(10^6)) to see any real difference. Yes. Indeed, I have a file, first version dated from 1992, where I explore the slowness of pmin() and pmax() (in S-plus 3.2 then). I have since added quite a few experiments and versions to that file. As a consequence, in the robustbase CRAN package (which is only a bit more than a year old though), there's a file, available as https://svn.r-project.org/R-packages/robustbase/R/Auxiliaries.R with the very simple content {note line 3 !}: - ### Fast versions of pmin() and pmax() for 2 arguments only: ### FIXME: should rather add these to R pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 - {the funny argument name 'k' comes from the use of these to compute Huber's psi() fast: psiHuber <- function(x,k) pmin2(k, pmax2(-k, x)) curve(psiHuber(x, 1.35), -3, 3, asp = 1) } One point *is* that I think proper function names would be pmin2() and pmax2(), since they work with exactly 2 arguments, whereas IIRC the feature of working with '...' is exactly the reason that pmax() and pmin() are so much slower. I haven't checked if Gabor's pmax2.G <- function(x,y) {z <- x > y; z * (x-y) + y} is even faster than the abs()-using one. It may have the advantage of giving *identical* results (to the last bit!) to pmax(), which my version does not --- IIRC the only reason I did not follow my own 'FIXME' above.
I had then planned to implement pmin2() and pmax2() in C code, trivially, and hence get behavior identical (to the last bit!) to pmin()/pmax(); but I now tend to think that the proper approach is to code pmin() and pmax() via .Internal() and hence C code ... [Not before DSC and my vacations though!!] Martin Maechler, ETH Zurich
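The abs() trick above can be checked against base R directly; here is a small verification harness (mine, not from the posts). Note all.equal() rather than identical(), since, as Martin says, the arithmetic versions can differ from pmin()/pmax() in the last bit:

```r
# Two-argument pmin/pmax via the abs() identities from Auxiliaries.R.
pmin2 <- function(k, x) (x + k - abs(x - k)) / 2
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2
set.seed(1)
x <- rnorm(1e5); k <- 1.35
stopifnot(all.equal(pmin2(k, x), pmin(k, x)),
          all.equal(pmax2(-k, x), pmax(-k, x)))
# Huber's psi, the use case that motivated the argument name 'k':
psiHuber <- function(x, k) pmin2(k, pmax2(-k, x))
stopifnot(all.equal(psiHuber(x, k), pmin(k, pmax(-k, x))))
```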
Re: [R] Data.frame columns in R console
Petr == Petr Pikal [EMAIL PROTECTED] on Fri, 09 Feb 2007 09:42:13 +0100 writes: Petr Hi Petr On 9 Feb 2007 at 10:17, Lauri Nikkinen wrote: Thank you for your answer. When I set options(width=250) I still get the same result when I print the data.frame on my Rgui console (R 2.4.1, Windows XP). Columns end up underneath each other. I also get an error (?) message: [ reached getOption("max.print") -- omitted 3462 rows ]. As Petr explains below (and Brian Ripley), you *really* should use different means here --- but I think this is the first time that the relatively new option 'max.print' has hit R-help, hence one other hint, maybe useful to the public: Note that the 'max.print' option was introduced exactly for the purpose of **protecting** the inadvertent user from a flood of output spilling into his console/gui/.. (and apparently locking up R completely; we have even seen crashes when people wanted to print data frames/matrices/arrays with millions of entries). So, given the above message (yes, not an error), why did you not try to read help(getOption) and look for the word 'max.print' there? -- if you really really don't want to follow the advice of Brian and Petr, then say something like options(max.print = 1e6) Martin Maechler, ETH Zurich For example, if I have a data.frame with 4000 rows and 200 columns I would like to be able to use scroll bars in Rconsole to investigate the whole data.frame. Petr I am not sure if it is the best idea. You shall probably use other Petr means for checking your data frame. Petr Try ?summary, ?str or, if you really want to check all values in the data Petr frame, you can use Petr invisible(edit(test)) Petr to open a spreadsheet-like editor. Petr HTH Petr Petr
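A small sketch (mine, not from the thread) of both routes: raising 'max.print' as Martin describes, versus the compact inspection tools Petr recommends. The data frame here is synthetic:

```r
# A data.frame of the size discussed in the thread (4000 x 200).
big <- data.frame(matrix(0, nrow = 4000, ncol = 200))
str(big, list.len = 3)          # compact overview instead of printing it all
head(big[, 1:4])                # just the first rows of a few columns
op <- options(max.print = 1e6)  # raise the cap, keeping the old value...
options(op)                     # ...and restore it afterwards
```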
Re: [R] Data.frame columns in R console
Lauri == Lauri Nikkinen [EMAIL PROTECTED] on Fri, 9 Feb 2007 14:21:26 +0200 writes: Lauri This still does not solve the issue that when I print in the R console I get Lauri columns that don't fit in the window underneath each other. Thanks anyway! But Brian did give you all you needed (even more, I'd say) to solve that!?!? Please excuse me if I use somewhat frank language, but when using R, you *really* are expected to read the documentation, which is written pretty carefully {probably that's what some people don't like about it and call confusing ??}. Specifically, Brian said BDR 200 columns will take far more than 250 characters. The help says and then pointed you to the documentation for options(width = .). I think you need to reread that paragraph, particularly the word 'character', and then you will understand that your original approach of using options(width = 250) can *not* be what you want if your dataframe has 200 columns. Martin Lauri 2007/2/9, Martin Maechler [EMAIL PROTECTED]: Petr == Petr Pikal [EMAIL PROTECTED] on Fri, 09 Feb 2007 09:42:13 +0100 writes: Petr Hi Petr On 9 Feb 2007 at 10:17, Lauri Nikkinen wrote: Thank you for your answer. When I set options(width=250) I still get the same result when I print the data.frame on my Rgui console (R 2.4.1, Windows XP). Columns end up underneath each other. I also get an error (?) message: [ reached getOption("max.print") -- omitted 3462 rows ]. As Petr explains below (and Brian Ripley), you *really* should use different means here --- but I think this is the first time that the relatively new option 'max.print' has hit R-help, hence one other hint, maybe useful to the public: Note that the 'max.print' option was introduced exactly for the purpose of **protecting** the inadvertent user from a flood of output spilling into his console/gui/.. (and apparently locking up R completely; we have even seen crashes when people wanted to print data frames/matrices/arrays with millions of entries).
So, given the above message (yes, not an error), why did you not try to read help(getOption) and look for the word 'max.print' there? -- if you really really don't want to follow the advice of Brian and Petr, then say something like options(max.print = 1e6) Martin Maechler, ETH Zurich For example, if I have a data.frame with 4000 rows and 200 columns I would like to be able to use scroll bars in Rconsole to investigate the whole data.frame. Petr I am not sure if it is the best idea. You shall probably use other Petr means for checking your data frame. Petr Try ?summary, ?str or, if you really want to check all values in the data Petr frame, you can use Petr invisible(edit(test)) Petr to open a spreadsheet-like editor. Petr HTH Petr Petr
Re: [R] Timings of function execution in R [was Re: R in Industry]
TL == Thomas Lumley [EMAIL PROTECTED] on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes: TL On 2/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: The other reason why pmin/pmax are preferable to your functions is that they are fully generic. It is not easy to write C code which takes into account that <, [, [<- and is.na are all generic. That is not to say that it is not worth having faster restricted alternatives, as indeed we do with rep.int and seq.int. Anything that uses arithmetic is making strong assumptions about the inputs. It ought to be possible to write a fast C version that worked for atomic vectors (logical, integer, real and character), but is there any evidence of profiled real problems where speed is an issue? TL I had an example just last month of an MCMC calculation where profiling showed that pmax(x,0) was taking about 30% of the total time. I used TL function(x) {z <- x<0; x[z] <- 0; x} TL which was significantly faster. I didn't try the TL arithmetic solution. I did - eons ago, as mentioned in my message earlier in this thread. I can assure you that those (also mentioned) pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 are faster still, particularly if you hardcode the special case of k=0! {that's how I came about these: pmax(x,0) is also denoted x_+, and x_+ := (x + |x|)/2 x_- := (x - |x|)/2 } TL Also, I didn't check if a solution like this would still TL be faster when both arguments are vectors (but there was TL a recent mailing list thread where someone else did). indeed, and they are faster. Martin
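For concreteness, here is my own comparison (not from the original emails) of the two approaches to pmax(x, 0) discussed above; the arithmetic version exploits the identity x_+ = (x + |x|)/2:

```r
pm      <- function(x) { z <- x < 0; x[z] <- 0; x }  # Thomas' subset/assign
pospart <- function(x) (x + abs(x)) / 2              # arithmetic identity
set.seed(42)
x <- rnorm(1e5)
stopifnot(all.equal(pm(x), pmax(x, 0)),
          all.equal(pospart(x), pmax(x, 0)))
```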
Re: [R] Timings of function execution in R [was Re: R in Industry]
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Fri, 09 Feb 2007 13:52:25 -0500 writes: Duncan On 2/9/2007 1:33 PM, Prof Brian Ripley wrote: x <- rnorm(1) system.time(for(i in 1:1000) pmax(x, 0)) user system elapsed 4.43 0.05 4.54 pmax2 <- function(k,x) (x+k + abs(x-k))/2 system.time(for(i in 1:1000) pmax2(x, 0)) user system elapsed 0.64 0.03 0.67 pm <- function(x) {z <- x<0; x[z] <- 0; x} system.time(for(i in 1:1000) pm(x)) user system elapsed 0.59 0.00 0.59 system.time(for(i in 1:1000) pmax.int(x, 0)) user system elapsed 0.36 0.00 0.36 So at least on one system Thomas' solution is a little faster, but a C-level n-args solution handling NAs is quite a lot faster. Duncan For this special case we can do a lot better using Duncan pospart <- function(x) (x + abs(x))/2 Indeed, that's what I meant when I talked about doing the special case 'k = 0' explicitly -- and also what my timings were based on. Thank you Duncan -- and Brian, for looking into providing an even faster and more general C-internal version! Martin Duncan The less specialized function Duncan pmax2 <- function(x,y) { Duncan diff <- x - y Duncan y + (diff + abs(diff))/2 Duncan } Duncan is faster on my system than pm, but not as fast as pospart: system.time(for(i in 1:1000) pm(x)) Duncan [1] 0.77 0.01 0.78 NA NA system.time(for(i in 1:1000) pospart(x)) Duncan [1] 0.27 0.02 0.28 NA NA system.time(for(i in 1:1000) pmax2(x,0)) Duncan [1] 0.47 0.00 0.47 NA NA Duncan Duncan Murdoch On Fri, 9 Feb 2007, Martin Maechler wrote: TL == Thomas Lumley [EMAIL PROTECTED] on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes: TL On 2/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: The other reason why pmin/pmax are preferable to your functions is that they are fully generic. It is not easy to write C code which takes into account that <, [, [<- and is.na are all generic. That is not to say that it is not worth having faster restricted alternatives, as indeed we do with rep.int and seq.int.
Anything that uses arithmetic is making strong assumptions about the inputs. It ought to be possible to write a fast C version that worked for atomic vectors (logical, integer, real and character), but is there any evidence of profiled real problems where speed is an issue? TL I had an example just last month of an MCMC calculation where profiling showed that pmax(x,0) was taking about 30% of the total time. I used TL function(x) {z <- x<0; x[z] <- 0; x} TL which was significantly faster. I didn't try the TL arithmetic solution. I did - eons ago, as mentioned in my message earlier in this thread. I can assure you that those (also mentioned) pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 are faster still, particularly if you hardcode the special case of k=0! {that's how I came about these: pmax(x,0) is also denoted x_+, and x_+ := (x + |x|)/2 x_- := (x - |x|)/2 } TL Also, I didn't check if a solution like this would still TL be faster when both arguments are vectors (but there was TL a recent mailing list thread where someone else did). indeed, and they are faster. Martin
[R] Attachments to R-help postings (Re: Singular Gradient)
DB == Douglas Bates [EMAIL PROTECTED] on Wed, 7 Feb 2007 17:24:40 -0600 writes: DB On 2/7/07, This Wiederkehr [EMAIL PROTECTED] wrote: I tried to fit data with the following function: fit <- nls(y ~ Is*(1-exp(-l*x)) + Iph, start=list(Is=-2e-5, l=2.3, Iph=-0.3), control=list(maxiter=500, minFactor=1/1, tol=10e-05), trace=TRUE) But I get only a singular gradient warning... DB Did you get any trace output at all? It is not clear if you got the DB singular gradient warning before the first iteration completed, which DB means there is a problem at the starting estimates, or after a few DB iterations. Without the data it is difficult to decide. The data can be found attached (there are two samples of data, cols 1/2 and 3/4). DB Thanks for offering to include the data. My copy of your message did DB not have the data enclosed. Did you perhaps forget to attach the DB file? More probably he did not attach it with mime type text/plain. Many e-mail clients nowadays attach everything, notably text, as unspecified binary (application/octet-stream). For security (and anti-spam) reasons, such attachments are eliminated from postings. I've now slightly modified this content-filtering option for R-help, such that (I think) such e-mails will be *rejected* instead of just having the attachment removed -- I'm just trying that now, attaching a 2-line text file Martin I tried to fix it by changing the start parameters but that didn't solve the problem. Would it be a possibility to use the self-start model? How? DB Yes. Try SSasymp. I believe that model is equivalent to your model DB but in a different parameterization.
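A hedged sketch of Doug's SSasymp suggestion on synthetic data (the poster's data are not available, so the constants below are made up): SSasymp fits y = Asym + (R0 - Asym)*exp(-exp(lrc)*x), which is the same family as Is*(1-exp(-l*x)) + Iph with Asym = Is + Iph, R0 = Iph and l = exp(lrc), and it needs no hand-picked starting values.

```r
# Self-starting asymptotic regression on made-up data
# (Is = 5, l = 1.2, Iph = 1, small Gaussian noise).
set.seed(7)
x <- seq(0, 4, length.out = 60)
y <- 5 * (1 - exp(-1.2 * x)) + 1 + rnorm(60, sd = 0.05)
fit <- nls(y ~ SSasymp(x, Asym, R0, lrc))  # no 'start=' needed
coef(fit)  # Asym ~ Is + Iph = 6, R0 ~ Iph = 1, exp(lrc) ~ l = 1.2
```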
Re: [R] the plotting position of theoretical quantile for qqnorm
LizF == fengfeng [EMAIL PROTECTED] on Wed, 7 Feb 2007 23:08:31 -0600 writes: LizF Hello, LizF I have a doubt about the plotting position of the theoretical quantile for LizF the qqnorm LizF command in R. LizF Let F be the theoretical distribution of Y; we observed a sample of size n, LizF y1, y2, ..., LizF yn. We then sort it and compare these empirical quantiles to the expected LizF ones LizF from F. For the plotting position, there are several options: LizF 1. i/(n+1) LizF 2. (i-.375)/(n+.25) LizF 3. (i-.3175)/(n+.365) LizF etc. yes, particularly etc ;-) LizF Which one does qqnorm use? It's right in front of you if you read help(qqnorm) carefully: See Also: 'ppoints', used by 'qqnorm' to generate approximations to expected order statistics for a normal distribution. So it uses ppoints(), and help(ppoints) tells you what's going on: the formula used depends on n (whether n <= 10) -- see that help page. LizF Thx a lot! You're welcome, Martin Maechler, ETH Zurich LizF Liz
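This can be verified directly (my own check, not part of the original exchange): ppoints(n) uses (i - a)/(n + 1 - 2a) with a = 3/8 for n <= 10 and a = 1/2 otherwise, i.e. exactly the poster's option 2 for small samples and the classical (i - 1/2)/n for larger ones:

```r
n <- 8                                   # small sample: a = 3/8
stopifnot(all.equal(ppoints(n), ((1:n) - 3/8) / (n + 1/4)))
n <- 50                                  # larger sample: a = 1/2
stopifnot(all.equal(ppoints(n), ((1:n) - 1/2) / n))
```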
[R] R vs Matlab {Re: R in Industry}
Albr == Albrecht, Dr Stefan (AZ Private Equity Partner) [EMAIL PROTECTED] on Thu, 8 Feb 2007 16:38:18 +0100 writes: Albr Dear all, Albr I was reading with great interest your comments about the use of R in Albr the industry. Personally, I use R as a scripting language in the financial Albr industry, not so much for its statistical capabilities (which are Albr great), but more for programming. I once switched from S-Plus to R, Albr because I liked R more, it had a better and easier to use documentation Albr and it is faster (especially with loops). Albr Now some colleagues of mine are (finally) eager to join me in my Albr quantitative efforts, but they feel that they are more at ease with Albr Matlab. I can understand this. Matlab has a real IDE with symbolic Albr debugger, integrated editor and profiling, etc. The help files are Albr great, very comprehensive and coherent. It also could be easier to Albr learn. Albr And, I was very astonished to realise, Matlab is very, very much faster Albr with simple for loops, which would speed up simulations considerably. Can you give some evidence for this statement, please? At the moment, I'd bet that you forgot to pre-allocate a result array in R and do something like the notorious horrible (:-) 1-dimensional r <- NULL for(i in 1:100000) { r[i] <- verycomplicatedsimulation(i) } instead of the correct r <- numeric(100000) for(i in 1:100000) { r[i] <- verycomplicatedsimulation(i) } If r is a matrix or even higher array, and you are using rbind() or cbind() inside the loop to build up the result, the problem will become even worse. Albr So I have trouble to argue for a use of R (which I like) instead of Albr Matlab. The price of Matlab is high, but certainly not prohibitive. R is Albr great and free, but maybe less comfortable to use than Matlab. Albr Finally, after all, I have the impression that in many job offerings in Albr the financial industry R is much less often mentioned than Matlab.
Albr I would very much appreciate any comments on my above remarks. I know Albr there have been some discussions of R vs. Matlab on R-help, but these Albr could be somewhat out-dated, since both languages are evolving quite Albr quickly. Albr With many thanks and best regards, Albr Stefan Albrecht
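Martin's pre-allocation point above can be illustrated with a toy loop (my own sketch; verycomplicatedsimulation() is replaced by a trivial computation so the snippet is self-contained):

```r
# Growing r inside the loop forces repeated re-allocation;
# pre-allocating with numeric(n) avoids it. Both give the same result.
slowFill <- function(n) { r <- NULL;       for (i in 1:n) r[i] <- i^2; r }
fastFill <- function(n) { r <- numeric(n); for (i in 1:n) r[i] <- i^2; r }
stopifnot(identical(slowFill(100), fastFill(100)))
system.time(slowFill(2e5))   # growing version
system.time(fastFill(2e5))   # pre-allocated version, typically faster
```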
Re: [R] R Search
The official R search place has been http://search.R-project.org/ for quite a while now. It does mention others, including 'rseek' below. BTW: Its main fault for me is that it does not include the R-devel mailing list archives (hint hint :-) Martin Maechler, ETH Zurich IM == İbrahim Mutlay [EMAIL PROTECTED] on Wed, 7 Feb 2007 02:44:24 -0500 writes: IM I know that two search engines for R are available: IM http://www.rseek.org/ IM http://www.dangoldstein.com/search_r.html IM -- İbrahim Mutlay
Re: [R] R in Industry
Frank == Frank E Harrell [EMAIL PROTECTED] on Tue, 06 Feb 2007 21:59:45 -0600 writes: Frank Matthew Keller wrote: Bob, Far from flaming you, I think you made a good point - one that I imagine most people who use R have come across. The name R is a big impediment to effective online searches. As a check, I entered R software, SAS software, SPSS software, and S+ software into google. The R 'hit rate' was only ten out of the first 20 results (I didn't look any further). For the other three software packages, the hit rates were all 100% (20/20). I do wonder if anything can/should be done about this. I generally search using the term CRAN but of course, that omits lots of stuff relevant to R. Any ideas about how to do effective online searches for R related materials? I don't think we (the R foundation) will ever change away from R.. Matt Frank I just googled for R and www.r-project.org was the Frank first hit. Don't see a problem at present. We are getting really off-topic, but that's interesting: We all know that Google is helping the Chinese government to censor their own people, so searches there can lead to completely different results. But even here in Zurich Switzerland, I get quite a different hitlist : 1) stat.ethz.ch/~statsoft/stat.programme/R.html [in German] 2) Our local CRAN mirror: stat.ethz.ch/CRAN/ 3) R - (German-language) Wikipedia about letter R: de.wikipedia.org/wiki/R 4) DVD-R - (German-language) Wikipedia de.wikipedia.org/wiki/DVD-R 5) The R Project for Statistical Computing http://www.r-project.org/ So 3/5 are related to R which sounds good, but actually these 3 are all from the first twenty: 3/20. Martin On 2/6/07, Wensui Liu [EMAIL PROTECTED] wrote: I've been looking for job that allows me to use R/S+ since I got out of graduate school 2 years ago but with no success. I am wondering if there is something that can be done to promote the use of R in industry. 
It's been very frustrating to see people doing statistics using Excel/SPSS, and even more frustrating to see people paying $$$ for something much inferior to R. On 2/6/07, Doran, Harold [EMAIL PROTECTED] wrote: The other day, CNN had a story on working at Google. Out of curiosity, I went to the Google employment web site (I'm not looking, but just curious). In perusing their job posts for statisticians, preference is given to those who use R and python. Other languages, S-Plus and something called SAS, were listed as lower priorities. When I started using Python, I noted they have a portion of the web site with job postings. CRAN does not have something similar, but I think it might be useful. I think R is becoming more widely used in industry, and to help it move along a bit, the maintainer of CRAN could create a section of the web site devoted to jobs where R is a requirement. Hence, we could have our own little monster.com kind of thing going on. Of the multitude of ways the gospel can be spread, this is small. But, I think every small step forward is good. Anyone think this is useful? Harold -- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
Frank -- Frank E Harrell Jr Professor and Chair School of Frank Medicine Department of Biostatistics Vanderbilt Frank University
Re: [R] XML and str
Ashley == Ashley Ford [EMAIL PROTECTED] on Wed, 07 Feb 2007 17:18:56 + writes: Ashley If I read in an .xml file, e.g. with xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package="XML")) Ashley it appears to be OK; however, examining it with str() gives an apparent Ashley error: str(xeg, 2) Ashley List of 2 Ashley $ doc:List of 3 Ashley ..$ file: list() Ashley .. ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley ..$ version :List of 4 Ashley .. ..- attr(*, "class")= chr "XMLNode" Ashley ..$ children:Error in obj$children[[...]] : subscript out of bounds Ashley I am unsure if this is a feature or a bug, and if the latter, whether it Ashley is in XML or str; it is not causing a problem but I would like to Ashley understand what is happening. Any ideas? Yes - thank you for providing a well-reproducible example. After setting options(error = recover) I do obj <- xeg$doc mode(obj) # "list" [1] "list" is.list(obj) # TRUE [1] TRUE length(obj) # 3 [1] 3 obj[[3]] # --- the error you see above. Error in obj$children[[...]] : subscript out of bounds Enter a frame number, or 0 to exit 1: obj[[3]] 2: `[[.XMLDocumentContent`(obj, 3) Selection: 0 obj$children # works; should be identical to obj[[3]] $comment <!--A comment--> $foo <foo x="1"> <element attrib1="my value"/> .. This shows that the XML package implements the [[ method wrongly IMHO, and also inconsistently with the $ method. From a strict OOP view, the XML author could argue that this is not a bug in XML but rather in str(), which assumes that x[[length(x)]] works for objects of mode list even when they are not of *class* list, but I hope he would still rather consider changing [[.XMLDocumentContent ... Martin Ashley examining components, e.g. str(xeg$doc$children, 2) Ashley List of 2 Ashley $ comment: list() Ashley ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley etc. Ashley is OK.
Ashley XML Version 1.4-1, Ashley same behaviour on Windows and Linux, R version 2.4.1 (2006-12-18)
Re: [R] Download stock prices
Hi Matthew, Matthew == Matthew Keller [EMAIL PROTECTED] on Sun, 4 Feb 2007 23:06:46 -0500 writes: Matthew Hi Mihai, You might check out the Rmetrics bundle, Matthew available on the CRAN website. I've used its Matthew fBasics library it's the fBasics *package*; a "library" is something (actually more than one thing!) different! Matthew to download stock prices. Try the Matthew yahooImport() function and the keystats() function Matthew for downloading specific stock prices. I had to Matthew fiddle with the keystats function to get it to work Matthew properly, but I wrote to the writer of the library you mean the maintainer of the *package* Matthew and it may have been fixed by now. The function is (now) called keystatsImport() and is part of 'fCalendar' -- which is automatically required from package 'fBasics'. help(keystatsImport) contains several examples, unfortunately explicitly not available through example(), but I can successfully execute all of them -- and they do work, including the yahooImport() one. Martin Maechler, ETH Zurich Matthew Best of luck, Matthew Matt Matthew On 2/4/07, Mihai Nica [EMAIL PROTECTED] wrote: Greetings: Is there any way to download a (or a sample of a) cross-section of stock market prices? Or is it possible to use get.hist.quote with a *wild card*? Thanks, mihai Mihai Nica 170 East Griffith St. G5 Jackson, MS 39201 601-914-0361 Matthew -- Matthew C Keller Postdoctoral Fellow Virginia Matthew Institute for Psychiatric and Behavioral Genetics
Re: [R] can't find mailing list commands
Hmm, I thought that learning to read was pretty wide-spread in your country... Richard == Richard Müller [EMAIL PROTECTED] on Wed, 31 Jan 2007 10:01:34 +0100 writes: Richard Sorry for the inconvenience - but since I can't find Richard any hints on this list at stat.ethz.ch, I'll ask here: Richard How do I unsubscribe from this list? Richard __ Richard R-help@stat.math.ethz.ch mailing list Richard https://stat.ethz.ch/mailman/listinfo/r-help why do you think this does *not* mention 'unsubscribe' ??? Richard PLEASE do read the posting guide ... yes, indeed. Spamming 3000 readers of R-help with your problem is rather impolite. I apologize for doing the same with my answer, but maybe it prevents similar questions, at least for the very near future. Martin Maechler, ETH Zurich
Re: [R] %*% in Matrix objects
Jose == Jose Quesada [EMAIL PROTECTED] on Sat, 27 Jan 2007 23:42:34 +0100 writes: Jose Hi Martin, Thanks for your detailed answer. Jose x <- Matrix(1:12, 3,4, sparse = TRUE) I hope that you are aware of the fact that it's not efficient at all to store a dense matrix (it has *no* 0 entry) as a sparse one.. and your posting is indeed an incentive for the Matrix developers to improve that part ... ;-) Jose Yes, the toy example is not sparse but the actual data Jose is, and very large; I'm aware that coercing a dense Jose matrix into the sparse format is not leading to any Jose saving (on the contrary). I'm talking about a real Jose application with large sparse matrices; from now on, Jose I'll post small examples using sparse matrices as well Jose to avoid confusion. ok. Jose so I tried Jose x = matrix(1:12, 3,4) Jose x = as(x, "CsparseMatrix") Jose xnorms = sqrt(colSums(x^2)) Jose xnorms = as(xnorms, "CsparseMatrix") Jose (xnormed = t(x) * (1/xnorms)) Jose But now, instead of a warning I get Jose Error: Matrices must have same dimensions in t(x) * (1/xnorms) yes. And the same happens with traditional matrices -- and well so: For arithmetic with matrices (traditional or Matrices), A o B (o in {+, -, *, /, ^, ...}) does require that matrices A and B are ``conformable'', i.e., have exactly the same dimensions. Only when one of A or B is *not* a matrix are the usual S-language recycling rules applied, and that's what you were using in your first example (Matrix * numeric) above. Jose Right. So this means that the * operator is not Jose overloaded in Matrix (that is, if I use it, I'll get Jose my Matrix coerced to matrix). Is that correct? no. The * is overloaded (read on) Jose Does this mean that there is no easy way to do element-by-element Jose multiplication without leaving the sparse Matrix format? No. There is an easy way: If you multiply (or add or ..) two sparse matrices of matching dim(), the result will be sparse.
Also if you use a scalar (length-1 vector) with a Matrix, the result remains sparse (where appropriate):

(x <- Matrix(c(0,1,0,0), 3,3))
3 x 3 sparse Matrix of class "dtCMatrix"
[1,] . . .
[2,] 1 . .
[3,] . 1 .
Warning message: data length [4] is not a sub-multiple or multiple of the number of rows [3] in matrix

(2 * x) + t(x)
3 x 3 sparse Matrix of class "dgCMatrix"
[1,] . 1 .
[2,] 2 . 1
[3,] . 2 .

((2 * x) + t(x)) * t(x)
3 x 3 sparse Matrix of class "dgCMatrix"
[1,] . 1 .
[2,] . . 1
[3,] . . .

What you tried to do, sparse * vector-of-length > 1, will only result in a sparse matrix in the next version of the Matrix package. Jose I suspect I'm facing the drop=T issue as before... why?? Jose Because when I get a row out of a Matrix object, the Jose resulting vector is not of class Matrix but numeric, Jose and then (Matrix * numeric) is applied. Jose Last, I shouldn't consider myself the most standard Jose user of the Matrix package, since my linear algebra is Jose really basic. But in any case, you should know that Jose your package is being enormously useful for me. Keep Jose up the good work. And if I can help by posting my very Jose basic questions, I'm glad to help. Ok, thanks for the flowers :-) Martin
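Martin's point about element-wise arithmetic staying sparse can be checked directly; a small sketch with the Matrix package (the exact sparse class reported in the output may differ between Matrix versions):

```r
## Element-wise arithmetic on two sparse matrices of matching
## dimensions keeps the result in a sparse class.
library(Matrix)

a <- Matrix(c(0, 1, 0, 0, 0, 2, 0, 0, 3), 3, 3, sparse = TRUE)
b <- t(a)

s <- a * b + a          # element-wise ops, dims match: stays sparse
is(s, "sparseMatrix")   # TRUE
```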
Re: [R] %*% in Matrix objects
Jose == Jose Quesada [EMAIL PROTECTED] on Fri, 26 Jan 2007 05:24:12 +0100 writes: Jose Dear R users, Jose I need to normalize a bunch of row vectors. At a certain point I need to divide a matrix by a vector of norms. I find that the behavior of Matrix objects differs from normal matrix objects. You are showing evidence for that; it's still true, but will be less so in the next release of the Matrix package. Jose Example: the following code examples differ only in Jose xnormed changing from a normal to a Matrix object: Jose x = matrix(1:12,3,4) Jose x = as(x, "CsparseMatrix") or directly x <- Matrix(1:12, 3,4, sparse = TRUE) I hope that you are aware of the fact that it's not efficient at all to store a dense matrix (it has *no* 0 entry) as a sparse one.. Jose xnorms = sqrt(colSums(x^2)) Jose (xnormed = t(x) * (1/xnorms)) Jose This produces a warning: "coercing sparse to dense matrix for arithmetic" Jose in: t(x) * 1/xnorms, but gets the result (a 4 x 3 matrix). Jose I want to stay in sparse format anyway Jose (if it helps!) what should it help for? Are you talking about a real application with a proper sparse matrix as opposed to the toy example here? In that case I agree, and as a matter of fact, the source code of the Matrix package leading to the above warning is the following } else { ## FIXME: maybe far from optimal: warning("coercing sparse to dense matrix for arithmetic") callGeneric(as(e1, "dgeMatrix"), e2) } and your posting is indeed an incentive for the Matrix developers to improve that part ... ;-) Jose so I tried Jose x = matrix(1:12,3,4) Jose x = as(x, "CsparseMatrix") Jose xnorms = sqrt(colSums(x^2)) Jose xnorms = as(xnorms, "CsparseMatrix") Jose (xnormed = t(x) * (1/xnorms)) Jose But now, instead of a warning I get Jose Error: Matrices must have same dimensions in t(x) * (1/xnorms) yes.
And the same happens with traditional matrices -- and well so: For arithmetic with matrices (traditional or Matrices), A o B (o in {+, -, *, /, ^, ...}) does require that matrices A and B are ``conformable'', i.e., have exactly the same dimensions. Only when one of A or B is *not* a matrix are the usual S-language recycling rules applied, and that's what you were using in your first example (Matrix * numeric) above. Jose If I transpose the norms, the error disappears, but the result is 1 x 4 (not 3 x 4 as before). That would be a bug of *not* giving an error... but I can't see it. Can you please give an exact example -- as you well gave otherwise; and BTW, do you use a sensible sparse matrix such as (x <- Matrix(c(0,0,1,0), 3,4)) Jose I suspect I'm facing the drop=T issue as before... why?? Jose Also, it seems that in normal matrix objects %*% Jose behaves the same as *, not at all!! If that were the case, the inventors of the S language would never have introduced '%*%'! Jose but in Matrix objects that is not the case. Jose What am I missing? I guess several things, see above, notably how basic matrix and vector arithmetic is defined in the S language. Regards, Martin Maechler, ETH Zurich
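The normalization Jose is after can be written without relying on recycling at all; a small sketch with base matrices, where sweep() makes the intended direction of the division explicit:

```r
## Normalize the columns of a plain matrix to unit Euclidean norm.
x <- matrix(1:12, 3, 4)
xnorms <- sqrt(colSums(x^2))

## Divide each column by its norm; sweep() states the margin
## explicitly instead of depending on S recycling rules.
xnormed <- sweep(x, 2, xnorms, `/`)

colSums(xnormed^2)   # each column now has squared norm 1
```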
[R] Matrix subsetting {was ... vectorized nested loop...}
Hi Jose, I'm answering your second batch of questions, since Chuck Berry has already well done so with the first one. Jose == Jose Quesada [EMAIL PROTECTED] on Tue, 23 Jan 2007 21:46:27 +0100 writes: [] Jose # example Jose library(Matrix) Jose x = as(x, "CsparseMatrix") [..] Jose Also, I have noticed that getting a row from a Matrix Jose object produces a normal array (i.e., it does not Jose inherit Matrix class). This is very much on purpose, following the principle of least surprise, so I'm surprised you're surprised.. : The 'Matrix' behavior has been modelled to follow the more than 20 years old 'matrix' behavior:

matrix(1:9, 3)[,2]
[1] 4 5 6

matrix(1:9, 3)[,2, drop=FALSE]
     [,1]
[1,]    4
[2,]    5
[3,]    6

library(Matrix)
Loading required package: lattice

Matrix(1:9, 3)[,2]
[1] 4 5 6

Matrix(1:9, 3)[,2, drop = FALSE]
3 x 1 Matrix of class "dgeMatrix"
     [,1]
[1,]    4
[2,]    5
[3,]    6

But then I should not be surprised, because there has been the R FAQ 7.5 "Why do my matrices lose dimensions?" for quite a while. *And* I think that there is only one thing in the S language about which every knowledgeable one agrees that it's a design bug, and that's the fact that 'drop = TRUE' is the default, and not 'drop = FALSE' {but it's not possible to change now, please don't start that discussion!} Given what I say above, I wonder if our (new-style) 'Matrix' objects should not behave differently from (old-style) 'matrix' and indeed use a default 'drop = FALSE'. This might break some Matrix-based code though, but then 'Matrix' is young enough, and working Matrix indexing is much younger, and there are only about 4 CRAN/Bioconductor packages depending on 'Matrix'. -- This discussion (about changing this behavior in the Matrix package) should definitely be led on the R-devel mailing list -- CC'ing to R-devel {hence one (but please *only* one!) cross-post} Jose However, selecting more than one row Jose does produce a same-class matrix.
If I convert with Jose as() the output of selecting one row, am I losing Jose performance? Is there any way to make the resulting Jose vector be a 1-D Matrix object? Yes: use ', drop = FALSE', see above. Martin
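The advice above in runnable form; a minimal sketch (the exact sparse class of the extracted row may differ between Matrix versions):

```r
## Row extraction from a sparse Matrix: drop = FALSE keeps the
## result a (1 x n) Matrix instead of collapsing to a numeric vector.
library(Matrix)

x <- Matrix(c(0, 0, 1, 0, 2, 0), 2, 3, sparse = TRUE)

r1 <- x[1, ]                 # plain numeric vector -- Matrix class dropped
r2 <- x[1, , drop = FALSE]   # still a 1 x 3 Matrix

is.numeric(r1)       # TRUE
is(r2, "Matrix")     # TRUE
dim(r2)              # 1 3
```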
Re: [R] as.numeric(.1)
NOEL == NOEL Yvonnick [EMAIL PROTECTED] on Wed, 24 Jan 2007 17:29:14 +0100 writes: NOEL Hello, NOEL I noticed the following strange behavior under R-2.4.0 (Linux Mandriva NOEL 2007): options("OutDec") NOEL $OutDec NOEL [1] "." as.numeric(".1") NOEL [1] NA NOEL Warning message: NOEL NAs introduits lors de la conversion automatique [i.e., "NAs introduced by coercion"] as.numeric(",1") NOEL [1] 0,1 Oops! Should not happen, given your getOption("OutDec"). NOEL So I need to use the comma as the decimal separator, NOEL at least as input. Moreover, the last output also uses NOEL a comma, though the OutDec option was set to NOEL ".". Basic arithmetic ops on the command line work OK NOEL with decimal dots. NOEL I am pretty sure as.numeric(".1") used to work under older versions of NOEL R. Could it be a localization problem? maybe / probably. We cannot easily reproduce. Instead of the output below, can you please give the full sessionInfo() output? NOEL I would like to use the dot as the decimal separator NOEL both for input and output. Well definitely! (Using "," is craziness in my eyes!!) NOEL Any suggestion? NOEL Thank you very much in advance, NOEL Yvonnick Noel NOEL Dept of Psychology NOEL U. of Rennes NOEL France NOEL platform i686-pc-linux-gnu NOEL arch i686 NOEL os linux-gnu NOEL system i686, linux-gnu NOEL status NOEL major 2 NOEL minor 4.0 NOEL year 2006 NOEL month 10 NOEL day 03 NOEL svn rev 39566 NOEL language R
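The usual culprit for this symptom is the LC_NUMERIC locale category, which governs as.numeric() conversion, while options("OutDec") only affects printing. A minimal diagnostic sketch (R normally keeps LC_NUMERIC at "C", and locale names vary by platform):

```r
## Diagnose decimal-separator trouble: as.numeric() obeys the
## LC_NUMERIC locale, not options("OutDec").
Sys.getlocale("LC_NUMERIC")   # should be "C" in a normal R session

as.numeric(".1")              # 0.1 when LC_NUMERIC is "C"

## printing is what options(OutDec) controls:
old <- options(OutDec = ",")
print(0.1)                    # printed with a comma: 0,1
options(old)
```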
Re: [R] Robust PCA?
BertG == Bert Gunter [EMAIL PROTECTED] on Thu, 18 Jan 2007 15:28:47 -0800 writes: BertG You seem not to have received a reply. You can use BertG cov.rob in MASS or cov.Mcd in robustbase or BertG undoubtedly others to obtain a robust covariance BertG matrix and then use that for PCA. BertG Bert Gunter Nonclinical Statistics Indeed. Thank you Bert. BTW (for the archives), do note that there is an R special interest group (=: R-SIG) on robust statistics, and a mailing list R-SIG-robust (https://stat.ethz.ch/mailman/listinfo/r-sig-robust, also for archives) with precisely the goal of fostering coordinated programming and porting of robust statistics functionality in R. Expect to see more on this topic there, within the next few days. Martin Maechler, ETH Zurich -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Talbot Katz Sent: Thursday, January 18, 2007 11:44 AM To: r-help@stat.math.ethz.ch Subject: [R] Robust PCA? Hi. I'm checking into robust methods for principal components analysis. There seem to be several floating around. I'm currently focusing my attention on a method of Hubert, Rousseeuw, and Vanden Branden (http://wis.kuleuven.be/stat/Papers/robpca.pdf), mainly because I'm familiar with other work by Rousseeuw and Hubert in robust methodologies. Of course, I'd like to obtain code for this method, or another good robust PCA method, if there's one out there. I haven't noticed the existence on CRAN of a package for robust PCA (the authors of the ROBPCA method do provide MATLAB code). -- TMK -- 212-460-5430 home 917-656-5351 cell
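Bert's suggestion in code form; a minimal sketch using MASS::cov.rob() and princomp()'s covmat argument (the built-in stackloss data is just a convenient example, not from the thread):

```r
## Robust PCA via a robust covariance estimate:
## feed the MCD covariance matrix to princomp(covmat = ...).
library(MASS)

rc <- cov.rob(stackloss, method = "mcd")   # robust location/scatter
pc <- princomp(covmat = rc$cov)            # PCA on the robust covariance

summary(pc)      # variance explained by each robust component
loadings(pc)
```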
Re: [R] Where can I get the latex format manual?
ronggui == ronggui [EMAIL PROTECTED] on Sat, 20 Jan 2007 00:26:23 +0800 writes: ronggui On www.r-project.org I can find html and pdf formats, and the ronggui R-x.x-x/doc/manual has texinfo format only. ronggui I cannot find a latex format manual. Am I missing something? Thanks for your hints. Hint: Apart from the Reference manual (all the help pages), there is none, so it's understandable you can't find it.. Texinfo is a (very simple) dialect of TeX; LaTeX is a different (much more sophisticated) dialect. BTW, I still read (and search!) these manuals in the *info* format via Emacs [C-h i ..], which (for me) is faster than anything else.. Martin Maechler, ETH Zurich
Re: [R] Problem with C extension
Markus == Markus Schmidberger [EMAIL PROTECTED] on Fri, 19 Jan 2007 15:02:31 +0100 writes: Markus Hello, Markus I am trying to write an extension in C, to get faster functions. Markus Therefore I have to add an element (vector) to a vector. The command in Markus R is very simple: x = c(x,a) aka x <- c(x,a) See, that's why you'd probably rather stick with R a bit longer, and profile [see help(Rprof)] your code and try to speed it up quite a bit before thinking about using C ... Markus But in C I have the problem of reallocating my vector to get more Markus space. Everything I tried, I get a segmentation fault. Markus So, how can I combine two vectors in C and give the result back to R Markus (return(x))? And you have a copy of "Writing R Extensions" right in front of you, electronically at least, I mean? To return the new vector via C's return() you'd definitely need to work with .Call() {which is a good thing, but really not for C beginners}, and that needs a bit of time reading the above manual from cover to cover. Note that you probably should start with (5.8) on .Call. I'm also tending to recommend even starting to peek into R's own C source, notably the header files (src/include/...), and maybe src/main/* in order to see how R objects (SEXPs) are handled, how you add names(), dimnames() etc. internally.. Hoping that helps, Martin Maechler, ETH Zurich Markus Thanks Markus Markus Schmidberger Markus -- Markus Dipl.-Tech. Math. Markus Schmidberger
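Martin's profile-first advice pays off for exactly this pattern: much of the cost is usually in growing the vector itself, which can be fixed in pure R. A minimal sketch contrasting growth via c() with pre-allocation (timings will vary by machine):

```r
## Growing a vector with x <- c(x, a) copies it on every step;
## pre-allocating and filling by index avoids the quadratic cost.
n <- 1e4

grow <- function(n) {
    x <- numeric(0)
    for (i in seq_len(n)) x <- c(x, i)   # repeated reallocation
    x
}

fill <- function(n) {
    x <- numeric(n)                      # allocate once
    for (i in seq_len(n)) x[i] <- i
    x
}

stopifnot(identical(grow(n), fill(n)))   # same result either way
system.time(grow(n))   # noticeably slower ...
system.time(fill(n))   # ... than this
```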