Re: [R] understanding eigen(): getting non-normalized eigenvectors
Eigenvectors are defined only up to a scalar constant (assuming distinct eigenvalues). However, your `by hand' answer does not pass the simple test Av = lambda v for some lambda. So you cannot reproduce incorrect answers in R! Your example is unusual: A is of rank 1. On 9 Jun 2003, Christoph Lehmann wrote: Hi, dear R pros I try to understand eigen(). I have seen, that eigen() gives the eigenvectors normalized to unit length. What shall I do to get the eigenvectors not normalized to unit length? Multiply them by any randomly chosen non-zero scalar! E.g. take the example: A [,1] [,2] V1 0.7714286 -0.2571429 V2 -0.4224490 0.1408163 Calculating eigen(A) by hand gives the eigenvectors (example from Backhaus, multivariate analysis): 0.77143 and 0.25714 -0.42245 0.14082 The second is not an eigenvector of A: try it! They look like rounded versions of A with a sign error. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] binomial GAM ROC
Hello Im a beginner on R and I would like how to develop a ROC statistic to evaluate a GAM model with a binomial distribution (Im using mgcv package) Thanks in advance -- David Nogués Bravo Functional Ecology and Biodiversity Department Pyrenean Institute of Ecology Spanish Research Council Av. Montañana 1005 Zaragoza - CP 50059 976716030 - 976716019 (fax) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] c(...) and methods
I have been writing some S4 classes and have a problem about how I might pass a signature to c(). Take the following example: setClass(collection, representation(list, date=POSIXt)) x - new(collection, list(1,2,3), date=Sys.time()) y - new(collection, list(4,5,6), date=Sys.time()) obviously, I can do c(x,y), but this wil be of class list I would like to do something like: setMethod(c, signature(collection), function(...) { if([EMAIL PROTECTED]@date) stop(at different dates!) res - c(as(e1, list),as(e2, list)) new(collection, res, date = [EMAIL PROTECTED]) }) but c() takes ... as its arguments and I don't know how I might reference that with a signature and appropriate arguments etc. Has anybody an ideas? I presume it doesn't matter that c() is .Primative(c)? Regards, John Marsland ** This is a commercial communication from Commerzbank AG.\ \ This ... {{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] ESRI shapefiles and EMME/2 packages
Hi. Currently I couldn't your uploaded packages in CRAN at Austria. Where could I get them currently? And do you know Rmap(http://www.maths.lancs.ac.uk/Software/Rmap/) which supports shape files and other geo-spatial data formats with map projection? Regards. - Original Message - From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Tuesday, June 10, 2003 6:57 AM Subject: [R] ESRI shapefiles and EMME/2 packages I just uploaded two packages to CRAN. shapefiles_0.1.tar.gz - functions to read and write ESRI shapefiles (including dbfs) emme2_0.1.tar.gz - functions to read binary data from an EMME/2 databank data (EMME/2 is a transportation modeling program) Please let me know if you find any bugs or have some suggestions. Thanks. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] SOM random seed
First, *which* SOM routine, from which package? Credit where credit is due and all that If you mean the one in class(VR), it uses the standard R random number generation. You don't need to `pass the random-seed as an argument somehow', just set the seed as you would anyway (e.g. via set.seed). For example library(MASS) lcrabs - log(crabs[, 4:8]) crabs.grp - factor(c(B, b, O, o)[rep(1:4, rep(50,4))]) gr - somgrid(topo = hexagonal) old.seed - .Random.seed ## save old seed if you want to set.seed(123) crabs.som - SOM(lcrabs, gr) .Random.seed - old.seed ## restore the seed On Tue, 10 Jun 2003, Jonck van der Kogel wrote: I have a question about the SOM routine. You can either supply the initial representatives for the lattice yourself or else they are chosen randomly from the dataset. Is it possible to pass the random-seed as an argument somehow, when choosing the random initialisation of the lattice? As it is now, each time I run a SOM on a dataset with the same settings the resulting SOM will still be slightly different due to the random initialisation. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Regression output labels
Hello to all- 1. When I run a regression which implements the augmented Dickey-Fuller test, I am confused about the names given to the regressors in the output. I understand what xGE stands for in a standard lm test involving an independent variable GE for instance, but if I lags and or differences are included in the model, what do the following output stand for: xlag(x,-1)GE xD.GE xD.lag(diff(x), -i)GE xD.D.lag(diff(x), -i)GE Thanks for the clarifications -- I don't want to misspeculate on the actual interpretations, here... Hello rwatkins, I assume that you are referring to the function: http://www.econ.uiuc.edu/~econ472/adf.R.txt adf- function(x, L = 2, int = T, trend = T) { #Construct Data for Augmented Dickey Fuller Model with L lags. #This is a modified version for R, in which the command rts was substituted by ts. x - ts(x) D - diff(x) if(L 0) { for(i in 1:L) D - ts.intersect(D, lag(diff(x), - i)) } D - ts.intersect(lag(x, -1), D) if(trend == T) D - ts.intersect(D, time(x)) y - D[, 2] x - D[, -2] if(int == T) summary(lm(y ~ x)) else summary(lm(y ~ x - 1)) } As you can see by the code the lagged differences of the series are produced repetitively in the for-loop; the function argument L sets the maximum number of lagged differences of the time series to be tested for unit root. After the for-loop the one period lag of the original variable is included to D as first columne. In case of a trend inclusion, this variable is included as last columne: xD.lag(x, -1): is the one period lag of the original series; xD.D.D.lag(diff(x), -i): is the one period lag of the differenced series; xD.D.lag(diff(x), -i): is the two period lag of the differenced series and xtime(x): is the linear time trend. Pls. note, that you should include as many lagged differenced x series (function argument L) until the residuals of the test regression do not contain any significant autocorrelations (check their acfs and/or pacfs). Also... 2. When an Engle-Granger test is run on multiple independent variables, only one cointegration vector is returned. Can one tell which vector -- or what two variables' relationship -- is being identified for the R output. Likewise, if I run a Johansen test, does R tell me To which function are you referring in particular? specifically which pairs of variables are cointegrated or do I just get the rank? Thanks to all for your time and consideration. If you apply the Engle-Granger Two-Step-procedure in case of more than two I(1) variables and if more than one cointegration-relationship exists you are estimating a linear combination of the latter. In case you apply a Johansen test (rank- or trace-test) you are only testing the space of cointegrating vectors. In order to draw inferences about the cointegration-relationships you must test these specifically. As a suggestion for reading and further elaboration let me allow you to hint you to the following literature: 1) Hamilton, James D. (1994). Time Series Analysis. Princeton N.J.: Princeton University Press. 2) Engle, Robert F., and Clive W. J. Granger. (1987). Co-Integration and Error Correction: Representation, Estimation and Testing. Econometrica. Vol. 55, pp. 251-276. 3) Johansen, Søren. (1991). Estimation and Hypotheses Testing of Cointegrating Vectors in Gaussian Vector Autoregressive Models. Econometrica. Vol. 59, pp. 1551-1580. 4) Johansen, Søren. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press. HTH, Bernhard __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- If you have received this e-mail in error or wish to read our e-mail disclaimer statement and monitoring policy, please refer to http://www.drkw.com/disc/email/ or contact the sender. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] fitting data to exponential distribution with glm
On Tuesday 10 June 2003 17:31, Masayoshi Hayashi wrote: (B I am learning glm function, but how do you fit data using exponential (B distribution with glm? (B (BThe Gamma family is parametrised in glm() by two parameters: (Bmean and dispersion; the "dispersion" regulates the shape. (B (BSo must fit a GLM with the Gamma family, and then produce a "summary" (Bwith dispersion parameter set equal to 1, since this value (Bcorresponds to the exponential distribution in the Gamma family. (B (BIn practice: (B (Bfit - glm(formula =..., family = Gamma) (Bsummary(fit,dispersion=1) (B (B (Bbest wishes, (B (BAdelchi Azzalini (B (B-- (BAdelchi Azzalini [EMAIL PROTECTED] (BDipart.Scienze Statistiche, Universit(B?(B di Padova, Italia (Bhttp://azzalini.stat.unipd.it/ (B (B__ (B[EMAIL PROTECTED] mailing list (Bhttps://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] fitting data to exponential distribution with glm
An exponential distribution is a gamma distribution, and as far as fitting the MLE of the coefficients all gammas give the same MLEs. (You can specify the dispersion and hence that the gamma is exponential when asking for summaries, anova, etc.) On Wed, 11 Jun 2003, Masayoshi Hayashi wrote: I am learning glm function, but how do you fit data using exponential distribution with glm? In the help file, under Family Objects for Models, no ready made option seems available for the distribution as well as for other distributions satisfying GLM requirements not listed there. Which ones did you have in mind? The only other commonly used distribution which gives a glm is the negative binomial with fixed shape, for which see the MASS book and package. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] estimate the number of clusters
MM == Martin Maechler [EMAIL PROTECTED] on Tue, 10 Jun 2003 18:12:36 +0200 writes: MM Ping, you found another bug in silhouette.default() -- MM which can happen when there's one cluster with exactly MM one observation. MM I'll let you know more, once I have a complete fix. The patch for this bug {against an *installed* version of cluster} is this : --- --- cluster-version-1.7-2/library/cluster/R/cluster Thu Jun 5 04:00:15 2003 +++ fixed/cluster/R/cluster Tue Jun 10 18:56:17 2003 @@ -2019,11 +2019,11 @@ wds[iC, cluster] - j a.i - if(Nj 1) colSums(dmatrix[iC, iC])/(Nj - 1) else 0 # length(a.i)= Nj ## minimal distances to points in all other clusters: -diC - rbind(apply(dmatrix[!iC, iC], 2, +diC - rbind(apply(dmatrix[!iC, iC, drop = FALSE], 2, function(r) tapply(r, x[!iC], mean)))# (k-1) x Nj minC - max.col(-t(diC)) wds[iC,neighbor] - clid[-j][minC] -b.i - diC[cbind(minC, seq(minC))] +b.i - diC[cbind(minC, seq(along = minC))] s.i - (b.i - a.i) / pmax(b.i, a.i) wds[iC,sil_width] - s.i } --- i.e. you add , drop = FALSE in line 2022 and along =in line 2026 in the appropriate places. A fixed version of cluster should appear soon, and also together with R 1.7.1. Martin Maechler [EMAIL PROTECTED] http://stat.ethz.ch/~maechler/ Seminar fuer Statistik, ETH-Zentrum LEO C16Leonhardstr. 27 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-3408 fax: ...-1228 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] estimating a density by selecting the bandwidth
I´ve a data set and i want fit a kernel density estimate to the data. but using the k-nearest neighbour method. How i do this with R. thanks -- [EMAIL PROTECTED] -- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] estimating a density by selecting the bandwidth
On Tue, 10 Jun 2003, Rafael Bertola wrote: I´ve a data set and i want fit a kernel density estimate to the data. but using the k-nearest neighbour method. How i do this with R. Well, you define the exact algorithm you want to use, and then you program it, R being a fully-featured programming language. Your description is exceedingly vague: if you are hoping for an already implemented solution you might take a look at packge locfit (but that is only a rough fit to your `description'. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] color coding a legend
I'm using R 1.6.2 on a Windows 2000 machine. I've plotted the results of an MDS run labeled by a numerical ID, and color coded by a group code: plot(cv.mds.spr$points, type=n, main=Non-Metric Multidimensional Scaling of SprRun CV Watersheds) text(cv.mds.spr$points, labels = as.character(cv.wshed.id.spr), col = codes(cv.wshed.grp), cex=.75) Question is, how do I get the legend to match the color codes? I have tried different permutations of the following: leg.txt - c(LSSJ.NS,LSSJ.SS,US.RD,US.SF) # the groups in cv.wshed.grp legend(-6.5, -2.5, leg.txt, pch=1234, col= as.character(codes(cv.wshed.grp))) But this only plots the codes in cv.wshed.grp as they are encountered, not the levels. Ideally I'd like the legend to have the Group ID label, and a filled box corresponding to the colors in the text call above. -- Rob Schick Ecologist NOAA Fisheries Santa Cruz Lab 110 Shaffer Road Santa Cruz, CA 95060 Phone: 831.420.3960 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Rounding problem R vs Excel
Hi all, I thought that I would follow up on this thread with some refined information. As I mentioned in a prior post in this thread, I posted a query to the OOo folks regarding the inability to replicate the IEEE representation issue in OOo Calc. Recall the following results: OOo Calc 1.0.2 and 1.1 Beta2: Cell Formula Value = 4.145 * 100 + 0.5 4.1500E+02 = 0.5 - 0.4 - 0.1 0.E+00 =(0.5 - 0.4 - 0.1)0.E+00 MS Excel 2002 (XP) Cell Formula Value = 4.145 * 100 + 0.5 4.1500E+02 = 0.5 - 0.4 - 0.1 0.E+00 =(0.5 - 0.4 - 0.1)-2.7755575615628900E-17 Gnumeric 1.0.12: Cell Formula Value = 4.145 * 100 + 0.5 +4.14943157E+02 = 0.5 - 0.4 - 0.1 -2.77555756156289135106E-17 *Gnumeric does not appear to allow the surrounding parens. R 1.7.1 Beta: print(4.145 * 100 + 0.5, digits = 20) [1] 414.94 formatC(4.145 * 100 + 0.5, format = E, digits = 20) [1] 4.14943157E+02 print(0.5 - 0.4 - 0.1, digits = 20) [1] -2.775557561562891e-17 formatC(0.5 - 0.4 - 0.1, format = E, digits = 20) [1] -2.77555756156289135106E-17 Today, I received a reply from Niklas Nebel at Sun to indicate that indeed OOo has included the same (or perhaps more correctly, a similar) optimization that MS incorporated into Excel starting with Excel 97, which renders results close to zero as zero. In OOo Calc 1.1 Beta, Niklas indicated that the relevant source code is in: sal/inc/rtl/math.hxx Since the full source tarball is rather large, a subset of the relevant source code from that file is below. Note the use of the approxEqual() function as the basis for subsequent arithmetic. This code should, in theory, offer some insight into what Excel is doing as well, with the exception of course of the formula parsing issue, which is clearly a bug. HTH, Marc Schwartz *** Copyright: 2002 by Sun Microsystems, Inc. /** Test equality of two values with an accuracy of the magnitude of the given values scaled by 2^-48 (4 bits roundoff stripped). @ATTENTION approxEqual( value!=0.0, 0.0 ) _never_ yields true. */ inline bool approxEqual(double a, double b) { if ( a == b ) return true; double x = a - b; return (x 0.0 ? -x : x) ((a 0.0 ? -a : a) * (1.0 / (16777216.0 * 16777216.0))); } /** Add two values. If signs differ and the absolute values are equal according to approxEqual() the method returns 0.0 instead of calculating the sum. If you wanted to sum up multiple values it would be convenient not to call approxAdd() for each value but instead remember the first value not equal to 0.0, add all other values using normal + operator, and with the result and the remembered value call approxAdd(). */ inline double approxAdd(double a, double b) { if ( ((a 0.0 b 0.0) || (b 0.0 a 0.0)) approxEqual( a, -b ) ) return 0.0; return a + b; } /** Substract two values (a-b). If signs are identical and the values are equal according to approxEqual() the method returns 0.0 instead of calculating the substraction. */ inline double approxSub(double a, double b) { if ( ((a 0.0 b 0.0) || (a 0.0 b 0.0)) approxEqual(a, b) ) return 0.0; return a - b; } /** floor() method taking approxEqual() into account. Use for expected integer values being calculated by double functions. @ATTENTION The threshhold value is 3.55271e-015 */ inline double approxFloor(double a) { double b = floor( a ); // The second approxEqual() is necessary for values that are near the limit // of numbers representable with 4 bits stripped off. (#i12446#) if ( approxEqual( a - 1.0, b ) !approxEqual( a, b ) ) return b + 1.0; return b; } /** ceil() method taking approxEqual() into account. Use for expected integer values being calculated by double functions. @ATTENTION The threshhold value is 3.55271e-015 */ inline double approxCeil(double a) { double b = ceil( a ); // The second approxEqual() is necessary for values that are near the limit // of numbers representable with 4 bits stripped off. (#i12446#) if ( approxEqual( a + 1.0, b ) !approxEqual( a, b ) ) return b - 1.0; return b; } __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Bootstraping with MANOVA
It means what it says! The residuals from the manova fit have a degenerate distribution: that's a problem with bootstrapping. I don't think you've done this correctly: assuming you are intending to bootstrap residuals you seem to have resampled the independent variable and not added back the mean contribution. Compare the examples in MASS4 or Davison Hinkley. Also, do remember you need to show the bootstrap is valid in each scenario: it is not universally valid and this one looks dodgy to me. On Wed, 11 Jun 2003, Ko-Kang Kevin Wang wrote: Hi, Does anyone know what the error message mean? Boot2.Pillai - function(x, ind) { + x - as.matrix(x[,2:ncol(x)]) + boot.x - as.factor(x[ind, 1]) + boot.man - manova(x ~ boot.x) + summary(manova(boot.man))[[4]][[3]] + } man.res - manova(as.matrix(pl.nosite) ~ + as.factor(plankton.new[,1]))$residuals boot2.plank - cbind(plankton.new[, 1], man.res) boot.sep - boot(boot2.plank, Boot2.Pillai, R = 1000, + strata = plankton.new[, 1]) Error in summary.manova(manova(boot.man)) : residuals have rank 5 6 Execution halted A sample of plankton.new is as follows: plankton.new[sample(dim(plankton.new)[1], 5, replace = TRUE),] site ACARTIA EUTERP GLADIO TEMORA FAVELLA OIKOPL 15M 2.326336 3.168792 0.00 0.00 3.854852 0 47W 2.699838 2.276462 1.799341 2.495544 2.274158 0 33W 2.274158 3.301247 0.00 0.00 0.00 0 8 M 2.875640 2.796574 0.00 0.00 3.051538 0 4 M 2.100371 2.796574 0.00 0.00 2.100371 0 pl.nosite is a data frame like plankton.new, but without the site column. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] speeding up 1000s of coxph regression?
I have a gene expression matrix n (genes) X p (cases), where n = 8000 and p = 100. I want to fit each gene as univariate in a coxph model, i.e., fitting 8000 models. I do something like this: res - apply(data, 1, coxph.func) which takes about 4 min, not bad. But I need to do large numbers of permutations of the data (permuting the columns), for example, 2000, which would take 5 days. I would like to know if there is way to speed this up? Any help appreciated. Xiao-Jun __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] speeding up 1000s of coxph regression?
On Tue, 10 Jun 2003, Xiao-Jun Ma wrote: I have a gene expression matrix n (genes) X p (cases), where n = 8000 and p = 100. I want to fit each gene as univariate in a coxph model, i.e., fitting 8000 models. I do something like this: res - apply(data, 1, coxph.func) which takes about 4 min, not bad. But I need to do large numbers of permutations of the data (permuting the columns), for example, 2000, which would take 5 days. I would like to know if there is way to speed this up? Calling coxph.fit directly would likely be faster. Also, you probably don't need to do 2000 permutations on all 8000 genes: a few hundred permutations is probably enough to decide that most of the genes aren't interesting. If you are going to be doing a lot of this sort of thing it might be worth looking at the parallel processing facilities in the `snow' package. There's a description of their use in another gene expression problem in the new R Newsletter. -thomas __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Re: color coding a legend - solved?
I did the following, and had it plot as needed: legend(3.25, -2.5, leg.txt, pch=15, col= levels(as.factor(codes(cv.wshed.grp Problem was in my (lack of) understanding of codes() But my second question is this: is there a way to color code the points on the first two PCA axes by group id within a biplot. Fig 11.7 in MASS 4 is my inspiration, but I don't understand how to use the col variable within biplot. The following works within a simple plot, but not a biplot: text(cv.mds.spr$points, labels = as.character(cv.wshed.id.spr), col = levels(as.factor(codes(cv.wshed.grp))), cex=.75) Advice? Robert Schick wrote: I'm using R 1.6.2 on a Windows 2000 machine. I've plotted the results of an MDS run labeled by a numerical ID, and color coded by a group code: plot(cv.mds.spr$points, type=n, main=Non-Metric Multidimensional Scaling of SprRun CV Watersheds) text(cv.mds.spr$points, labels = as.character(cv.wshed.id.spr), col = codes(cv.wshed.grp), cex=.75) Question is, how do I get the legend to match the color codes? I have tried different permutations of the following: leg.txt - c(LSSJ.NS,LSSJ.SS,US.RD,US.SF) # the groups in cv.wshed.grp legend(-6.5, -2.5, leg.txt, pch=1234, col= as.character(codes(cv.wshed.grp))) But this only plots the codes in cv.wshed.grp as they are encountered, not the levels. Ideally I'd like the legend to have the Group ID label, and a filled box corresponding to the colors in the text call above. -- Rob Schick Ecologist NOAA Fisheries Santa Cruz Lab 110 Shaffer Road Santa Cruz, CA 95060 Phone: 831.420.3960 -- Rob Schick Ecologist NOAA Fisheries Santa Cruz Lab 110 Shaffer Road Santa Cruz, CA 95060 Phone: 831.420.3960 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Getting graphs into LaTeX
Hi, A, this is G -- please let me know how to reach you! [[alternate HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Multiple match function?
Hi all, I have (yet another) question about a function in R. What I would like to do is test for the presence of a certain value in a vector, and have the positions that this value is at returned to me. For example, let's say I have a vector: x - c(1,1,2,2,3,3,4,4) Now I would like a function that would return positions 3 and 4 should I test for the value 2. Or 5 and 6 should I test for 3. Could someone please tell me how I should do this? The match function only returns the first position that a value is found at. Of course I could write my own function that loops through the vector and tests for the presence of each value manually but it seems likely that a function that does this is already present in R. No need to re-invent the wheel :-) Thanks very much, Jonck __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Does the RPM for RH9 know about TCL/Tk
Sorry for the probable repeat post but I can only search the list up to 2002 (is there a better way?). I am using the RH9 RPM from CRAN but packages like AnalyzeFMRI say that tcltk is not found. Do I need to do more to get Tk GUIs working on RH9 or does the RPM not have tcltk support built in (should I compile from source). Thanks in advance. Take care. -Morgan __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Does the RPM for RH9 know about TCL/Tk
On 06/10/03 18:30, Morgan Hough wrote: Sorry for the probable repeat post but I can only search the list up to 2002 (is there a better way?). Yes, see my search page below. I am using the RH9 RPM from CRAN but packages like AnalyzeFMRI say that tcltk is not found. Do I need to do more to get Tk GUIs working on RH9 or does the RPM not have tcltk support built in (should I compile from source). Thanks in advance. There was in fact some discussion of this last month. I am not sure of the answer. But I installed 1.7.0 from the RPM for RH 9, and I got the same error message when trying to get Rcmdr to work. I did have tcl and tk installed. Unfortunately, I did not do a properly controlled experiment. I first installed tcllib, which was not installed originally. (That didn't help, by itself.) Then I re-installed R _from source_ and then everything worked. But I did have the basic vanilla installation of RH 9, and I did have this problem. So you aren't the only one. I still don't know whether tcllib is necessary, and whether the RPM itself installs different things depending on what is on the system. (I would assume not, but I'm not sure.) -- Jonathan Baron, Professor of Psychology, University of Pennsylvania R page: http://finzi.psych.upenn.edu/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Multiple match function?
On 06/11/03 03:28, Jonck van der Kogel wrote: Hi all, I have (yet another) question about a function in R. What I would like to do is test for the presence of a certain value in a vector, and have the positions that this value is at returned to me. For example, let's say I have a vector: x - c(1,1,2,2,3,3,4,4) Now I would like a function that would return positions 3 and 4 should I test for the value 2. Or 5 and 6 should I test for 3. one way is which(x==2) I'm sure there are others. -- Jonathan Baron, Professor of Psychology, University of Pennsylvania R page: http://finzi.psych.upenn.edu/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] qwilcox
The function 'wilcox.test' in R and S gives (almost) identical results (see below). 'qwilcox' however, does not: qwilcox(p,5,5) p: 0.025 0.975 R 3 22 S 18 37 I originally wanted to ask a questions, but then I found the answer. Given the confusion I run into, I wonder if this experience is worth reporting. The S-Plus quantiles are almost correct (they are the limits of the region of acceptance, rather than the quantiles). The description in the R help file Distribution of the Wilcoxon Rank Sum Statistic suggests that R:qwilcox also gives quantiles for the rank sum (which the Wilcoxon rank sum test is based on). In fact, however, it gives quantiles for the u-statistic (which the Mann-Whitney test is based upon). While the tests are logically equivalent, the particular test statistics - sum(Xic(X,Y))rank sum (Wilcoxon) - sum(Xic( Y))u statistic (Mann-Whitney) are different (apologies for the non-standard notation). Since wilcox.test relates to the rank sums in both R and S, as does qwilcox in S, the name 'qwilcox' in R may be misleading. How about renaming it to 'qmannwhitney' instead and adding 'qwilcoxon' for a function that corresponds to S:qwilcox? x1 - c(1,2,3, 5,6 ) x2 - c( 4,7,8,9,10) sum(x1) [1] 17 sum(x2) [1] 38 R wilcox.test(x1,x2,alternative=two.sided) Wilcoxon rank sum test: p-value = 0.03175 R wilcox.exact(x1,x2,alternative=two.sided) Exact Wilcoxon rank sum test: p-value = 0.03175 S wilcox.test(x1,x2,alternative=two.sided) Exact Wilcoxon rank-sum test: p-value = 0.0317 x1 - c(1,2, 4,5,6 ) x2 - c(3, 7,8,9,10) sum(x1) [1] 18 sum(x2) [1] 37 R wilcox.test(x1,x2,alternative=two.sided) Wilcoxon rank sum test: p-value = 0.05556 R wilcox.exact(x1,x2,alternative=two.sided) Exact Wilcoxon rank sum test: p-value = 0.05556 S wilcox.test(x1,x2,alternative=two.sided) Exact Wilcoxon rank sum test: p-value = 0.0556 Knut M. Wittkowski, PhD,DSc -- The Rockefeller University, GCRC 1230 York Ave #121B, Box 322, NY,NY 10021 +1(212)327-7175, +1(212)327-8450 (Fax) [EMAIL PROTECTED] http://www.rucares.org/statist/ __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] speeding up 1000s of coxph regression?
Thomas Lumley [EMAIL PROTECTED] writes: If you are going to be doing a lot of this sort of thing it might be worth looking at the parallel processing facilities in the `snow' package. There's a description of their use in another gene expression problem in the new R Newsletter. Actually, they used the RPVM package directly; however, Thomas is still correct, it probably would be simple to recast using SNOW. Some hints and details can be found in a tech report by Luke Tierney, Michael Li, and myself in the UW Biostat tech report series (can't recall which #, but it's on http://www.bepress.com/uwbiostat/). best, -tony -- A.J. Rossini / [EMAIL PROTECTED] / [EMAIL PROTECTED] Biomedical/Health Informatics and Biostatistics, University of Washington. Biostatistics, HVTN/SCHARP, Fred Hutchinson Cancer Research Center. FHCRC: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email CONFIDENTIALITY NOTICE: This e-mail message and any attachments ... {{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help