Re: [R] PCA IN R
prcomp() in stats handles matrices with n << p well, IMO. -- Bjørn-Helge Mevik __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
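For anyone trying this, a minimal sketch (with made-up data) of what prcomp() returns when n << p:

```r
## Made-up wide data: n = 10 observations, p = 100 variables (n << p)
set.seed(1)
X <- matrix(rnorm(10 * 100), nrow = 10)
pca <- prcomp(X)
length(pca$sdev)   ## min(n, p) = 10 components are returned
dim(pca$rotation)  ## 100 x 10
```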
Re: [R] xeon processor and ATLAS
Jeffrey J. Hallman wrote: I've been doing econometrics for nearly 20 years, and have not yet run across a situation that called for looking at a 1000 x 1000 matrix. I tend not to believe analyses with more than a dozen explanatory variables. In NIR spectroscopy, it is common to have at least 1000 variables, and in NMR or MS, you easily get spectra with 2 variables. You seldom get 1000 spectra, though. :-) 10 to 100 is more common. -- B/H
Re: [R] Streamlining Prcomp Data
Try this:
  result <- summary(prcomp(USArrests))
  names(result)
  M <- result$importance
  M[2,]
The labels are the dimnames of the importance matrix. They only show up when the matrix is printed. If you wish, you can remove them with dimnames(M) <- NULL. -- Bjørn-Helge Mevik
Re: [R] about R, RMSEP, R2, PCR
Nitish Kumar Mishra wrote: I want to calculate PLS package in R. Now I want to calculate R, MSEP, RMSEP and R2 of PLSR and PCR using this. I also add this in library of R. How I can calculate R, MSEP, RMSEP and R2 of PLSR and PCR in R. I s any other method then please also suggest me. Simply I want to calculate these value. I'm not entirely sure what you are asking about, but if you want to calculate R, MSEP, RMSEP and R2 for PLSRs and PCRs with the pls package, this should work:
  library(pls)
  data(yarn)
  mymodel <- plsr(density ~ NIR, ncomp = 10, data = yarn)  # or pcr()
See ?plsr for further options, especially 'validation' for using cross-validation.
  ## R2:
  R2(mymodel)
  ## MSEP:
  MSEP(mymodel)
  ## RMSEP:
  RMSEP(mymodel)
See ?R2, etc. for further arguments, especially 'estimate' for selecting the estimator (test set, CV, or train). The objects returned by these functions have a plot method, so plot(RMSEP(...)) does what you'd expect. See ?R2 for details about the returned objects. To get R (I presume you mean the correlation between predicted and measured values), you can use sqrt(R2(mymodel)$val) -- HTH, Bjørn-Helge Mevik
Re: [R] R-About PLSR
Nitish Kumar Mishra wrote: I have installed PLS package in R and use it for princomp prcomp commands for calculating PCA using its example file(USArrests example). Uhm. These functions and data sets are not in the pls package; they are in the stats and datasets packages that come with R. But How I can use PLS for Partial least square, R square, mvrCv one more think how i can import external file in R. When I use plsr, R2, RMSEP it show error could not find function plsr, RMSEP etc. How I can calculate PLS, R2, RMSEP, PCR, MVR using pls package in R. There is an Rnews article describing the package¹, and a paper in Journal of Statistical Software². ¹Mevik, B.-H. (2006); The pls package; R News 6(3), 12-17. http://cran.r-project.org/doc/Rnews ²Mevik, B.-H., Wehrens, R. (2007); The pls Package: Principal Component and Partial Least Squares Regression in R; Journal of Statistical Software 18(2), 1--24. http://www.jstatsoft.org/v18/i02/v18i02.pdf -- Bjørn-Helge Mevik
Re: [R] About PLR
From within R, you can give the command install.packages("pls") and R will download and install it for you (as long as you have access to the Internet). To install an already downloaded package, you can use R CMD INSTALL pls_2.0-0.tar.gz in a terminal window. -- Bjørn-Helge Mevik
Re: [R] PCA (prcomp) details info.
Francesco Savorani wrote: I'm handling a matrix dataset composed of a number of variables much higher than the objects (900 vs 100) and performing a prcomp (centered and scaled) PCA on it. What I get is a Loadings (rotation) matrix limited by my lower number of objects and thus 900x100 instead of 900x900. If I try to manually calculate the score matrix by multiplying the original variables (centered and scaled) by such a loadings matrix I cannot obtain the same values calculated by R and stored in the prcomp$x matrix (100x100). This works for me:
  M <- matrix(rnorm(900*100), ncol = 900)
  pca <- prcomp(M, scale = TRUE)
  S <- scale(M) %*% pca$rotation
  all.equal(S, pca$x)  ## TRUE
-- Bjørn-Helge Mevik
[R] [R-pkgs] pls version 2.0-0
Version 2.0-0 of the pls package is now available on CRAN. The pls package implements partial least squares regression (PLSR) and principal component regression (PCR). Features of the package include
- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef, plot and summary
- Functions for extraction of scores and loadings, and calculation of (R)MSEP and R2
- Functions for plotting predictions, validation statistics, coefficients, scores, loadings, biplots and correlation loadings.
The main changes since 1.2-0 are
- There is now an options mechanism for selecting default fit algorithms. See ?pls.options.
- loadingplot() and coefplot() now try to be more intelligent when plotting x axis labels.
- The handling of factors in X has been improved, by changing the way the intercept is removed from the model matrix.
- All PLSR and PCR algorithms, as well as mvrCv(), have been optimised. Depending on the algorithm used, the size of the matrices, and the number of components used, one can expect from 5% to 65% reduction in computation time.
- Scaling of scores and loadings of the kernel PLS and svd PCR algorithms has changed. They are now scaled using the `classic' scaling found in oscorespls.
- The argument `ncomp' now always means number of components, and `comps' always means component number. The argument `cumulative' has been removed.
- A new data set 'gasoline' has been included.
- The 'NIR' and 'sensory' data sets have been renamed to 'yarn' and 'oliveoil'.
See the file CHANGES in the sources for all changes. -- Bjørn-Helge Mevik ___ R-packages mailing list R-packages@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Why the factor levels returned by cut() are not ordered?
Wolfram Fischer wrote: What is the reason, that the levels of the factor returned by cut() are not marked as ordered levels? I don't know, but you can always make it ordered with ordered(cut(sample(10), breaks = 3)) help(factor) ... If 'ordered' is 'TRUE', the factor levels are assumed to be ordered. ... The help file for factor() probably doesn't tell you much about how cut() works. :-) -- Bjørn-Helge Mevik
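A small sketch (with made-up data) of the wrapping suggested above:

```r
## cut() returns an unordered factor; wrap it in ordered() if you need order
set.seed(1)                      # made-up data
x <- sample(10)
f <- cut(x, breaks = 3)
is.ordered(f)                    ## FALSE
g <- ordered(f)
is.ordered(g)                    ## TRUE
identical(levels(f), levels(g))  ## TRUE: the levels themselves are unchanged
```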
Re: [R] Look for a R Package to locate peak data point in one dimensional data set
I think the Bioconductor package PROcess has functions for that. -- Bjørn-Helge Mevik
Re: [R] can lm() automatically take in the independent variables without knowing the names in advance
HelponR wrote: I am trying to use lm to do a simple regression but on a batch of different files. Each file has different column names. I know the first column is the dependent variable and all the rest are explanatory variables. I believe lm(data = thedataframe) (i.e. with no formula!) will use the first column as response and the rest as predictors. -- Bjørn-Helge Mevik
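If calling lm() without a formula does not work in your version of R, a sketch of a more explicit alternative (the data frame here is made up) is to build the formula from the column names:

```r
## Build 'first column ~ all the others' explicitly with reformulate()
set.seed(1)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
f <- reformulate(names(d)[-1], response = names(d)[1])
f          ## y ~ x1 + x2
fit <- lm(f, data = d)
coef(fit)  ## intercept plus one coefficient per predictor
```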
Re: [R] A statement over multiple lines (i.e. the ... feature in Matlab)
Robin Hankin wrote: For the line breaking, R deals with incomplete lines by not executing the statement until you finish it. Beware, however, that syntactically valid lines do get executed immediately (at least at the prompt). So
  1 + 2
  - 3
will be interpreted as two commands (returning 3 and -3, respectively), while
  1 + 2 -
  3
will be interpreted as a single command (returning 0). -- Bjørn-Helge Mevik
[R] [R-pkgs] New package ffmanova for 50-50 MANOVA released
Version 0.1-0 of a new package `ffmanova' is now available on CRAN. Comments, suggestions, etc. are welcome. Please use the email address ffmanova (at) mevik.net. The package implements 50-50 MANOVA (Langsrud, 2002) with p-value adjustment based on rotation testing (Langsrud, 2005). The 50-50 MANOVA method is a modified variant of classical MANOVA made to handle several highly correlated responses. Classical MANOVA performs poorly in such cases and it collapses when the number of responses exceeds the number of observations. The 50-50 MANOVA method is suggested as a general method that will handle all types of data. Principal component analysis is an integrated part of the algorithm. The single response special case is ordinary general linear modeling. Type II sums of squares are used to handle unbalanced designs (Langsrud, 2003). Furthermore, the Type II philosophy is extended to continuous design variables. This means that the method is invariant to scale changes. Centering of design variables is not needed. The Type II approach ensures that common pitfalls are avoided. A univariate F-test p-value for each response can be reported when several responses are present. However, with a large number of response variables, these results are questionable since we will expect a lot of type I errors (incorrect significance). Therefore the p-values need to be adjusted. By using rotation testing it is possible to adjust the single response p-values according to the familywise error rate criterion in an exact and non-conservative (unlike Bonferroni) way. It is also possible to adjust p-values according to a false discovery rate criterion. Our method is based on rotation testing and allows any kind of dependence among the responses (Moen et al., 2005). Note that rotation testing is closely related to permutation testing. One difference is that rotation testing relies on the multinormal assumption. All the classical tests (t-test, F-test, Hotelling T^2 test, ...) 
can be viewed as special cases of rotation testing.
REFERENCES
Langsrud, Ø. (2002), 50-50 Multivariate Analysis of Variance for Collinear Responses, Journal of the Royal Statistical Society SERIES D - The Statistician, 51, 305-317.
Langsrud, Ø. (2003), ANOVA for Unbalanced Data: Use Type II Instead of Type III Sums of Squares, Statistics and Computing, 13, 163-167.
Langsrud, Ø. (2005), Rotation Tests, Statistics and Computing, 15, 53-60.
Moen, B., Oust, A., Langsrud, Ø., Dorrell, N., Gemma, L., Marsden, G.L., Hinds, J., Kohler, A., Wren, B.W. and Rudi, K. (2005), An explorative multifactor approach for investigating global survival mechanisms of Campylobacter jejuni under environmental conditions, Applied and Environmental Microbiology, 71, 2086-2094.
-- Bjørn-Helge Mevik and Øyvind Langsrud
Re: [R] Adding Grid lines
Perhaps ?grid will help you. -- Bjørn-Helge Mevik
Re: [R] princomp and eigen
Murray Jorgensen wrote:
  set.seed(160706)
  X <- matrix(rnorm(40), nrow = 10, ncol = 4)
  Xpc <- princomp(X, cor = FALSE)
  summary(Xpc, loadings = TRUE, cutoff = 0)
  Importance of components:
                         Comp.1    Comp.2    Comp.3     Comp.4
  Standard deviation  1.2268300 0.9690865 0.7918504 0.55295970
[...] I would have expected the princomp component standard deviations to be the square roots of the eigen() $values and they clearly are not. It's a 1/n vs. 1/(n-1) thing:
  eX <- eigen(var(X))
  sqrt(eX$values)
  [1] 1.2931924 1.0215069 0.8346836 0.5828707
  sqrt(9/10 * eX$values)
  [1] 1.2268300 0.9690865 0.7918504 0.5529597
-- Bjørn-Helge Mevik
Re: [R] problem with code/documentation mismatch
Just an idea: how about using the \usage for the formal syntax, and \synopsis for the user syntax, i.e. x/y ? Not sure it will work, but it might be worth a try... :-) -- Bjørn-Helge Mevik
Re: [R] R ``literal'' comand
capture.output(...) If you want a single string, with newlines: paste(capture.output(...), collapse = "\n") -- Bjørn-Helge Mevik
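For instance (using a built-in data set purely for illustration):

```r
## capture.output() returns one string per printed line...
out <- capture.output(summary(cars))
is.character(out)
## ...which paste(collapse = "\n") joins into a single string
txt <- paste(out, collapse = "\n")
length(txt)  ## 1
cat(txt)
```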
Re: [R] Status of data.table package
Liaw, Andy wrote: I don't see it on CRAN, either, nor could I find mention of it in the R News you cited. p. 66 -- Bjørn-Helge Mevik
Re: [R] PLS
jivan parab wrote: can u please give me the code written in matlab for partial least square regression This is an email list about R, not about Matlab. will you pls provide me the code written in matlab and also the exlanatiion to each step You not only want someone to do the work for you, but explain what they did as well? :-) My suggestion is: read a good introduction to PLSR (for instance the one found in Martens & Næs (1989) Multivariate Calibration, Wiley). If you are not able to implement the algorithm from the description there, you are probably better off using an existing implementation, for instance Barry Wise's chemometric toolbox. -- Bjørn-Helge Mevik
Re: [R] Keeping scientific format on assignment
Joe Byers wrote: For example, the box.test object has a p-value of 2e-14; when I do a <- box.test.object$p.value; a; the value of a is 0 not 2e-14. The _value_ is still 2e-14 (up to machine precision); only the default printing shows 0. How do I keep the precision and format of the p-value. You can format p-values in different ways using format.pval(), which returns a string with the formatted value. E.g.,
  format.pval(2e-14)
  [1] "2e-14"
-- Bjørn-Helge Mevik
[R] [R-pkgs] pls package: bugfix release 1.2-1
Version 1.2-1 of the pls package is now available on CRAN. This is mainly a bugfix release. If you fit multi-response models, you are strongly encouraged to upgrade! The main changes since 1.2-0 are
- Fixed a bug in kernelpls.fit() that resulted in incorrect results when fitting multi-response models with fewer responses than predictors
- Changed default radii in corrplot()
- It is now possible to select the radii of the circles in corrplot
See the file CHANGES in the sources for all changes. The pls package implements partial least squares regression (PLSR) and principal component regression (PCR). Features of the package include
- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef, plot and summary
- Functions for extraction of scores and loadings, and calculation of (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics, coefficients, scores, loadings, biplots and correlation loadings.
-- Bjørn-Helge Mevik
Re: [R] Question about PLS regression
Andris Jankevics wrote: But I have slightly different results with my real data. Where can the problem be? I think you have to supply some details for anyone to be able to answer. At least what you did (the code), what you got (the results) and what you expected to get. -- Bjørn-Helge Mevik
Re: [R] question about Principal Component Analysis in R?
Michael wrote:
  pca <- prcomp(training_data, center = TRUE, scale = FALSE, retx = TRUE)
Then I want to rotate the test data set using
  d1 <- scale(test_data, center = TRUE, scale = FALSE) %*% pca$rotation
  d2 <- predict(pca, test_data)
these two values are different:
  min(d2 - d1)
  [1] -1.976152
  max(d2 - d1)
  [1] 1.535222
This is because you have subtracted a different mean vector. You should use the column means of the training data (as predict does; see the last line of stats:::predict.prcomp):
  d1 <- scale(test_data, center = pca$center, scale = FALSE) %*% pca$rotation
-- Bjørn-Helge Mevik
[R] [R-pkgs] pls version 1.2-0
Version 1.2-0 of the pls package is now available on CRAN. The pls package implements partial least squares regression (PLSR) and principal component regression (PCR). Features of the package include
- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef, plot and summary
- Functions for extraction of scores and loadings, and calculation of (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics, coefficients, scores, loadings, biplots and correlation loadings.
The main changes since 1.1-0 are
- predict() now handles missing data like the `lm' method does (the default is to predict `NA').
- fitted() and residuals() now return NA for observations with missing values, if na.action is na.exclude.
- `ncomp' is now reduced when it is too large for the requested cross-validation.
- Line plot parameter arguments have been added to predplotXy(), so one can control the properties of the target line in predplot().
- MSEP(), RMSEP(), loadings(), loadingplot() and scoreplot() are now generic.
See the file CHANGES in the sources for all changes. -- Ron Wehrens and Bjørn-Helge Mevik
Re: [R] difference between rnorm(1000, 0, 1) and running rnorm(500, 0, 1) twice
Why don't you test it yourself? E.g.,
  set.seed(42)
  bob1 <- rnorm(1000, 0, 1)
  set.seed(42)
  bob2 <- rnorm(500, 0, 1)
  bob3 <- rnorm(500, 0, 1)
  identical(bob1, c(bob2, bob3))
I won't tell you the answer. :-) -- Bjørn-Helge Mevik
Re: [R] Using R to process spectroscopic data
Dirk De Becker wrote: * Determine the range of the spectrum to be used - For this, I should be able to calculate the regression coefficients You can get the regression coefficients from a PLSR/PCR with the coef() function. See ?coef.mvr However, using the regression coefficients alone for selecting variables/regions can be 'dangerous', because the variables are highly correlated. One alternative is 'variable importance' measures, e.g. VIP (variable importance in projections) as described in Chong, Il-Gyo & Jun, Chi-Hyuck, 2005, Performance of some variable selection methods when multicollinearity is present, Chemometrics and Intelligent Laboratory Systems 78, 103--112. A crude implementation of VIP can be found at http://mevik.net/work/software/pls.html Another alternative is to use jackknife-estimated uncertainties of the regression coefficients in significance tests. (I don't have any reference or implementation, sorry. :-) The correlation loadings can also give valuable information about which variables might be important for the regression. See ?corrplot in the pls package. -- Bjørn-Helge Mevik
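A short sketch of the coef() extraction mentioned above, using the yarn data that ships with pls (ncomp = 6 is an arbitrary choice for illustration):

```r
library(pls)
data(yarn)
fit <- plsr(density ~ NIR, ncomp = 6, data = yarn)
## One regression coefficient per wavelength: a p x 1 x 1 array
B <- coef(fit, ncomp = 6, intercept = FALSE)
dim(B)
## e.g. plot the coefficients against variable index:
plot(B[, 1, 1], type = "l", xlab = "variable", ylab = "coefficient")
```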
Re: [R] 'all' inconsistent?
Seth Falcon wrote: On 29 Jan 2006, [EMAIL PROTECTED] wrote: On Sun, 29 Jan 2006, Elizabeth Purdom wrote: I came across the following behavior, which seems illogical to me. What did you expect and why? I don't know if it is a bug or if I'm missing something: all(logical(0)) [1] TRUE All the values are true, all none of them. I thought all the values are false, all none of them, because there aren't any that are true: any(logical(0)) [1] FALSE But they are, all none of them: all(!logical(0)) [1] TRUE :-) And there aren't any FALSE values either: any(!logical(0)) [1] FALSE so it is only logical that all none of them are TRUE. I love the empty set! :-) -- Bjørn-Helge Mevik
[R] [R-pkgs] New package lspls
Dear R useRs, A new package `lspls' is now available on CRAN. It implements the LS-PLS (least squares--partial least squares) regression method, described in for instance Jørgensen, K., Segtnan, V. H., Thyholt, K., Næs, T. (2004) A Comparison of Methods for Analysing Regression Models with Both Spectral and Designed Variables; Journal of Chemometrics 18(10), 451--464. The current version of the package (0.1-0) should probably be considered `alpha software'. Nothing is cast in iron yet, and especially the formula interface and internal structure are apt to change in future versions. The software should, however, be fully usable in its present form. `lspls' currently includes fit and cross-validation functions with formula interfaces, a predict method, and plots of scores, loadings and (R)MSEP values. Suggestions, bug reports and other comments are very welcome. -- Sincerely, Bjørn-Helge Mevik
Re: [R] modifying code in contributed libraries - changes from versions 1.* to 2.*
Seth Falcon wrote: Actually, R source packages are also mangled. While the source is readable, it is not in the form used to develop the package. I haven't seen this behaviour. At least for the simple package I'm maintaining (pls), the only file in the source package that is changed by R CMD build, is DESCRIPTION. All .R and .Rd files are untouched (even the modification dates are unchanged). (This is on a Linux system, I don't know how it works on MS/Mac.) -- Bjørn-Helge Mevik
Re: [R] changing figure size in Sweave
TEMPL Matthias wrote: Use \setkeys{Gin} to modify figure sizes or use explicit \includegraphics commands in combination with Sweave option include=FALSE. Or use \documentclass[nogin,...]{...}. Then the 'Gin' will have no effect, and the size of the plots in the document will not be changed from the size given as ...,height=??,width=?? (i.e. the size produced by R). -- Bjørn-Helge Mevik
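A minimal sketch of the nogin variant (the document class options and chunk sizes here are arbitrary):

```latex
% With [nogin], Sweave.sty does not set a default \setkeys{Gin}{width=...},
% so figures keep the size produced by R (here 6in x 4in from the chunk options).
\documentclass[nogin,a4paper]{article}
\usepackage{Sweave}
\begin{document}
<<myplot, fig=TRUE, echo=FALSE, height=4, width=6>>=
plot(rnorm(100))
@
\end{document}
```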
Re: [R] changing figure size in Sweave
Bjørn-Helge Mevik wrote: Or use \documentclass[nogin,...]{...}. Then the 'Gin' will have no effect, and the size of the plots in the document will not be changed from the size given as ...,height=??,width=?? (i.e. the size produced by R). A small correction: using the 'nogin' doesn't make LaTeX ignore settings of 'Gin', but prevents Sweave.sty from setting it. -- Bjørn-Helge Mevik
Re: [R] Insightful Announces: R and S-PLUS- Panel Discussion at 9th Annual 2005 User Conference
Michael O'Connell wrote: tools to make it easy to convert R packages to S-PLUS. Not the other way around as well? -- Bjørn-Helge Mevik
Re: [R] ISO R-programming docs/refs
[A lot of polite and constructive critique deleted] Is my impression correct that R is simply not well-documented enough for serious programming? No. Have I missed a key reference to programming R? Yes. How about reading the text that R displays when it starts (and follow its suggestions)? Or visiting the canonical web site for R (http://www.r-project.org/)? Or consulting question 2.7 in the R FAQ (2.7 What documentation exists for R?) Or reading the posting guide for the list (http://www.R-project.org/posting-guide.html)? All four methods would (presumably) quickly have led you to manuals such as the R Language Definition, Writing R Extensions and R Data Import/Export, and given references to books on programming S and/or R. -- Bjørn-Helge Mevik
[R] [R-pkgs] pls version 1.1-0
Version 1.1-0 of the pls package is now available on CRAN. The pls package implements partial least squares regression (PLSR) and principal component regression (PCR). Features of the package include
- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef, plot and summary
- Functions for extraction of scores and loadings, and calculation of (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics, coefficients, scores, loadings, biplots and correlation loadings.
The main changes since 1.0-3 are
- mvr, mvrCv and predict.mvr now have built-in support for scaling of X.
- A new function stdize for explicit centering and/or scaling.
- Correlation loadings plot (corrplot).
- New argument `varnames' in coefplot, to label the x tick marks with the variable names.
- loadingplot, coefplot and plot.mvrVal can now display legends, with the argument 'legendpos'.
See CHANGES in the sources for all changes. -- Bjørn-Helge Mevik and Ron Wehrens
Re: [R] Help: PLSR
Shengzhe Wu writes: I have a data set with 15 variables (first one is the response) and 1200 observations. Now I use the pls package to do the plsr as below. [...] Because the trainSet has been scaled before training, I think Xtotvar should be equal to 14, but unexpectedly Xtotvar = 16562. That is because Xtotvar is the total X variation, measured by sum(X^2) (where X has been centered). With 14 variables, scaled to sd == 1, and 1200 observations, you should get Xtotvar == 14*(1200-1) == 16786. (Maybe you have 1184 observations: 14*1183 == 16562.) -- Bjørn-Helge Mevik
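The arithmetic above can be checked directly (with made-up data of the stated size):

```r
## After centering and scaling, every column has sum of squares n - 1,
## so the total X variation is p * (n - 1)
set.seed(1)
n <- 1200; p <- 14
X <- scale(matrix(rnorm(n * p), n, p))
sum(X^2)      ## p * (n - 1) = 16786 (up to rounding)
```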
Re: [R] Reference manual is not available in the help menu of the rgui
Sean O'Riordain writes: Actually, I've started reading the reference manual... :-) I printed it out 2-to-a-page and I'm working my way through it, Ah! This reminds me of the `good old days', reading the Emacs manual, the Emacs lisp manual, the Gnu C library manual, etc. The payoff came in the section giving the meaning of the C library error codes: EGREGIOUS means `You did *what*?'. :-) -- Bjørn-Helge Mevik
Re: [R] PLSR: model notation and reliabilities
I.Ioannou writes: I have a model with 2 latent constructs (D1 and D2) each one made by 3 indicators (D1a, D1b, D1c etc). Also I have 2 moderating indicators (factors, m1, m2). The response (Y) is also a latent construct, with 3 indicators (Y1,Y2,Y3). [...] It seems to me that what you are looking for is some sort of structural equation model (à la Lisrel). The pls package implements partial least squares regression and principal component regression, which is something different. I guess you could still use plsr for the outer model (path model), but you would have to build the inner model (the constructs) with other tools, such as prcomp/princomp or other factor analyses (see e.g. ?factanal and ?varimax). Alternatively, there is an R package sem that implements structural equation models. You might want to take a look at that. -- Bjørn-Helge Mevik
Re: [R] help: pls package
wu sz writes:

trainSet <- as.data.frame(scale(trainSet, center = TRUE, scale = TRUE))
trainSet.plsr <- mvr(formula, ncomp = 14, data = trainSet, method = "kernelpls",
                     CV = TRUE, validation = "LOO", model = TRUE, x = TRUE, y = TRUE)

[Two side notes here: 1) scaling of the data (with its sd) should be performed inside the cross-validation. In the current version of 'pls', one can use

cvplsr <- crossval(plsr(y ~ scale(X), ncomp = 14, data = mydata), length.seg = 1)

(However, 'crossval' is slower than the built-in cross-validation in 'mvr'/'plsr'. In the development version of the package, scaling within the cross-validation has been implemented in the built-in cross-validation. This will hopefully be published shortly.) 2) The 'CV' argument is from the earlier 'pls.pcr' package, and is no longer used. It is silently ignored.]

i = 1; msep_element = c()
while (i <= length(p)) {
    msep_element[,i] = (p[i]-y)^2
    i = i + 1
}

Hmm... I don't see how you got that code to run. This should work, though:

msep_element <- (p - y)^2
msep <- colMeans(msep_element)
msep_sd <- sd(msep_element)

You will get much closer to the true value with sd(msep_element) / sqrt(length(y)). However, this will not produce an unbiased estimate of the sd of the estimated MSEP, because it ignores the dependencies between the residuals. E.g., the residual when sample 1 is predicted is not independent of the residual when sample 2 is predicted. In general, I think, it will produce underestimated sds. The effect should be largest for small data sets. This is the reason the pls package currently doesn't estimate the se of cross-validated MSEPs. There is also the question of what the estimate should be conditioned on: for leave-one-out cross-validation, sd(MSEP | trainData) = 0. [If someone knows how to calculate unbiased estimates of the sd of cross-validated MSEPs, please let me know. :-)] -- Bjørn-Helge Mevik
Re: [R] PLS: problem transforming scores to variable space
rainer grohmann writes: However, when I try to map the scores back to variable space, I ran into problems: [...] cbind(t$scores[,1], (t$scores %*% (t$loadings) %*% t$projection)[,1]) You need to transpose the loadings: all.equal(unclass(t$scores), t$scores %*% t(t$loadings) %*% t$projection) [1] TRUE (A tip: since 't' is used for transposing, it is usually a Good Thing(TM) to avoid using it as a variable name.) -- Bjørn-Helge Mevik
Re: [R] add transformed columns into a matrix
Supposing 'inmatrix' is a matrix with column names 'x1', 'x2' and 'x3', how about something like

model.matrix(~ (x1 + x2 + x3)^2 + log(x1) + log(x2) + log(x3)
             + sqrt(x1) + sqrt(x2) + sqrt(x3) - 1,
             as.data.frame(inmatrix))

-- Bjørn-Helge Mevik
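(For illustration only — a tiny made-up matrix showing what the call produces; the data values and the name `inmatrix` are arbitrary:)

```r
set.seed(1)
inmatrix <- matrix(runif(12, 1, 10), ncol = 3,
                   dimnames = list(NULL, c("x1", "x2", "x3")))

## Main effects, all two-way interactions, logs and square roots, no intercept:
out <- model.matrix(~ (x1 + x2 + x3)^2 + log(x1) + log(x2) + log(x3)
                    + sqrt(x1) + sqrt(x2) + sqrt(x3) - 1,
                    as.data.frame(inmatrix))
colnames(out)   # the x's, the log and sqrt terms, and x1:x2, x1:x3, x2:x3
```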
Re: [R] mvr function
McGehee, Robert writes:

dataSet <- data.frame(y = vol[, 12])
dataSet$X <- data.matrix(vol[, 1:11])
ans.pcr <- pcr(y ~ X, 6, data = dataSet, validation = "CV")

If there's a more elegant way of doing this without using data frames of matrices, I'd be interested as well. I actually find using data frames with matrices the most elegant way. :-) Especially if you have several matrices. Alternatively, to regress one variable of a data frame on the rest of the variables, one can use

ans.pcr <- pcr(y ~ ., 6, data = vol, validation = "CV")

(assuming the response variable is called `y' in the data frame; see names(vol).) One does not _have_ to store the data in a data frame (although I would recommend it, because it is then easier to specify test data sets and alternative data sets). One can simply store the variables in the global environment, and skip the `data' argument of `pcr'. -- HTH, Bjørn-Helge Mevik
Re: [R] dot in formula
Adrian Baddeley writes: I want to manipulate a formula object, containing the name `.', so that `.' is replaced by a desired (arbitrary) expression. How about

myf <- y ~ .
update(myf, . ~ -. + X)

-- HTH, Bjørn-Helge Mevik
[R] [R-pkgs] pls version 1.0-3
Version 1.0-3 of the pls package is now available on CRAN. The pls package implements partial least squares regression (PLSR) and principal component regression (PCR). Features of the package include:

- Several plsr algorithms: orthogonal scores, kernel pls and simpls
- Flexible cross-validation
- A formula interface, with traditional methods like predict, coef, plot and summary
- Functions for extraction of scores and loadings, and calculation of (R)MSEP and R2
- A simple multiplicative scatter correction (msc) implementation
- Functions for plotting predictions, validation statistics, coefficients, scores, loadings and biplots

(The pls package is meant to supersede the pls.pcr package.) -- Ron Wehrens and Bjørn-Helge Mevik
Re: [R] How to break an axis?
What about simply using a log scale on the y axis? I.e. plot(..., log = "y"). -- Bjørn-Helge Mevik
Re: [R] Can't reproduce clusplot princomp results.
Thomas M. Parris writes: clusplot reports that the first two principal components explain 99.7% of the variability. [...]

loadings(pca)
[...]
               Comp.1 Comp.2 Comp.3 Comp.4
SS loadings      1.00   1.00   1.00   1.00
Proportion Var   0.25   0.25   0.25   0.25
Cumulative Var   0.25   0.50   0.75   1.00

This has nothing to do with how much of the variability of the original data is captured by each component; it merely measures the variability in the coefficients of the loading vectors (and they are standardised to length one in princomp). What you want to look at is pca$sdev, for instance something like

totvar <- sum(pca$sdev^2)
rbind("explained var"     = pca$sdev^2,
      "prop. expl. var"   = pca$sdev^2 / totvar,
      "cum.prop.expl.var" = cumsum(pca$sdev^2) / totvar)

                     Comp.1    Comp.2      Comp.3       Comp.4
explained var     3.4093746 0.5785399 0.011560142 0.0005252824
prop. expl. var   0.8523437 0.1446350 0.002890036 0.0001313206
cum.prop.expl.var 0.8523437 0.9969786 0.999868679 1.0000000000

And as you can see, two comps explain 99.7%. :-) -- Bjørn-Helge Mevik
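(As an aside, summary() on the princomp object prints these proportions directly; USArrests is used here only as example data:)

```r
pca <- princomp(USArrests)
summary(pca)   # standard deviations, proportion of variance, cumulative proportion
```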
Re: [R] Pca loading plot lables
One way is to use the loadingplot() function in the package `pls':

molprop.pc <- princomp(whatever)
library(pls)
loadingplot(molprop.pc, scatter = TRUE, labels = "names")

(If you want comp 2 vs. comp 1: loadingplot(molprop.pc, comps = 2:1, scatter = TRUE, labels = "names").) -- Bjørn-Helge Mevik
Thanks! (Was: Re: [R] R-2.1.0 is released)
I'd like to thank the developers in the Core Team for their great work! R has become an invaluable and indispensable tool for (at least) me, much thanks to the hard and good work of the Core Team. -- Bjørn-Helge Mevik
Re: [R] Journal Quality R Graphs?
Werner Wernersen writes: the graphs look nice on the screen but when printed in black and white every color apart from black doesn't look very nice. My advice is: If you want a black-and-white or grayscale printout, don't plot in colors. -- Bjørn-Helge Mevik
Re: [R] How to set up number of prin comp.
I am trying to use PrinComp to do principal component analysis. I would like to know how to set the number of principal components. I assume you mean the function princomp (case _does_ matter in R) in package stats (which is loaded by default). This function has no way of specifying how many components to calculate; it always gives you all components. You have to select the components you want afterwards. See help(princomp) for details. E.g.

X <- <some matrix>
pc <- princomp(X)
pc$scores[, 1:4]    # The first four score vectors
pc$loadings[, 1:4]  # The first four loadings

(The loadings can also be extracted with loadings(pc)[, 1:4].) -- Bjørn-Helge Mevik
Re: [R] using 'nice' with R
Roger D. Peng writes: On a Unix like system you can do `nice +19 R' or perhaps `nice +19 R CMD BATCH commands.R'. At least on Suse (9.1) and Debian (3.0) Linux, the syntax is `nice -19 R' (i.e. with `-', not `+'.) -- Bjørn-Helge Mevik
Re: [R] Setting the width and height of a Sweave figure
I haven't found any other solution than using

<<fig=TRUE, height=7, width=14>>=
theCode()
@

(but of course that doesn't have any effect when theCode() is used interactively). -- Bjørn-Helge Mevik
[R] http://bugs.r-project.org down?
I haven't been able to connect to http://bugs.r-project.org the last few days. Is there a problem with the site (or am I having a problem :-) ? -- Bjørn-Helge Mevik
Re: [R] Combined variable names
Peter Dalgaard writes: There are irregularities, e.g. the fact that you do help(foo), not help("foo"), but they tend to be a pain in the long run (How do you get help on a name contained in a variable?) v <- "lm"; help(v) works for me :-) (But I totally agree that the regularity of R (or S) is part of what makes it so much better than for instance Matlab. At least for me.) -- Bjørn-Helge Mevik
Re: [R] LDA with previous PCA for dimensionality reduction
Torsten Hothorn writes: as long as one does not use the information in the response (the class variable, in this case) I don't think that one ends up with an optimistically biased estimate of the error. I would be a little careful, though. The left-out sample in the LDA cross-validation will still have influenced the PCA used to build the LDA on the rest of the samples. The sample will have a tendency to lie closer to the centre of the complete PCA than of a PCA on the remaining samples. Also, if the sample has a high leverage on the PCA, the directions of the two PCAs can be quite different. Thus, the LDA is built on data that fits better to the left-out sample than if the sample were a completely new sample. I have no proofs or numerical studies showing that this gives over-optimistic error rates, but I would not recommend placing the PCA outside the cross-validation. (The same goes for any resampling-based validation.) -- Bjørn-Helge Mevik
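For what it's worth, here is a minimal sketch of what I mean by placing the PCA inside the cross-validation, using prcomp() and lda() from MASS (the function and variable names are just illustrative, and iris serves as stand-in data):

```r
library(MASS)   # for lda()

## Leave-one-out CV where the PCA is refitted without the left-out sample
cv.pca.lda <- function(X, class, ncomp = 2) {
  n <- nrow(X)
  pred <- character(n)
  for (i in seq_len(n)) {
    pc <- prcomp(X[-i, , drop = FALSE])      # PCA on the training samples only
    fit <- lda(pc$x[, 1:ncomp, drop = FALSE], grouping = class[-i])
    ## Project the left-out sample onto the training PCA:
    Ztest <- scale(X[i, , drop = FALSE], center = pc$center,
                   scale = FALSE) %*% pc$rotation[, 1:ncomp]
    pred[i] <- as.character(predict(fit, Ztest)$class)
  }
  mean(pred != as.character(class))          # cross-validated error rate
}

cv.pca.lda(as.matrix(iris[, 1:4]), iris$Species, ncomp = 2)
```

Fitting the PCA outside the loop (once, on all samples) would give the over-optimistic variant discussed above.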
Re: [R] Recovering R Workspace
You probably don't need to re-install R; just remove or rename the file .RData (it is probably located in your home directory, or in My Documents on MSWin). Then R should start without problems. As for recovering the workspace, I believe that is a lost cause (unless you study the file format and use a binary editor to extract/repair objects in the file -- and even then, if the file was compressed, you may be out of luck). -- Bjørn-Helge Mevik
Re: [R] R-2.0: roadmap? release statements? plans?
Well, you could download the latest beta-release and look in the NEWS file there. -- Bjørn-Helge Mevik
Re: [R] Sparse Matrices in R
Wolski [EMAIL PROTECTED] writes: Hi!

help.search("sparse matrix")
graph2SparseM(graph)              Coercion methods between graphs and sparse matrices
tripletMatrix-class(Matrix)       Class tripletMatrix: sparse matrices in triplet form
SparseM.hb(SparseM)               Harwell-Boeing Format Sparse Matrices
image,matrix.csr-method(SparseM)  Image Plot for Sparse Matrices
etc.

Which of course assumes that you already have packages such as SparseM, Matrix and graph installed on your system. If you don't, help.search("sparse matrix") returns no matches. :-) -- Bjørn-Helge Mevik
Re: [R] Precision in R
Since you didn't say anything about _what_ you did, either in SAS or R, my first thought was: Have you checked that you use the same parametrization of the models in R and SAS? -- Bjørn-Helge Mevik
Re: [R] center or scale before analyzing using pls.pcr
Jinsong Zhao [EMAIL PROTECTED] writes: I found pls.pcr package will give different results if the data are centered and scaled using scale(). Centering is done automatically by all implementations of PLSR I am aware of (including pls.pcr, afaics). I am not sure about when I should scale my data. There are no fixed rules about this. Many practitioners live more or less by the rule that unless the variables are `of the same type' or have equal or comparable scales, they are scaled. One example of data that is typically not scaled (at least to begin with) is spectroscopic data. And whether the dependent variable should be scaled. There is no need to scale the dependent variable. -- Bjørn-Helge Mevik
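(For completeness: the newer pls package -- as opposed to pls.pcr -- lets you request standardisation directly through the scale argument, so it is applied consistently; yarn is just the example data shipped with the package:)

```r
library(pls)
data(yarn)

## PLSR with each predictor divided by its standard deviation before fitting
mod <- plsr(density ~ NIR, ncomp = 6, data = yarn,
            scale = TRUE, validation = "LOO")
summary(mod)
```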
Re: [R] Stepwise Regression and PLS
Liaw, Andy [EMAIL PROTECTED] writes: one needs to be lucky to have the first few PCs correlate well to the response in case of PCR. Which is one reason PLSR is often preferred over PCR, at least in the field of chemometrics. Since the components of PLSR maximise the covariance with the response, the first few components are usually more correlated to the response than the PCs. For spectroscopists, the PLSR loadings are often very interpretable, and are much used to qualitatively validate the model. -- Bjørn-Helge Mevik
Re: [R] MATLAB to R
[EMAIL PROTECTED] writes: In MATLAB, I can write:

for J=1:M
  Y(J+1)=Y(J)+ h * feval(f,T(J),Y(J));
...

In R, I can write the above as:

for (J in 2:M) {
  y = y + h * f(t,y)
...
}

Are you sure this gives the same result? If Y and T in Matlab are vectors, I believe

for (J in 1:M) {
  y[J+1] <- y[J] + h * f(tt[J], y[J])
...
}

is what you want. (Don't use `t' as a variable; t() is the function to transpose a matrix.)

for J=1:M
  k1 = feval(f,T(J),Y(J));
  k2 = feval(f,T(J+1),Y(J)+ h * k1

I assume you mean k1(J) = ... and k2(J) = ...

How do I write k2 in R? k1 = f(t,y) k2 = ?

## If f can take vector arguments:
k1 <- f(tt[-M], y)
k2 <- f(tt[-1], y + h*k1)
## Otherwise:
for (J in 1:M) {
  k1[J] <- f(tt[J], y[J])
  k2[J] <- f(tt[J+1], y[J] + h*k1[J])
}

-- Hth, Bjørn-Helge Mevik
Re: [R] nameless functions in R
Rajarshi Guha [EMAIL PROTECTED] writes: apply(x, c(2), funtion(v1,v2){ identical(v1,v2) }, v2=c(1,4,2)) The above gives me a syntax error. I also tried: No wonder! Try with `function' instead of `funtion'. -- Bjørn-Helge Mevik
Re: [R] /usr/lib/R/library vs /usr/local/lib/R/site-library
Dirk Eddelbuettel [EMAIL PROTECTED] writes: On Wed, Oct 29, 2003 at 07:51:16AM -0800, A.J. Rossini wrote: /usr/lib/R/site-library is for apt-installed R packages (from CRAN, or Jim Lindsey's works), and It is not yet fully implemented as not all apt-get'able Debian packages of CRAN, Omegahat, ... Forgive my ignorance; I just switched to Debian. Are there R packages (such as `car' or `vegan') that are apt-get'able? How can I find out which ones, and how do I set up apt to get them? (I've added `deb http://cran.r-project.org/bin/linux/debian woody main' to /etc/apt/sources.list, to get the latest version of R.) -- Bjørn-Helge Mevik
Re: [R] disappointed (card model)
Don't be disappointed, be glad: It gives you the opportunity to contribute by writing one yourself! (Remember, R is developed by volunteers.) -- Sincerely, Bjørn-Helge Mevik
Re: [R] princomp with more coloumns than rows: why not?
Thanks for the good suggestions for alternatives to princomp! My original question, though, was /why/ it was decided to disallow more columns than rows in princomp. (And also whether it would be possible to augment the result from prcomp with the column means.) -- Bjørn-Helge Mevik
Re: [R] How to flip image?
Ernie Adorio [EMAIL PROTECTED] writes: If not possible, is there any built-in R command to reverse the rows of a matrix? How about Face[nrow(Face):1, ] ? -- Bjørn-Helge Mevik
[R] princomp with more coloumns than rows: why not?
As of R 1.7.0, princomp no longer accepts matrices with more columns than rows. I'm curious: why was this decision made? I work a lot with data where more columns than rows is more the rule than the exception (for instance spectroscopic data). To me, princomp has two advantages over prcomp: 1) it has a predict method, and 2) it has a biplot method. A biplot method shouldn't be too difficult to implement (I believe I've seen one on R-help). A predict method seems to be more difficult, because the prcomp object doesn't include the means that need to be subtracted from the new data. Would it break conformance with S to let prcomp return the means as well? -- Sincerely, Bjørn-Helge Mevik
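(A follow-up note: in later versions of R, prcomp does return the means, as the $center component, so the projection can be done by hand -- or with the predict method that was eventually added. USArrests is just example data, and the "new" rows here are really training rows:)

```r
pc <- prcomp(USArrests)

newX <- as.matrix(USArrests[1:3, ])   # pretend these are new observations

## Centre with the stored means, then rotate onto the components:
scores <- scale(newX, center = pc$center, scale = FALSE) %*% pc$rotation

## Should agree with the fitted scores for these rows,
## and with predict(pc, USArrests[1:3, ]):
all.equal(scores, pc$x[1:3, ], check.attributes = FALSE)
```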
Re: [R] anova(lme object)
It is documented in ?anova.lme: anova(res1, type = "marginal") and anova(res2, type = "marginal") should give equivalent tables. -- Bjørn-Helge Mevik
Re: [R] Marginal (type II) SS for powers of continuous variables in a linear model?
Prof Brian Ripley [EMAIL PROTECTED] writes: drop1 is the part of R that does type II sum of squares, and it works in your example. So does Anova in the current car: I'm sorry, I should have included an example to clarify what I meant (or point out my misunderstandings :-). I'll do that below, but first a comment: And in summary.aov() those *are* marginal SS, as balance is assumed for aov models. (That is not to say the software does not work otherwise, but the interpretability depends on balance.) Maybe I've misunderstood, but in the documentation for aov, it says (under Details): This provides a wrapper to `lm' for fitting linear models to balanced or unbalanced experimental designs. Also, is this example (lm(y~x+I(x^2), Df)) really balanced? I think of balance as the property that there is an equal number of observations for every combination of the factors. With x and x^2, this doesn't happen. For instance, x=1 and x^2=1 occurs once, but x=1 and x^2=4 never occurs (naturally). Or have I misunderstood something? Now, the example:

Df2 <- expand.grid(A = factor(1:2), B = factor(1:2), x = 1:5)
Df2$y <- codes(Df2$A) + 2*codes(Df2$B) + 0.05*codes(Df2$A)*codes(Df2$B) +
         Df2$x + 0.1*Df2$x^2 + 0.1*(0:4)
Df2 <- Df2[-1, ]   # Remove one observation to make it unbalanced
ABx2.lm <- lm(y ~ A*B + x + I(x^2), data = Df2)

The SSs I call marginal are R(A | B, x, x^2), R(B | A, x, x^2), R(A:B | A, B, x, x^2), R(x | A, B, A:B) and R(x^2 | A, B, A:B, x). (Here, for instance, R(x | A, B, A:B) means the reduction of SSE due to including x in a model when A, B and A:B (and the mean) are already in the model. I've omitted the mean from the notation.)

anova(ABx2.lm)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq   F value    Pr(>F)
A          1  1.737   1.737   66.5700 1.801e-06 ***
B          1 13.647  13.647  523.0292 6.953e-12 ***
x          1 93.677  93.677 3590.1703 < 2.2e-16 ***
I(x^2)     1  0.583   0.583   22.3302 0.0003966 ***
A:B        1  0.011   0.011    0.4238 0.5263772
Residuals 13  0.339   0.026
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

This gives SSs on the form R(A), R(B | A), R(x | A, B) etc. (If the design had been balanced (in A, B and x), this would have been the same as the marginal SSs above.)

drop1(ABx2.lm)
Single term deletions

Model:
y ~ A * B + x + I(x^2)
       Df Sum of Sq   RSS     AIC
<none>              0.339 -64.486
x       1     1.188 1.527 -37.901
I(x^2)  1     0.592 0.931 -47.294
A:B     1     0.011 0.350 -65.877

This gives the SSs R(x | A, B, A:B, x^2), R(x^2 | A, B, A:B, x) and R(A:B | A, B, x, x^2). The SS for x is not marginal as defined above.

library(car)
Anova(ABx2.lm)
Anova Table (Type II tests)

Response: y
           Sum Sq Df  F value    Pr(>F)
A          5.1806  1 198.5470 2.979e-09 ***
B         19.6610  1 753.5074 6.778e-13 ***
x          1.1879  1  45.5245 1.368e-05 ***
I(x^2)     0.5922  1  22.6970 0.0003699 ***
A:B        0.0111  1   0.4238 0.5263772
Residuals  0.3392 13
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

This gives marginal SSs for A, B, x^2 and A:B, but as with drop1, the SS for x is R(x | A, B, A:B, x^2). The only way I've figured out to get the `correct' SS for x, i.e., R(x | A, B, A:B), is:

AB.lm <- lm(y ~ A*B, data = Df2)
ABx.lm <- lm(y ~ A*B + x, data = Df2)
anova(AB.lm, ABx.lm, ABx2.lm)
Analysis of Variance Table

Model 1: y ~ A * B
Model 2: y ~ A * B + x
Model 3: y ~ A * B + x + I(x^2)
  Res.Df    RSS Df Sum of Sq        F    Pr(>F)
1     15 93.760
2     14  0.931  1    92.829 3557.651 < 2.2e-16 ***
3     13  0.339  1     0.592   22.697 0.0003699 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(The ABx2.lm is included to give the same error term to test against as in the ANOVAs above.) The baseline of all this is that I think it would be nice if a function like Anova in the car package returned R(x | A, B, A:B) instead of R(x | A, B, A:B, x^2) as the SS for x in a model such as the above. (I hope I've made myself clearer, and not insulted anyone by oversimplifying. :-) -- Bjørn-Helge Mevik
[R] Marginal (type II) SS for powers of continuous variables in a linear model?
I've used Anova() from the car package to get marginal (aka type II) sums of squares and tests for linear models with categorical variables. Is it possible to get marginal SSs also for continuous variables, when the model includes powers of the continuous variables? For instance, if A and B are categorical (factors) and x is continuous (numeric), Anova(lm(y ~ A*B + x, ...)) will produce marginal SSs for all terms (A, B, A:B and x). However, with Anova(lm(y ~ A*B + x + I(x^2), ...)) the SS for `x' is calculated with I(x^2) present in the model, i.e. it is no longer marginal. Using poly(x, 2) instead of x + I(x^2), one gets a marginal SS for the total effect of x, but not for the linear and quadratic effects separately. (summary.aov() has a `split' argument that can be used to get separate SSs, but these are not marginal.) -- Bjørn-Helge Mevik
Re: [R] Marginal (type II) SS for powers of continuous variables in a linear model?
Prof Brian D Ripley [EMAIL PROTECTED] writes: On Tue, 12 Aug 2003, Bjørn-Helge Mevik wrote: Also, is this example (lm(y~x+I(x^2), Df)) really balanced? I think No, and I did not use summary.aov on it! And I didn't say you did! This gives the SSs R(x | A, B, A:B, x^2), R(x^2 | A, B, A:B, x) and R(A:B | A, B, x, x^2). The SS for x is not marginal as defined above. But that *is* how `marginal' is usually defined. OK. Why should I(x^2) be regarded as subservient to x? In polynomial regression, it is usual to first consider a linear model, then a quadratic, and so forth. The interesting tests are usually then the effect of a power of x with all lower-degree terms of x in the model. I thought it would be natural to treat polynomials of continuous variables similarly in models with categorical variables as well. -- Bjørn-Helge Mevik
Re: [R] Generating .R and .Rd files with Sweave/noweb?
Paul, you're right. My primary goal was to write all the code and documentation in one file, and split this into one .R file and multiple .Rd files. I got the idea of using Sweave/noweb because I'm using Emacs with ESS, and I'd like to be in R mode when I'm in a code part of the file, and in Rd mode in a documentation part. I guess using two files and a shell script, as you do, might be the best solution. -- Bjørn-Helge Mevik